Upload
dangthuy
View
216
Download
1
Embed Size (px)
Citation preview
Editorrsquos Note The following article was the JBES Invited Address presented at the Joint StatisticalMeetings Minneapolis Minnesota August 7ndash11 2005
Realized Variance and MarketMicrostructure NoisePeter R HANSEN
Department of Economics Stanford University 579 Serra Mall Stanford CA 94305-6072(peterhansenstanfordedu)
Asger LUNDEDepartment of Marketing and Statistics Aarhus School of Business Fuglesangs Alle 4 8210 Aarhus V Denmark
We study market microstructure noise in high-frequency data and analyze its implications for the real-ized variance (RV) under a general specification for the noise We show that kernel-based estimators canunearth important characteristics of market microstructure noise and that a simple kernel-based estimatordominates the RV for the estimation of integrated variance (IV) An empirical analysis of the Dow JonesIndustrial Average stocks reveals that market microstructure noise is time-dependent and correlated withincrements in the efficient price This has important implications for volatility estimation based on high-frequency data Finally we apply cointegration techniques to decompose transaction prices and bidndashaskquotes into an estimate of the efficient price and noise This framework enables us to study the dynamiceffects on transaction prices and quotes caused by changes in the efficient price
KEY WORDS Bias correction High-frequency data Integrated variance Market microstructure noiseRealized variance Realized volatility Sampling schemes
The great tragedy of Sciencemdashthe slaying of a beautiful hypothesis by an uglyfact (Thomas H Huxley 1825ndash1895)
1 INTRODUCTION
The presence of market microstructure noise in high-frequen-cy financial data complicates the estimation of financial volatil-ity and makes standard estimators such as the realizedvariance (RV) unreliable Thus from the perspective of volatil-ity estimation market microstructure noise is an ldquougly factrdquo thatchallenges the validity of theoretical results that rely on the ab-sence of noise Volatility estimation in the presence of marketmicrostructure noise is currently a very active area of researchInterestingly this literature was initiated by an article by Zhou(1996) that was published in this journal a decade ago and wasin many ways 10 years ahead of its time
The best remedy for market microstructure noise depends onthe properties of the noise and the main purpose of this arti-cle is to unearth the empirical properties of market microstruc-ture noise We use a number of kernel-based estimators that arewell suited for this problem and our empirical analysis of high-frequency stock returns reveals the following ugly facts aboutmarket microstructure noise
1 The noise is correlated with the efficient price2 The noise is time-dependent3 The noise is quite small in the Dow Jones Industrial Av-
erage (DJIA) stocks4 The properties of the noise have changed substantially
over time
These four empirical ldquofactsrdquo are related to one another andhave important implications for volatility estimation The timedependence in the noise and the correlation between noise andefficient price arise naturally in some models on market mi-
crostructure effects including (a generalized version of ) thebidndashask model by Roll (1984) (see Hasbrouck 2004 for a dis-cussion) and models where agents have asymmetric informa-tion such as those by Glosten and Milgrom (1985) and Easleyand OrsquoHara (1987 1992) Market microstructure noise hasmany sources including the discreteness of the data (see Harris1990 1991) and properties of the trading mechanism (see egBlack 1976 Amihud and Mendelson 1987) (For additional ref-erences to this literature see eg OrsquoHara 1995 Hasbrouck2004)
The main contributions of this article are as follows Firstwe characterize how the RV is affected by market microstruc-ture noise under a general specification for the noise that allowsfor various forms of stochastic dependencies Second we showthat market microstructure noise is time-dependent and corre-lated with efficient returns Third we consider some existingtheoretical results based on assumptions about the noise that aretoo simplistic and discuss when such results provide reason-able approximations For example our empirical analysis of the30 DJIA stocks shows that the noise may be ignored when in-traday returns are sampled at relatively low frequencies such as20-minute sampling Assuming the noise is of an independenttype seems to be reasonable when intraday returns are sampledevery 15 ticks or so Fourth we apply cointegration methodsto decompose transaction prices and bidask quotations into es-timates of the efficient price and market microstructure noiseThe correlations between these estimated series are consistentwith the volatility signature plots The cointegration analysisenables us to study how a change in the efficient price dynami-cally affects bid ask and transaction prices
copy 2006 American Statistical AssociationJournal of Business amp Economic Statistics
April 2006 Vol 24 No 2DOI 101198073500106000000071
127
128 Journal of Business amp Economic Statistics April 2006
The interest for empirical quantities based on high-frequencydata has surged in recent years (see Barndorff-Nielsen andShephard 2007 for a recent survey) The RV is a well-knownquantity that goes back to Merton (1980) Other empiricalquantities include bipower variation and multipower varia-tion which are particularly useful for detecting jumps (seeBarndorff-Nielsen and Shephard 2003 2004 2006ab Ander-sen Bollerslev and Diebold 2003 Bollerslev Kretschmer Pig-orsch and Tauchen 2005 Huang and Tauchen 2005 Tauchenand Zhou 2004) and intraday range-based estimators (seeChristensen and Podolskij 2005) High-frequencyndashbased quan-tities have proven useful for a number of problems For ex-ample several authors have applied filtering and smoothingtechniques to time series of the RV to obtain time seriesfor daily volatility (see eg Maheu and McCurdy 2002Barndorff-Nielsen Nielsen Shephard and Ysusi 1996 Engleand Sun 2005 Frijns and Lehnert 2004 Koopman Jungbackerand Hol 2005 Hansen and Lunde 2005b Owens and Steiger-wald 2005) High-frequencyndashbased quantities are also use-ful in the context of forecasting (see Andersen Bollerslevand Meddahi 2004 Ghysels Santa-Clara and Valkanov 2006)and the evaluation and comparison of volatility models (seeAndersen and Bollerslev 1998 Hansen Lunde and Nason2003 Hansen and Lunde 2005a 2006 Patton 2005)
The RV which is a sum of squared intraday returns yields aperfect estimate of volatility in the ideal situation where pricesare observed continuously and without measurement error (seeeg Merton 1980) This result suggests that the RV should bebased on intraday returns sampled at the highest possible fre-quency (tick-by-tick data) Unfortunately the RV suffers froma well-known bias problem that tends to get worse as the sam-pling frequency of intraday returns increases (see eg Fang1996 Andreou and Ghysels 2002 Oomen 2002 Bai Russelland Tiao 2004) The source of this bias problem is known asmarket microstructure noise and the bias is particularly ev-ident in volatility signature plots (see Andersen BollerslevDiebold and Labys 2000b) Thus there is a trade-off betweenbias and variance when choosing the sampling frequency asdiscussed by Bandi and Russell (2005) and Zhang Myklandand Aiumlt-Sahalia (2005) This trade-off is the reason that theRV is often computed from intraday returns sampled at a mod-erate frequency such as 5-minute or 20-minute sampling
A key insight into the problem of estimating the volatil-ity from high-frequency data comes from its similarity to theproblem of estimating the long-run variance of a stationarytime series In this literature it is well known that autocorre-lation necessitates modifications of the usual sum-of-squaredestimator Those modifications of Newey and West (1987) andAndrews (1991) provided such estimators that are robust to au-tocorrelation Market microstructure noise induces autocorre-lation in the intraday returns and this autocorrelation is thesource of the RVrsquos bias problem Given this connection to long-run variance estimation it is not surprising that ldquoprewhiten-ingrdquo of intraday returns and kernel-based estimators (includingthe closely related subsample-based estimators) are found tobe useful in the present context Zhou (1996) introduced theuse of kernel-based estimators and the subsampling idea todeal with market microstructure noise in high-frequency dataFiltering techniques have been used by Ebens (1999) Ander-sen Bollerslev Diebold and Ebens (2001) and Maheu and
McCurdy (2002) (moving average filter) and Bollen and In-der (2002) (autoregressive filter) Kernel-based estimators wereexplored by Zhou (1996) Hansen and Lunde (2003) andBarndorff-Nielsen Hansen Lunde and Shephard (2004) andthe closely related subsample-based estimators were used in anunpublished paper by Muumlller (1993) and also by Zhou (1996)Zhang et al (2005) and Zhang (2004)
The rest of the article is organized as follows In Section 2we describe our theoretical framework and discuss samplingschemes in calendar time and tick time We also characterizethe bias of the RV under a general specification for the noiseIn Section 3 we consider the case with independent market mi-crostructure noise which has been used by various authors in-cluding Corsi Zumbach Muumlller and Dacorogna (2001) Curciand Corsi (2004) Bandi and Russell (2005) and Zhang et al(2005) We consider a simple kernel-based estimator of Zhou(1996) that we denote by RVAC1 because it uses the first-orderautocorrelation to bias-correct the RV We benchmark RVAC1
to the standard measure of RV and find that the former is su-perior to the latter in terms of the mean squared error (MSE)We also evaluate the implications for some theoretical resultsbased on assumptions in which market microstructure noise isabsent Interestingly we find that the root mean squared er-ror (RMSE) of the RV in the presence of noise is quite simi-lar to those that ignore the noise at low sampling frequenciessuch as 20-minute sampling This finding is important becausemany existing empirical studies have drawn conclusions from20-minute and 30-minute intraday returns using the results ofBarndorff-Nielsen and Shephard (2002) However at 5-minutesampling we find that the ldquotruerdquo confidence interval about theRV can be as much as 100 larger than those based on an ldquoab-sence of noise assumptionrdquo In Section 4 we present a robustestimator that is unbiased for a general type of noise and dis-cuss noise that is time-dependent in both calendar time and ticktime We also discuss the subsampling version of Zhoursquos es-timator which is robust to some forms of time-dependence intick time In Section 5 we describe our data and present mostof our empirical results The key result is the overwhelming ev-idence against the independent noise assumption This findingis quite robust to the choice of sampling method (calendar timeor tick time) and the type of price data (transaction prices orquotation prices) This dependence structure has important im-plications for many quantities based on ultra-highndashfrequencydata These features of the noise have important implicationsfor some of the bias corrections that have been used in the liter-ature Although the independent noise assumption may be fairlyreasonable when the tick size is 116 it is clearly not consistentwith the recent data In fact much of the noise has ldquoevaporatedrdquoafter the tick size is reduced to 1 cent In Section 6 we presenta cointegration analysis of the vector of bid ask and transactionprices The Granger representation makes it possible to decom-pose each of the price series into noise and a common efficientprice Further based on this decomposition we estimate impulseresponse functions that reveal the dynamic effects on bid askand transaction prices as a response to a change in the efficientprice In Section 7 we provide a summary and we concludethe article with three appendixes that provide proofs and detailsabout our estimation methods
Hansen and Lunde Realized Variance and Market Microstructure Noise 129
2 THE THEORETICAL FRAMEWORK
We let plowast(t) denote a latent log-price process in continuoustime and use p(t) to denote the observable log-price processThus the noise process is given by
u(t)equiv p(t)minus plowast(t)
The noise process u may be due to market microstructure ef-fects such as bidndashask bounces but the discrepancy betweenp and plowast can also be induced by the technique used to con-struct p(t) For example p is often constructed artificially fromobserved transactions or quotes using the previous tick methodor the linear interpolation method which we define and discusslater in this section
We work under the following specification for the efficientprice process plowast
Assumption 1 The efficient price process satisfies dplowast(t) =σ(t)dw(t) where w(t) is a standard Brownian motion σ isa random function that is independent of w and σ 2(t) isLipschitz (almost surely)
In our analysis we condition on the volatility path σ 2(t)because our analysis focuses on estimators of the integratedvariance (IV)
IV equivint b
aσ 2(t)dt
Thus we can treat σ 2(t) as deterministic even though weview the volatility path as random The Lipschitz condition isa smoothness condition that requires |σ 2(t) minus σ 2(t + δ)| lt εδ
for some ε and all t and δ (with probability 1) The assump-tion that w and σ are independent is not essential The connec-tion between kernel-based and subsample-based estimators (seeBarndorff-Nielsen et al 2004) shows that weaker assumptionsused by Zhang et al (2005) and Zhang (2004) are sufficient inthis framework
We partition the interval [ab] into m subintervals andm plays a central role in our analysis For example we deriveasymptotic distributions of quantities as m rarrinfin This type ofinfill asymptotics is commonly used in spatial data analysis andgoes back to Stein (1987) Related to the present context is theuse of infill asymptotics for estimation of diffusions (see Bandiand Phillips 2004) For a fixed m the ith subinterval is givenby [timinus1m tim] where a = t0m lt t1m lt middot middot middot lt tmm = b Thelength of the ith subinterval is given by δim equiv tim minus timinus1m andwe assume that supi=1m δim = O( 1
m ) such that the lengthof each subinterval shrinks to 0 as m increases The intradayreturns are now defined by
ylowastim equiv plowast(tim)minus plowast(timinus1m) i = 1 m
and the increments in p and u are defined similarly and denotedby
yim equiv p(tim)minus p(timinus1m) i = 1 m
and
eim equiv u(tim)minus u(timinus1m) i = 1 m
Note that the observed intraday returns decompose into yim =ylowastim + eim The IV over each of the subintervals is definedby
σ 2im equiv
int tim
timinus1m
σ 2(s)ds i = 1 m
and we note that var(ylowastim) = E(ylowast2im) = σ 2
im under Assump-tion 1
The RV of plowast is defined by
RV(m)lowast equivmsum
i=1
ylowast2im
and RV(m)lowast is consistent for the IV as m rarrinfin (see eg Protter2005) A feasible asymptotic distribution theory of RV (in rela-tion to IV) was established by Barndorff-Nielsen and Shephard(2002) (see also Meddahi 2002 Mykland and Zhang 2006Gonccedilalves and Meddahi 2005) Whereas RV(m)lowast is an ideal es-timator it is not a feasible estimator because plowast is latent Therealized variance of p given by
RV(m) equivmsum
i=1
y2im
is observable but suffers from a well-known bias problem andis generally inconsistent for the IV (see eg Bandi and Russell2005 Zhang et al 2005)
21 Sampling Schemes
Intraday returns can be constructed using different types ofsampling schemes The special case where tim i = 1 mare equidistant in calendar time [ie δim = (bminus a)m for all i]is referred to as calendar time sampling (CTS) The widely usedexchange rates data from Olsen and associates (see Muumlller et al1990) are equidistant in time and 5-minute sampling (δim =5 min) is often used in practice
CTS requires the construction of artificial prices from theraw (irregularly spaced) price data (transaction prices or quo-tations) Given observed prices at the times t0 lt middot middot middot lt tN onecan construct a price at time τ isin [tj tj+1) using
p(τ )equiv ptj
or
p(τ )equiv ptj +τ minus tj
tj+1 minus tj
(ptj+1 minus ptj
)
The former is known as the previous tick method (Wasserfallenand Zimmermann 1985) and the latter is the linear interpola-tion method (see Andersen and Bollerslev 1997) Both methodshave been discussed by Dacorogna Gencay Muumlller Olsen andPictet (2001 sec 321) When sampling at ultra-high frequen-cies the linear interpolation method has the following unfortu-
nate property where ldquoprarrrdquo denotes convergence in probability
Lemma 1 Let N be fixed and consider the RV based onthe linear interpolation method It holds that RV(m) prarr 0 asm rarrinfin
130 Journal of Business amp Economic Statistics April 2006
The result of Lemma 1 essentially boils down to the fact thatthe quadratic variation of a straight line is zero Although thisis a limit result (as m rarrinfin) the lemma does suggest that thelinear interpolation method is not suitable for the constructionof intraday returns at high frequencies where sampling may oc-cur multiple times between two neighboring price observationsThat the result of Lemma 1 is more than a theoretical artifact isevident from the volatility signature plots of Hansen and Lunde(2003) Given the result of Lemma 1 we avoid the use of thelinear interpolation and use the previous tick method to con-struct CTS intraday returns
The case where tim denotes the time of a transactionquota-tion is referred to as tick time sampling (TTS) An example ofTTS is when tim i = 1 m are chosen to be the time ofevery fifth transaction say
The case where the sampling times t0m tmm are suchthat σ 2
im = IVm for all i = 1 m is known as businesstime sampling (BTS) (see Oomen 2006) Zhou (1998) referredto BTS intraday returns as de-volatized returns and discusseddistributional advantages of BTS returns Whereas tim i =0 m are observable under CTS and TTS they are latentunder BTS because the sampling times are defined from theunobserved volatility path Empirical results of Andersen andBollerslev (1997) and Curci and Corsi (2004) suggest that BTScan be approximated by TTS This feature is nicely capturedin the framework of Oomen (2006) where the (random) ticktimes are generated with an intensity directly related to a quan-tity corresponding to σ 2(t) in the present context Under CTSwe sometimes write RV(x sec) where x seconds is the periodin time spanned by each of the intraday returns (ie δim = xseconds) Similarly we write RV(y tick) under TTS when eachintraday return spans y ticks (transactions or quotations)
22 Characterizing the Bias of the Realized VarianceUnder General Noise
Initially we make the following assumptions about the noiseprocess u
Assumption 2 The noise process u is covariance stationarywith mean 0 such that its autocovariance function is defined byπ(s)equiv E[u(t)u(t + s)]
The covariance function π plays a key role because thebias of RV(m) is tied to the properties of π(s) in the neighbor-hood of 0 Simple examples of noise processes that satisfy As-sumption 2 include the independent noise process which hasπ(s)= 0 for all s = 0 and the OrnsteinndashUhlenbeck processThe latter was used by Aiumlt-Sahalia Mykland and Zhang(2005a) to study estimation in a parametric diffusion modelthat is robust to market microstructure noise
An important aspect of our analysis is that our assumptionsallow for a dependence between u and plowast This is a generaliza-tion of the assumptions made in the existing literature and ourempirical analysis shows that this generalization is needed inparticularly when prices are sampled from quotations
Next we characterize the RV bias under these general as-sumptions for the market microstructure noise u
Theorem 1 Given Assumptions 1 and 2 the bias of the real-ized variance under CTS is given by
E[RV(m) minus IV
]= 2ρm + 2m
[π(0)minus π
(bminus a
m
)] (1)
where ρm equiv E(summ
i=1 ylowastimeim)
The result of Theorem 1 is based on the following decompo-sition of the observed RV
RV(m) =msum
i=1
ylowast2im + 2
msumi=1
eimylowastim +msum
i=1
e2im
wheresumm
i=1 e2im is the ldquorealized variancerdquo of the noise process
u responsible for the last bias term in (1) The dependence be-tween u and plowast that is relevant for our analysis is given in theform of the correlation between the efficient intraday returnsylowastim and the return noise eim By the CauchyndashSchwarz in-equality π(0) ge π(s) for all s such that the bias is alwayspositive when the return noise process eim is uncorrelatedwith the efficient intraday returns ylowastim (because this implies thatρm = 0) Interestingly the total bias can be negative This oc-curs when ρm lt minusm[π(0)minus π(m)] which is the case wherethe downward bias (caused by a negative correlation betweeneim and ylowastim) exceeds the upward bias caused by the ldquorealizedvariancerdquo of u This appears to be the case for the RVs that arebased on quoted prices as shown in Figure 1
The last term of the bias expression in Theorem 1 shows thatthe bias is tied to the properties of π(s) in the neighborhoodof 0 and as m rarrinfin (hence δm rarr 0) we obtain the followingresult
Corollary 1 Suppose that the assumptions of Theorem 1hold and that π(s) is differentiable at 0 Then the asymptoticbias is given by
limmrarrinfinE
[RV(m) minus IV
]= 2ρ minus 2(bminus a)π prime(0)
provided that ρ equiv limmrarrinfin E(summ
i=1 ylowastimeim) is well defined
Under the independent noise assumption we can defineπ prime(0) = minusinfin which is the situation that we analyze in detailin Section 3 A related asymptotic result is obtained wheneverthe quadratic variation of the bivariate process (plowastu)prime is welldefined such that [pp] = [plowastplowast] + 2[plowastu] + [uu] where[XY] denotes the quadratic covariation In this setting we haveIV = [plowastplowast] such that
RV(m) minus IVprarr 2[plowastu] + [uu] (as m rarrinfin)
where ρ = [plowastu] and minus2(b minus a)π prime(0) = [uu] (almost surelyunder additional assumptions)
A volatility signature plot provides an easy way to visuallyinspect the potential bias problems of RV-type estimators Suchplots first appeared in an unpublished thesis by Fang (1996)and were named and made popular by Andersen et al (2000b)Let RV(m)
t denote the RV based on m intraday returns on day tA volatility signature plot displays the sample average
RV(m) equiv nminus1nsum
t=1
RV(m)t
Hansen and Lunde Realized Variance and Market Microstructure Noise 131
Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4
The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents
an approximate 95 confidence interval for the average volatility
132 Journal of Business amp Economic Statistics April 2006
as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)
Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)
t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)
ACNW30 defined in Section 42 The shaded area about
σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B
From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)
A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise
Fact I The noise is negatively correlated with the efficientreturns
Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa
m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm
is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From
cov(ylowastim emidim )= 1
2cov(ylowastim eask
im)+ 1
2cov(ylowastim ebid
im)
we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data
Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened
A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1
2 )2 + ( 12 )2 versus 12] Such dis-
crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias
Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM
3 THE CASE WITH INDEPENDENT NOISE
In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5
We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption
Assumption 3 The noise process satisfies the following
(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t
Hansen and Lunde Realized Variance and Market Microstructure Noise 133
Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA
(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t
The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes
referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise
134 Journal of Business amp Economic Statistics April 2006
Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1
The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)
Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen
var(RV(m)
)= κ12ω4m+ 8ω2msum
i=1
σ 2im
minus (6κ minus 2)ω4 + 2msum
i=1
σ 4im (2)
and
RV(m) minus 2mω2
radicκ12ω4m
=radic
m
3κ
(RV(m)
2mω2minus 1
)drarr N(01)
as m rarrinfin
Here we ldquodrarrrdquo to denote convergence in distribution Thus
unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)
is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim
In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that
var(RV(m)
)= 2msum
i=1
σ 4im = 2
bminus a
m
int b
aσ 4(s)ds+ o
(1
m
)
whereint b
a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)
Next we consider the estimator of Zhou (1996) given by
RV(m)AC1
equivmsum
i=1
y2im +
msumi=1
yimyiminus1m +msum
i=1
yimyi+1m (3)
This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in
much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1
i=2 y2im +
summi=2 yimyiminus1m +summminus1
i=1 yimyi+1m that estimatesint bminusδmma+δ1m
σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)
and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5
Next we formulate results for RV(m)AC1
that are similar to thosefor RV(m) in Lemma 2
Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)
AC1)= IV If Assumption 3(b) also holds then
var(RV(m)
AC1
)= 8ω4m+ 8ω2msum
i=1
σ 2im
minus 6ω4 + 6msum
i=1
σ 4im +O(mminus2)
under CTS and BTS and
RV(m)AC1
minus IVradic8ω4m
drarr N(01) as m rarrinfin
An important result of Lemma 3 is that RV(m)AC1
is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)
AC1can be rewrit-
ten in a way that does not involve squared noise terms u2im
i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)
AC1 has a smaller asymptotic vari-
ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4
vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)
AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges
in distribution to N(01) and simply write RV(m)AC1
radic
8ω4mdrarr
N(01) In other words whereas RV(m)AC1
is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
128 Journal of Business amp Economic Statistics April 2006
The interest for empirical quantities based on high-frequencydata has surged in recent years (see Barndorff-Nielsen andShephard 2007 for a recent survey) The RV is a well-knownquantity that goes back to Merton (1980) Other empiricalquantities include bipower variation and multipower varia-tion which are particularly useful for detecting jumps (seeBarndorff-Nielsen and Shephard 2003 2004 2006ab Ander-sen Bollerslev and Diebold 2003 Bollerslev Kretschmer Pig-orsch and Tauchen 2005 Huang and Tauchen 2005 Tauchenand Zhou 2004) and intraday range-based estimators (seeChristensen and Podolskij 2005) High-frequencyndashbased quan-tities have proven useful for a number of problems For ex-ample several authors have applied filtering and smoothingtechniques to time series of the RV to obtain time seriesfor daily volatility (see eg Maheu and McCurdy 2002Barndorff-Nielsen Nielsen Shephard and Ysusi 1996 Engleand Sun 2005 Frijns and Lehnert 2004 Koopman Jungbackerand Hol 2005 Hansen and Lunde 2005b Owens and Steiger-wald 2005) High-frequencyndashbased quantities are also use-ful in the context of forecasting (see Andersen Bollerslevand Meddahi 2004 Ghysels Santa-Clara and Valkanov 2006)and the evaluation and comparison of volatility models (seeAndersen and Bollerslev 1998 Hansen Lunde and Nason2003 Hansen and Lunde 2005a 2006 Patton 2005)
The RV which is a sum of squared intraday returns yields aperfect estimate of volatility in the ideal situation where pricesare observed continuously and without measurement error (seeeg Merton 1980) This result suggests that the RV should bebased on intraday returns sampled at the highest possible fre-quency (tick-by-tick data) Unfortunately the RV suffers froma well-known bias problem that tends to get worse as the sam-pling frequency of intraday returns increases (see eg Fang1996 Andreou and Ghysels 2002 Oomen 2002 Bai Russelland Tiao 2004) The source of this bias problem is known asmarket microstructure noise and the bias is particularly ev-ident in volatility signature plots (see Andersen BollerslevDiebold and Labys 2000b) Thus there is a trade-off betweenbias and variance when choosing the sampling frequency asdiscussed by Bandi and Russell (2005) and Zhang Myklandand Aiumlt-Sahalia (2005) This trade-off is the reason that theRV is often computed from intraday returns sampled at a mod-erate frequency such as 5-minute or 20-minute sampling
A key insight into the problem of estimating the volatil-ity from high-frequency data comes from its similarity to theproblem of estimating the long-run variance of a stationarytime series In this literature it is well known that autocorre-lation necessitates modifications of the usual sum-of-squaredestimator Those modifications of Newey and West (1987) andAndrews (1991) provided such estimators that are robust to au-tocorrelation Market microstructure noise induces autocorre-lation in the intraday returns and this autocorrelation is thesource of the RVrsquos bias problem Given this connection to long-run variance estimation it is not surprising that ldquoprewhiten-ingrdquo of intraday returns and kernel-based estimators (includingthe closely related subsample-based estimators) are found tobe useful in the present context Zhou (1996) introduced theuse of kernel-based estimators and the subsampling idea todeal with market microstructure noise in high-frequency dataFiltering techniques have been used by Ebens (1999) Ander-sen Bollerslev Diebold and Ebens (2001) and Maheu and
McCurdy (2002) (moving average filter) and Bollen and In-der (2002) (autoregressive filter) Kernel-based estimators wereexplored by Zhou (1996) Hansen and Lunde (2003) andBarndorff-Nielsen Hansen Lunde and Shephard (2004) andthe closely related subsample-based estimators were used in anunpublished paper by Muumlller (1993) and also by Zhou (1996)Zhang et al (2005) and Zhang (2004)
The rest of the article is organized as follows In Section 2we describe our theoretical framework and discuss samplingschemes in calendar time and tick time We also characterizethe bias of the RV under a general specification for the noiseIn Section 3 we consider the case with independent market mi-crostructure noise which has been used by various authors in-cluding Corsi Zumbach Muumlller and Dacorogna (2001) Curciand Corsi (2004) Bandi and Russell (2005) and Zhang et al(2005) We consider a simple kernel-based estimator of Zhou(1996) that we denote by RVAC1 because it uses the first-orderautocorrelation to bias-correct the RV We benchmark RVAC1
to the standard measure of RV and find that the former is su-perior to the latter in terms of the mean squared error (MSE)We also evaluate the implications for some theoretical resultsbased on assumptions in which market microstructure noise isabsent Interestingly we find that the root mean squared er-ror (RMSE) of the RV in the presence of noise is quite simi-lar to those that ignore the noise at low sampling frequenciessuch as 20-minute sampling This finding is important becausemany existing empirical studies have drawn conclusions from20-minute and 30-minute intraday returns using the results ofBarndorff-Nielsen and Shephard (2002) However at 5-minutesampling we find that the ldquotruerdquo confidence interval about theRV can be as much as 100 larger than those based on an ldquoab-sence of noise assumptionrdquo In Section 4 we present a robustestimator that is unbiased for a general type of noise and dis-cuss noise that is time-dependent in both calendar time and ticktime We also discuss the subsampling version of Zhoursquos es-timator which is robust to some forms of time-dependence intick time In Section 5 we describe our data and present mostof our empirical results The key result is the overwhelming ev-idence against the independent noise assumption This findingis quite robust to the choice of sampling method (calendar timeor tick time) and the type of price data (transaction prices orquotation prices) This dependence structure has important im-plications for many quantities based on ultra-highndashfrequencydata These features of the noise have important implicationsfor some of the bias corrections that have been used in the liter-ature Although the independent noise assumption may be fairlyreasonable when the tick size is 116 it is clearly not consistentwith the recent data In fact much of the noise has ldquoevaporatedrdquoafter the tick size is reduced to 1 cent In Section 6 we presenta cointegration analysis of the vector of bid ask and transactionprices The Granger representation makes it possible to decom-pose each of the price series into noise and a common efficientprice Further based on this decomposition we estimate impulseresponse functions that reveal the dynamic effects on bid askand transaction prices as a response to a change in the efficientprice In Section 7 we provide a summary and we concludethe article with three appendixes that provide proofs and detailsabout our estimation methods
Hansen and Lunde Realized Variance and Market Microstructure Noise 129
2 THE THEORETICAL FRAMEWORK
We let plowast(t) denote a latent log-price process in continuoustime and use p(t) to denote the observable log-price processThus the noise process is given by
u(t)equiv p(t)minus plowast(t)
The noise process u may be due to market microstructure ef-fects such as bidndashask bounces but the discrepancy betweenp and plowast can also be induced by the technique used to con-struct p(t) For example p is often constructed artificially fromobserved transactions or quotes using the previous tick methodor the linear interpolation method which we define and discusslater in this section
We work under the following specification for the efficientprice process plowast
Assumption 1 The efficient price process satisfies dplowast(t) =σ(t)dw(t) where w(t) is a standard Brownian motion σ isa random function that is independent of w and σ 2(t) isLipschitz (almost surely)
In our analysis we condition on the volatility path σ 2(t)because our analysis focuses on estimators of the integratedvariance (IV)
IV equivint b
aσ 2(t)dt
Thus we can treat σ 2(t) as deterministic even though weview the volatility path as random The Lipschitz condition isa smoothness condition that requires |σ 2(t) minus σ 2(t + δ)| lt εδ
for some ε and all t and δ (with probability 1) The assump-tion that w and σ are independent is not essential The connec-tion between kernel-based and subsample-based estimators (seeBarndorff-Nielsen et al 2004) shows that weaker assumptionsused by Zhang et al (2005) and Zhang (2004) are sufficient inthis framework
We partition the interval [ab] into m subintervals andm plays a central role in our analysis For example we deriveasymptotic distributions of quantities as m rarrinfin This type ofinfill asymptotics is commonly used in spatial data analysis andgoes back to Stein (1987) Related to the present context is theuse of infill asymptotics for estimation of diffusions (see Bandiand Phillips 2004) For a fixed m the ith subinterval is givenby [timinus1m tim] where a = t0m lt t1m lt middot middot middot lt tmm = b Thelength of the ith subinterval is given by δim equiv tim minus timinus1m andwe assume that supi=1m δim = O( 1
m ) such that the lengthof each subinterval shrinks to 0 as m increases The intradayreturns are now defined by
ylowastim equiv plowast(tim)minus plowast(timinus1m) i = 1 m
and the increments in p and u are defined similarly and denotedby
yim equiv p(tim)minus p(timinus1m) i = 1 m
and
eim equiv u(tim)minus u(timinus1m) i = 1 m
Note that the observed intraday returns decompose into yim =ylowastim + eim The IV over each of the subintervals is definedby
σ 2im equiv
int tim
timinus1m
σ 2(s)ds i = 1 m
and we note that var(ylowastim) = E(ylowast2im) = σ 2
im under Assump-tion 1
The RV of plowast is defined by
RV(m)lowast equivmsum
i=1
ylowast2im
and RV(m)lowast is consistent for the IV as m rarrinfin (see eg Protter2005) A feasible asymptotic distribution theory of RV (in rela-tion to IV) was established by Barndorff-Nielsen and Shephard(2002) (see also Meddahi 2002 Mykland and Zhang 2006Gonccedilalves and Meddahi 2005) Whereas RV(m)lowast is an ideal es-timator it is not a feasible estimator because plowast is latent Therealized variance of p given by
RV(m) equivmsum
i=1
y2im
is observable but suffers from a well-known bias problem andis generally inconsistent for the IV (see eg Bandi and Russell2005 Zhang et al 2005)
21 Sampling Schemes
Intraday returns can be constructed using different types ofsampling schemes The special case where tim i = 1 mare equidistant in calendar time [ie δim = (bminus a)m for all i]is referred to as calendar time sampling (CTS) The widely usedexchange rates data from Olsen and associates (see Muumlller et al1990) are equidistant in time and 5-minute sampling (δim =5 min) is often used in practice
CTS requires the construction of artificial prices from theraw (irregularly spaced) price data (transaction prices or quo-tations) Given observed prices at the times t0 lt middot middot middot lt tN onecan construct a price at time τ isin [tj tj+1) using
p(τ )equiv ptj
or
p(τ )equiv ptj +τ minus tj
tj+1 minus tj
(ptj+1 minus ptj
)
The former is known as the previous tick method (Wasserfallenand Zimmermann 1985) and the latter is the linear interpola-tion method (see Andersen and Bollerslev 1997) Both methodshave been discussed by Dacorogna Gencay Muumlller Olsen andPictet (2001 sec 321) When sampling at ultra-high frequen-cies the linear interpolation method has the following unfortu-
nate property where ldquoprarrrdquo denotes convergence in probability
Lemma 1 Let N be fixed and consider the RV based onthe linear interpolation method It holds that RV(m) prarr 0 asm rarrinfin
130 Journal of Business amp Economic Statistics April 2006
The result of Lemma 1 essentially boils down to the fact thatthe quadratic variation of a straight line is zero Although thisis a limit result (as m rarrinfin) the lemma does suggest that thelinear interpolation method is not suitable for the constructionof intraday returns at high frequencies where sampling may oc-cur multiple times between two neighboring price observationsThat the result of Lemma 1 is more than a theoretical artifact isevident from the volatility signature plots of Hansen and Lunde(2003) Given the result of Lemma 1 we avoid the use of thelinear interpolation and use the previous tick method to con-struct CTS intraday returns
The case where tim denotes the time of a transactionquota-tion is referred to as tick time sampling (TTS) An example ofTTS is when tim i = 1 m are chosen to be the time ofevery fifth transaction say
The case where the sampling times t0m tmm are suchthat σ 2
im = IVm for all i = 1 m is known as businesstime sampling (BTS) (see Oomen 2006) Zhou (1998) referredto BTS intraday returns as de-volatized returns and discusseddistributional advantages of BTS returns Whereas tim i =0 m are observable under CTS and TTS they are latentunder BTS because the sampling times are defined from theunobserved volatility path Empirical results of Andersen andBollerslev (1997) and Curci and Corsi (2004) suggest that BTScan be approximated by TTS This feature is nicely capturedin the framework of Oomen (2006) where the (random) ticktimes are generated with an intensity directly related to a quan-tity corresponding to σ 2(t) in the present context Under CTSwe sometimes write RV(x sec) where x seconds is the periodin time spanned by each of the intraday returns (ie δim = xseconds) Similarly we write RV(y tick) under TTS when eachintraday return spans y ticks (transactions or quotations)
22 Characterizing the Bias of the Realized VarianceUnder General Noise
Initially we make the following assumptions about the noiseprocess u
Assumption 2 The noise process u is covariance stationarywith mean 0 such that its autocovariance function is defined byπ(s)equiv E[u(t)u(t + s)]
The covariance function π plays a key role because thebias of RV(m) is tied to the properties of π(s) in the neighbor-hood of 0 Simple examples of noise processes that satisfy As-sumption 2 include the independent noise process which hasπ(s)= 0 for all s = 0 and the OrnsteinndashUhlenbeck processThe latter was used by Aiumlt-Sahalia Mykland and Zhang(2005a) to study estimation in a parametric diffusion modelthat is robust to market microstructure noise
An important aspect of our analysis is that our assumptionsallow for a dependence between u and plowast This is a generaliza-tion of the assumptions made in the existing literature and ourempirical analysis shows that this generalization is needed inparticularly when prices are sampled from quotations
Next we characterize the RV bias under these general as-sumptions for the market microstructure noise u
Theorem 1 Given Assumptions 1 and 2 the bias of the real-ized variance under CTS is given by
E[RV(m) minus IV
]= 2ρm + 2m
[π(0)minus π
(bminus a
m
)] (1)
where ρm equiv E(summ
i=1 ylowastimeim)
The result of Theorem 1 is based on the following decompo-sition of the observed RV
RV(m) =msum
i=1
ylowast2im + 2
msumi=1
eimylowastim +msum
i=1
e2im
wheresumm
i=1 e2im is the ldquorealized variancerdquo of the noise process
u responsible for the last bias term in (1) The dependence be-tween u and plowast that is relevant for our analysis is given in theform of the correlation between the efficient intraday returnsylowastim and the return noise eim By the CauchyndashSchwarz in-equality π(0) ge π(s) for all s such that the bias is alwayspositive when the return noise process eim is uncorrelatedwith the efficient intraday returns ylowastim (because this implies thatρm = 0) Interestingly the total bias can be negative This oc-curs when ρm lt minusm[π(0)minus π(m)] which is the case wherethe downward bias (caused by a negative correlation betweeneim and ylowastim) exceeds the upward bias caused by the ldquorealizedvariancerdquo of u This appears to be the case for the RVs that arebased on quoted prices as shown in Figure 1
The last term of the bias expression in Theorem 1 shows thatthe bias is tied to the properties of π(s) in the neighborhoodof 0 and as m rarrinfin (hence δm rarr 0) we obtain the followingresult
Corollary 1 Suppose that the assumptions of Theorem 1hold and that π(s) is differentiable at 0 Then the asymptoticbias is given by
limmrarrinfinE
[RV(m) minus IV
]= 2ρ minus 2(bminus a)π prime(0)
provided that ρ equiv limmrarrinfin E(summ
i=1 ylowastimeim) is well defined
Under the independent noise assumption we can defineπ prime(0) = minusinfin which is the situation that we analyze in detailin Section 3 A related asymptotic result is obtained wheneverthe quadratic variation of the bivariate process (plowastu)prime is welldefined such that [pp] = [plowastplowast] + 2[plowastu] + [uu] where[XY] denotes the quadratic covariation In this setting we haveIV = [plowastplowast] such that
RV(m) minus IVprarr 2[plowastu] + [uu] (as m rarrinfin)
where ρ = [plowastu] and minus2(b minus a)π prime(0) = [uu] (almost surelyunder additional assumptions)
A volatility signature plot provides an easy way to visuallyinspect the potential bias problems of RV-type estimators Suchplots first appeared in an unpublished thesis by Fang (1996)and were named and made popular by Andersen et al (2000b)Let RV(m)
t denote the RV based on m intraday returns on day tA volatility signature plot displays the sample average
RV(m) equiv nminus1nsum
t=1
RV(m)t
Hansen and Lunde Realized Variance and Market Microstructure Noise 131
Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4
The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents
an approximate 95 confidence interval for the average volatility
132 Journal of Business amp Economic Statistics April 2006
as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)
Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)
t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)
ACNW30 defined in Section 42 The shaded area about
σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B
From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)
A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise
Fact I The noise is negatively correlated with the efficientreturns
Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa
m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm
is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From
cov(ylowastim emidim )= 1
2cov(ylowastim eask
im)+ 1
2cov(ylowastim ebid
im)
we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data
Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened
A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1
2 )2 + ( 12 )2 versus 12] Such dis-
crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias
Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM
3 THE CASE WITH INDEPENDENT NOISE
In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5
We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption
Assumption 3 The noise process satisfies the following
(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t
Hansen and Lunde Realized Variance and Market Microstructure Noise 133
Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA
(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t
The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes
referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise
134 Journal of Business amp Economic Statistics April 2006
Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1
The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)
Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen
var(RV(m)
)= κ12ω4m+ 8ω2msum
i=1
σ 2im
minus (6κ minus 2)ω4 + 2msum
i=1
σ 4im (2)
and
RV(m) minus 2mω2
radicκ12ω4m
=radic
m
3κ
(RV(m)
2mω2minus 1
)drarr N(01)
as m rarrinfin
Here we ldquodrarrrdquo to denote convergence in distribution Thus
unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)
is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim
In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that
var(RV(m)
)= 2msum
i=1
σ 4im = 2
bminus a
m
int b
aσ 4(s)ds+ o
(1
m
)
whereint b
a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)
Next we consider the estimator of Zhou (1996) given by
RV(m)AC1
equivmsum
i=1
y2im +
msumi=1
yimyiminus1m +msum
i=1
yimyi+1m (3)
This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in
much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1
i=2 y2im +
summi=2 yimyiminus1m +summminus1
i=1 yimyi+1m that estimatesint bminusδmma+δ1m
σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)
and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5
Next we formulate results for RV(m)AC1
that are similar to thosefor RV(m) in Lemma 2
Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)
AC1)= IV If Assumption 3(b) also holds then
var(RV(m)
AC1
)= 8ω4m+ 8ω2msum
i=1
σ 2im
minus 6ω4 + 6msum
i=1
σ 4im +O(mminus2)
under CTS and BTS and
RV(m)AC1
minus IVradic8ω4m
drarr N(01) as m rarrinfin
An important result of Lemma 3 is that RV(m)AC1
is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)
AC1can be rewrit-
ten in a way that does not involve squared noise terms u2im
i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)
AC1 has a smaller asymptotic vari-
ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4
vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)
AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges
in distribution to N(01) and simply write RV(m)AC1
radic
8ω4mdrarr
N(01) In other words whereas RV(m)AC1
is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 129
2 THE THEORETICAL FRAMEWORK
We let plowast(t) denote a latent log-price process in continuoustime and use p(t) to denote the observable log-price processThus the noise process is given by
u(t)equiv p(t)minus plowast(t)
The noise process u may be due to market microstructure ef-fects such as bidndashask bounces but the discrepancy betweenp and plowast can also be induced by the technique used to con-struct p(t) For example p is often constructed artificially fromobserved transactions or quotes using the previous tick methodor the linear interpolation method which we define and discusslater in this section
We work under the following specification for the efficientprice process plowast
Assumption 1 The efficient price process satisfies dplowast(t) =σ(t)dw(t) where w(t) is a standard Brownian motion σ isa random function that is independent of w and σ 2(t) isLipschitz (almost surely)
In our analysis we condition on the volatility path σ 2(t)because our analysis focuses on estimators of the integratedvariance (IV)
IV equivint b
aσ 2(t)dt
Thus we can treat σ 2(t) as deterministic even though weview the volatility path as random The Lipschitz condition isa smoothness condition that requires |σ 2(t) minus σ 2(t + δ)| lt εδ
for some ε and all t and δ (with probability 1) The assump-tion that w and σ are independent is not essential The connec-tion between kernel-based and subsample-based estimators (seeBarndorff-Nielsen et al 2004) shows that weaker assumptionsused by Zhang et al (2005) and Zhang (2004) are sufficient inthis framework
We partition the interval [ab] into m subintervals andm plays a central role in our analysis For example we deriveasymptotic distributions of quantities as m rarrinfin This type ofinfill asymptotics is commonly used in spatial data analysis andgoes back to Stein (1987) Related to the present context is theuse of infill asymptotics for estimation of diffusions (see Bandiand Phillips 2004) For a fixed m the ith subinterval is givenby [timinus1m tim] where a = t0m lt t1m lt middot middot middot lt tmm = b Thelength of the ith subinterval is given by δim equiv tim minus timinus1m andwe assume that supi=1m δim = O( 1
m ) such that the lengthof each subinterval shrinks to 0 as m increases The intradayreturns are now defined by
ylowastim equiv plowast(tim)minus plowast(timinus1m) i = 1 m
and the increments in p and u are defined similarly and denotedby
yim equiv p(tim)minus p(timinus1m) i = 1 m
and
eim equiv u(tim)minus u(timinus1m) i = 1 m
Note that the observed intraday returns decompose into yim =ylowastim + eim The IV over each of the subintervals is definedby
σ 2im equiv
int tim
timinus1m
σ 2(s)ds i = 1 m
and we note that var(ylowastim) = E(ylowast2im) = σ 2
im under Assump-tion 1
The RV of plowast is defined by
RV(m)lowast equivmsum
i=1
ylowast2im
and RV(m)lowast is consistent for the IV as m rarrinfin (see eg Protter2005) A feasible asymptotic distribution theory of RV (in rela-tion to IV) was established by Barndorff-Nielsen and Shephard(2002) (see also Meddahi 2002 Mykland and Zhang 2006Gonccedilalves and Meddahi 2005) Whereas RV(m)lowast is an ideal es-timator it is not a feasible estimator because plowast is latent Therealized variance of p given by
RV(m) equivmsum
i=1
y2im
is observable but suffers from a well-known bias problem andis generally inconsistent for the IV (see eg Bandi and Russell2005 Zhang et al 2005)
21 Sampling Schemes
Intraday returns can be constructed using different types ofsampling schemes The special case where tim i = 1 mare equidistant in calendar time [ie δim = (bminus a)m for all i]is referred to as calendar time sampling (CTS) The widely usedexchange rates data from Olsen and associates (see Muumlller et al1990) are equidistant in time and 5-minute sampling (δim =5 min) is often used in practice
CTS requires the construction of artificial prices from theraw (irregularly spaced) price data (transaction prices or quo-tations) Given observed prices at the times t0 lt middot middot middot lt tN onecan construct a price at time τ isin [tj tj+1) using
p(τ )equiv ptj
or
p(τ )equiv ptj +τ minus tj
tj+1 minus tj
(ptj+1 minus ptj
)
The former is known as the previous tick method (Wasserfallenand Zimmermann 1985) and the latter is the linear interpola-tion method (see Andersen and Bollerslev 1997) Both methodshave been discussed by Dacorogna Gencay Muumlller Olsen andPictet (2001 sec 321) When sampling at ultra-high frequen-cies the linear interpolation method has the following unfortu-
nate property where ldquoprarrrdquo denotes convergence in probability
Lemma 1 Let N be fixed and consider the RV based onthe linear interpolation method It holds that RV(m) prarr 0 asm rarrinfin
130 Journal of Business amp Economic Statistics April 2006
The result of Lemma 1 essentially boils down to the fact thatthe quadratic variation of a straight line is zero Although thisis a limit result (as m rarrinfin) the lemma does suggest that thelinear interpolation method is not suitable for the constructionof intraday returns at high frequencies where sampling may oc-cur multiple times between two neighboring price observationsThat the result of Lemma 1 is more than a theoretical artifact isevident from the volatility signature plots of Hansen and Lunde(2003) Given the result of Lemma 1 we avoid the use of thelinear interpolation and use the previous tick method to con-struct CTS intraday returns
The case where tim denotes the time of a transactionquota-tion is referred to as tick time sampling (TTS) An example ofTTS is when tim i = 1 m are chosen to be the time ofevery fifth transaction say
The case where the sampling times t0m tmm are suchthat σ 2
im = IVm for all i = 1 m is known as businesstime sampling (BTS) (see Oomen 2006) Zhou (1998) referredto BTS intraday returns as de-volatized returns and discusseddistributional advantages of BTS returns Whereas tim i =0 m are observable under CTS and TTS they are latentunder BTS because the sampling times are defined from theunobserved volatility path Empirical results of Andersen andBollerslev (1997) and Curci and Corsi (2004) suggest that BTScan be approximated by TTS This feature is nicely capturedin the framework of Oomen (2006) where the (random) ticktimes are generated with an intensity directly related to a quan-tity corresponding to σ 2(t) in the present context Under CTSwe sometimes write RV(x sec) where x seconds is the periodin time spanned by each of the intraday returns (ie δim = xseconds) Similarly we write RV(y tick) under TTS when eachintraday return spans y ticks (transactions or quotations)
22 Characterizing the Bias of the Realized VarianceUnder General Noise
Initially we make the following assumptions about the noiseprocess u
Assumption 2 The noise process u is covariance stationarywith mean 0 such that its autocovariance function is defined byπ(s)equiv E[u(t)u(t + s)]
The covariance function π plays a key role because thebias of RV(m) is tied to the properties of π(s) in the neighbor-hood of 0 Simple examples of noise processes that satisfy As-sumption 2 include the independent noise process which hasπ(s)= 0 for all s = 0 and the OrnsteinndashUhlenbeck processThe latter was used by Aiumlt-Sahalia Mykland and Zhang(2005a) to study estimation in a parametric diffusion modelthat is robust to market microstructure noise
An important aspect of our analysis is that our assumptionsallow for a dependence between u and plowast This is a generaliza-tion of the assumptions made in the existing literature and ourempirical analysis shows that this generalization is needed inparticularly when prices are sampled from quotations
Next we characterize the RV bias under these general as-sumptions for the market microstructure noise u
Theorem 1 Given Assumptions 1 and 2 the bias of the real-ized variance under CTS is given by
E[RV(m) minus IV
]= 2ρm + 2m
[π(0)minus π
(bminus a
m
)] (1)
where ρm equiv E(summ
i=1 ylowastimeim)
The result of Theorem 1 is based on the following decompo-sition of the observed RV
RV(m) =msum
i=1
ylowast2im + 2
msumi=1
eimylowastim +msum
i=1
e2im
wheresumm
i=1 e2im is the ldquorealized variancerdquo of the noise process
u responsible for the last bias term in (1) The dependence be-tween u and plowast that is relevant for our analysis is given in theform of the correlation between the efficient intraday returnsylowastim and the return noise eim By the CauchyndashSchwarz in-equality π(0) ge π(s) for all s such that the bias is alwayspositive when the return noise process eim is uncorrelatedwith the efficient intraday returns ylowastim (because this implies thatρm = 0) Interestingly the total bias can be negative This oc-curs when ρm lt minusm[π(0)minus π(m)] which is the case wherethe downward bias (caused by a negative correlation betweeneim and ylowastim) exceeds the upward bias caused by the ldquorealizedvariancerdquo of u This appears to be the case for the RVs that arebased on quoted prices as shown in Figure 1
The last term of the bias expression in Theorem 1 shows thatthe bias is tied to the properties of π(s) in the neighborhoodof 0 and as m rarrinfin (hence δm rarr 0) we obtain the followingresult
Corollary 1 Suppose that the assumptions of Theorem 1hold and that π(s) is differentiable at 0 Then the asymptoticbias is given by
limmrarrinfinE
[RV(m) minus IV
]= 2ρ minus 2(bminus a)π prime(0)
provided that ρ equiv limmrarrinfin E(summ
i=1 ylowastimeim) is well defined
Under the independent noise assumption we can defineπ prime(0) = minusinfin which is the situation that we analyze in detailin Section 3 A related asymptotic result is obtained wheneverthe quadratic variation of the bivariate process (plowastu)prime is welldefined such that [pp] = [plowastplowast] + 2[plowastu] + [uu] where[XY] denotes the quadratic covariation In this setting we haveIV = [plowastplowast] such that
RV(m) minus IVprarr 2[plowastu] + [uu] (as m rarrinfin)
where ρ = [plowastu] and minus2(b minus a)π prime(0) = [uu] (almost surelyunder additional assumptions)
A volatility signature plot provides an easy way to visuallyinspect the potential bias problems of RV-type estimators Suchplots first appeared in an unpublished thesis by Fang (1996)and were named and made popular by Andersen et al (2000b)Let RV(m)
t denote the RV based on m intraday returns on day tA volatility signature plot displays the sample average
RV(m) equiv nminus1nsum
t=1
RV(m)t
Hansen and Lunde Realized Variance and Market Microstructure Noise 131
Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4
The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents
an approximate 95 confidence interval for the average volatility
132 Journal of Business amp Economic Statistics April 2006
as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)
Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)
t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)
ACNW30 defined in Section 42 The shaded area about
σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B
From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)
A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise
Fact I The noise is negatively correlated with the efficientreturns
Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa
m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm
is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From
cov(ylowastim emidim )= 1
2cov(ylowastim eask
im)+ 1
2cov(ylowastim ebid
im)
we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data
Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened
A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1
2 )2 + ( 12 )2 versus 12] Such dis-
crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias
Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM
3 THE CASE WITH INDEPENDENT NOISE
In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5
We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption
Assumption 3 The noise process satisfies the following
(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t
Hansen and Lunde Realized Variance and Market Microstructure Noise 133
Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA
(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t
The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes
referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise
134 Journal of Business amp Economic Statistics April 2006
Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1
The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)
Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen
var(RV(m)
)= κ12ω4m+ 8ω2msum
i=1
σ 2im
minus (6κ minus 2)ω4 + 2msum
i=1
σ 4im (2)
and
RV(m) minus 2mω2
radicκ12ω4m
=radic
m
3κ
(RV(m)
2mω2minus 1
)drarr N(01)
as m rarrinfin
Here we ldquodrarrrdquo to denote convergence in distribution Thus
unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)
is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim
In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that
var(RV(m)
)= 2msum
i=1
σ 4im = 2
bminus a
m
int b
aσ 4(s)ds+ o
(1
m
)
whereint b
a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)
Next we consider the estimator of Zhou (1996) given by
RV(m)AC1
equivmsum
i=1
y2im +
msumi=1
yimyiminus1m +msum
i=1
yimyi+1m (3)
This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in
much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1
i=2 y2im +
summi=2 yimyiminus1m +summminus1
i=1 yimyi+1m that estimatesint bminusδmma+δ1m
σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)
and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5
Next we formulate results for RV(m)AC1
that are similar to thosefor RV(m) in Lemma 2
Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)
AC1)= IV If Assumption 3(b) also holds then
var(RV(m)
AC1
)= 8ω4m+ 8ω2msum
i=1
σ 2im
minus 6ω4 + 6msum
i=1
σ 4im +O(mminus2)
under CTS and BTS and
RV(m)AC1
minus IVradic8ω4m
drarr N(01) as m rarrinfin
An important result of Lemma 3 is that RV(m)AC1
is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)
AC1can be rewrit-
ten in a way that does not involve squared noise terms u2im
i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)
AC1 has a smaller asymptotic vari-
ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4
vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)
AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges
in distribution to N(01) and simply write RV(m)AC1
radic
8ω4mdrarr
N(01) In other words whereas RV(m)AC1
is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
130 Journal of Business amp Economic Statistics April 2006
The result of Lemma 1 essentially boils down to the fact thatthe quadratic variation of a straight line is zero Although thisis a limit result (as m rarrinfin) the lemma does suggest that thelinear interpolation method is not suitable for the constructionof intraday returns at high frequencies where sampling may oc-cur multiple times between two neighboring price observationsThat the result of Lemma 1 is more than a theoretical artifact isevident from the volatility signature plots of Hansen and Lunde(2003) Given the result of Lemma 1 we avoid the use of thelinear interpolation and use the previous tick method to con-struct CTS intraday returns
The case where tim denotes the time of a transactionquota-tion is referred to as tick time sampling (TTS) An example ofTTS is when tim i = 1 m are chosen to be the time ofevery fifth transaction say
The case where the sampling times t0m tmm are suchthat σ 2
im = IVm for all i = 1 m is known as businesstime sampling (BTS) (see Oomen 2006) Zhou (1998) referredto BTS intraday returns as de-volatized returns and discusseddistributional advantages of BTS returns Whereas tim i =0 m are observable under CTS and TTS they are latentunder BTS because the sampling times are defined from theunobserved volatility path Empirical results of Andersen andBollerslev (1997) and Curci and Corsi (2004) suggest that BTScan be approximated by TTS This feature is nicely capturedin the framework of Oomen (2006) where the (random) ticktimes are generated with an intensity directly related to a quan-tity corresponding to σ 2(t) in the present context Under CTSwe sometimes write RV(x sec) where x seconds is the periodin time spanned by each of the intraday returns (ie δim = xseconds) Similarly we write RV(y tick) under TTS when eachintraday return spans y ticks (transactions or quotations)
22 Characterizing the Bias of the Realized VarianceUnder General Noise
Initially we make the following assumptions about the noiseprocess u
Assumption 2 The noise process u is covariance stationarywith mean 0 such that its autocovariance function is defined byπ(s)equiv E[u(t)u(t + s)]
The covariance function π plays a key role because thebias of RV(m) is tied to the properties of π(s) in the neighbor-hood of 0 Simple examples of noise processes that satisfy As-sumption 2 include the independent noise process which hasπ(s)= 0 for all s = 0 and the OrnsteinndashUhlenbeck processThe latter was used by Aiumlt-Sahalia Mykland and Zhang(2005a) to study estimation in a parametric diffusion modelthat is robust to market microstructure noise
An important aspect of our analysis is that our assumptionsallow for a dependence between u and plowast This is a generaliza-tion of the assumptions made in the existing literature and ourempirical analysis shows that this generalization is needed inparticularly when prices are sampled from quotations
Next we characterize the RV bias under these general as-sumptions for the market microstructure noise u
Theorem 1 Given Assumptions 1 and 2 the bias of the real-ized variance under CTS is given by
E[RV(m) minus IV
]= 2ρm + 2m
[π(0)minus π
(bminus a
m
)] (1)
where ρm equiv E(summ
i=1 ylowastimeim)
The result of Theorem 1 is based on the following decompo-sition of the observed RV
RV(m) =msum
i=1
ylowast2im + 2
msumi=1
eimylowastim +msum
i=1
e2im
wheresumm
i=1 e2im is the ldquorealized variancerdquo of the noise process
u responsible for the last bias term in (1) The dependence be-tween u and plowast that is relevant for our analysis is given in theform of the correlation between the efficient intraday returnsylowastim and the return noise eim By the CauchyndashSchwarz in-equality π(0) ge π(s) for all s such that the bias is alwayspositive when the return noise process eim is uncorrelatedwith the efficient intraday returns ylowastim (because this implies thatρm = 0) Interestingly the total bias can be negative This oc-curs when ρm lt minusm[π(0)minus π(m)] which is the case wherethe downward bias (caused by a negative correlation betweeneim and ylowastim) exceeds the upward bias caused by the ldquorealizedvariancerdquo of u This appears to be the case for the RVs that arebased on quoted prices as shown in Figure 1
The last term of the bias expression in Theorem 1 shows thatthe bias is tied to the properties of π(s) in the neighborhoodof 0 and as m rarrinfin (hence δm rarr 0) we obtain the followingresult
Corollary 1 Suppose that the assumptions of Theorem 1hold and that π(s) is differentiable at 0 Then the asymptoticbias is given by
limmrarrinfinE
[RV(m) minus IV
]= 2ρ minus 2(bminus a)π prime(0)
provided that ρ equiv limmrarrinfin E(summ
i=1 ylowastimeim) is well defined
Under the independent noise assumption we can defineπ prime(0) = minusinfin which is the situation that we analyze in detailin Section 3 A related asymptotic result is obtained wheneverthe quadratic variation of the bivariate process (plowastu)prime is welldefined such that [pp] = [plowastplowast] + 2[plowastu] + [uu] where[XY] denotes the quadratic covariation In this setting we haveIV = [plowastplowast] such that
RV(m) minus IVprarr 2[plowastu] + [uu] (as m rarrinfin)
where ρ = [plowastu] and minus2(b minus a)π prime(0) = [uu] (almost surelyunder additional assumptions)
A volatility signature plot provides an easy way to visuallyinspect the potential bias problems of RV-type estimators Suchplots first appeared in an unpublished thesis by Fang (1996)and were named and made popular by Andersen et al (2000b)Let RV(m)
t denote the RV based on m intraday returns on day tA volatility signature plot displays the sample average
RV(m) equiv nminus1nsum
t=1
RV(m)t
Hansen and Lunde Realized Variance and Market Microstructure Noise 131
Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4
The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents
an approximate 95 confidence interval for the average volatility
132 Journal of Business amp Economic Statistics April 2006
as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)
Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)
t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)
ACNW30 defined in Section 42 The shaded area about
σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B
From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)
A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise
Fact I The noise is negatively correlated with the efficientreturns
Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa
m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm
is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From
cov(ylowastim emidim )= 1
2cov(ylowastim eask
im)+ 1
2cov(ylowastim ebid
im)
we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data
Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened
A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1
2 )2 + ( 12 )2 versus 12] Such dis-
crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias
Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM
3 THE CASE WITH INDEPENDENT NOISE
In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5
We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption
Assumption 3 The noise process satisfies the following
(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t
Hansen and Lunde Realized Variance and Market Microstructure Noise 133
Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA
(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t
The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes
referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise
134 Journal of Business amp Economic Statistics April 2006
Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1
The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)
Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen
var(RV(m)
)= κ12ω4m+ 8ω2msum
i=1
σ 2im
minus (6κ minus 2)ω4 + 2msum
i=1
σ 4im (2)
and
RV(m) minus 2mω2
radicκ12ω4m
=radic
m
3κ
(RV(m)
2mω2minus 1
)drarr N(01)
as m rarrinfin
Here we ldquodrarrrdquo to denote convergence in distribution Thus
unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)
is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim
In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that
var(RV(m)
)= 2msum
i=1
σ 4im = 2
bminus a
m
int b
aσ 4(s)ds+ o
(1
m
)
whereint b
a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)
Next we consider the estimator of Zhou (1996) given by
RV(m)AC1
equivmsum
i=1
y2im +
msumi=1
yimyiminus1m +msum
i=1
yimyi+1m (3)
This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in
much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1
i=2 y2im +
summi=2 yimyiminus1m +summminus1
i=1 yimyi+1m that estimatesint bminusδmma+δ1m
σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)
and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5
Next we formulate results for RV(m)AC1
that are similar to thosefor RV(m) in Lemma 2
Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)
AC1)= IV If Assumption 3(b) also holds then
var(RV(m)
AC1
)= 8ω4m+ 8ω2msum
i=1
σ 2im
minus 6ω4 + 6msum
i=1
σ 4im +O(mminus2)
under CTS and BTS and
RV(m)AC1
minus IVradic8ω4m
drarr N(01) as m rarrinfin
An important result of Lemma 3 is that RV(m)AC1
is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)
AC1can be rewrit-
ten in a way that does not involve squared noise terms u2im
i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)
AC1 has a smaller asymptotic vari-
ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4
vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)
AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges
in distribution to N(01) and simply write RV(m)AC1
radic
8ω4mdrarr
N(01) In other words whereas RV(m)AC1
is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 131
Figure 1 Volatility Signature Plots for RV t Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and TransactionPrices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling in contrastto the bottom rows that are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4
The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents
an approximate 95 confidence interval for the average volatility
132 Journal of Business amp Economic Statistics April 2006
as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)
Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)
t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)
ACNW30 defined in Section 42 The shaded area about
σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B
From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)
A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise
Fact I The noise is negatively correlated with the efficientreturns
Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa
m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm
is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From
cov(ylowastim emidim )= 1
2cov(ylowastim eask
im)+ 1
2cov(ylowastim ebid
im)
we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data
Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened
A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1
2 )2 + ( 12 )2 versus 12] Such dis-
crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias
Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM
3 THE CASE WITH INDEPENDENT NOISE
In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5
We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption
Assumption 3 The noise process satisfies the following
(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t
Hansen and Lunde Realized Variance and Market Microstructure Noise 133
Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA
(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t
The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes
referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise
134 Journal of Business amp Economic Statistics April 2006
Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1
The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)
Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen
var(RV(m)
)= κ12ω4m+ 8ω2msum
i=1
σ 2im
minus (6κ minus 2)ω4 + 2msum
i=1
σ 4im (2)
and
RV(m) minus 2mω2
radicκ12ω4m
=radic
m
3κ
(RV(m)
2mω2minus 1
)drarr N(01)
as m rarrinfin
Here we ldquodrarrrdquo to denote convergence in distribution Thus
unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)
is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim
In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that
var(RV(m)
)= 2msum
i=1
σ 4im = 2
bminus a
m
int b
aσ 4(s)ds+ o
(1
m
)
whereint b
a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)
Next we consider the estimator of Zhou (1996) given by
RV(m)AC1
equivmsum
i=1
y2im +
msumi=1
yimyiminus1m +msum
i=1
yimyi+1m (3)
This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in
much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1
i=2 y2im +
summi=2 yimyiminus1m +summminus1
i=1 yimyi+1m that estimatesint bminusδmma+δ1m
σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)
and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5
Next we formulate results for RV(m)AC1
that are similar to thosefor RV(m) in Lemma 2
Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)
AC1)= IV If Assumption 3(b) also holds then
var(RV(m)
AC1
)= 8ω4m+ 8ω2msum
i=1
σ 2im
minus 6ω4 + 6msum
i=1
σ 4im +O(mminus2)
under CTS and BTS and
RV(m)AC1
minus IVradic8ω4m
drarr N(01) as m rarrinfin
An important result of Lemma 3 is that RV(m)AC1
is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)
AC1can be rewrit-
ten in a way that does not involve squared noise terms u2im
i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)
AC1 has a smaller asymptotic vari-
ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4
vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)
AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges
in distribution to N(01) and simply write RV(m)AC1
radic
8ω4mdrarr
N(01) In other words whereas RV(m)AC1
is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
132 Journal of Business amp Economic Statistics April 2006
as a function of the sampling frequencies m where the averageis taken over multiple periods (typically trading days)
Figure 1 presents volatility signature plots for AA (left) andMSFT (right) using both CTS (rows 1 and 2) and TTS (rows3 and 4) and based on both transaction data and quotationdata The signature plots are based on daily RVs from theyears 2000 (rows 1 and 3) and 2004 (rows 2 and 4) whereRV(m)
t is calculated from intraday returns spanning the period930 AM to 1600 PM (the hours that the exchanges are open)The horizontal line represents an estimate of the average IVσ 2 equiv RV(1 tick)
ACNW30 defined in Section 42 The shaded area about
σ 2 represents an approximate 95 confidence interval for theaverage volatility These confidence intervals are computed us-ing a method described in Appendix B
From Figure 1 we see that the RVs based on low and mod-erate frequencies appear to be approximately unbiased How-ever at higher frequencies the RV becomes unreliable and themarket microstructure effects are pronounced at the ultra-highfrequencies particularly for transaction prices For exampleRV(1 sec) is about 47 for MSFT in 2000 whereas RV(1 min) ismuch smaller (about 60)
A very important result of Figure 1 is that the volatility sig-nature plots for mid-quotes drop (rather than increases) as thesampling frequency increases (as δim rarr 0) This holds for bothCTS and TTS Thus these volatility signature plots providethe first piece of evidence for the ugly facts about market mi-crostructure noise
Fact I The noise is negatively correlated with the efficientreturns
Our theoretical results show that ρm must be responsible forthe negative bias of RV(m) The other bias term 2m[π(0) minusπ( bminusa
m )] is always nonnegative such that time dependence inthe noise process cannot (by itself ) explain the negative biasseen in the volatility signature plots for mid-quotes So Fig-ure 1 strongly suggests that the innovations in the noise processeim are negatively correlated with the efficient returns ylowastimAlthough this phenomenon is most evident for mid-quotes itis quite plausible that the efficient return is also correlated witheach of the noise processes embedded in the three other priceseries bid ask and transaction prices At this point it is worthrecalling Colin Sautarrsquos words ldquoJust because yoursquore not para-noid doesnrsquot mean theyrsquore not out to get yourdquo Similarly justbecause we cannot see a negative bias does not mean that ρm
is 0 In fact if ρm gt 0 then it would not be exposed in a simplemanner in a volatility signature plot From
cov(ylowastim emidim )= 1
2cov(ylowastim eask
im)+ 1
2cov(ylowastim ebid
im)
we see that the noise in bid andor ask quotes must be correlatedwith the efficient prices if the noise in mid-quotes is found tobe correlated with the efficient price In Section 6 we presentadditional evidence of this correlation which is also found fortransaction data
Nonsynchronous revisions of bid and ask quotes when theefficient price changes is a possible explanation for the nega-tive correlation between noise and efficient returns An upwardmovement in prices often causes the ask price to increase beforethe bid does whereby the bidndashask spread is temporary widened
A similar widening of the spread occurs when prices go downThis has implications for the quadratic variation of mid-quotesbecause a one-tick price increment is divided into two half-tickincrements resulting in quadratic terms that add up to only halfthat of the bid or ask price [( 1
2 )2 + ( 12 )2 versus 12] Such dis-
crete revisions of the observed price toward the effective pricehas been used in a very interesting framework by Large (2005)who showed that this may result in a negative bias
Figure 2 presents typical trading scenarios for AA dur-ing three 20-minute periods on April 24 2004 The prevail-ing bid and ask prices are given by the edges of the shadedarea and the dots represents actual transaction prices That thespread tends to get wider when prices move up or down isseen in many places such as the minutes after 1000 AM andaround 1215 PM
3 THE CASE WITH INDEPENDENT NOISE
In this section we analyze the special case where the noiseprocess is assumed to be of an independent type Our assump-tions which we make precise in Assumption 3 essentiallyamount to assuming that π(s) = 0 for all s = 0 and plowast perpperp uwhere we use ldquoperpperprdquo to denote stochastic independence Mostof the existing literature has established results assuming thiskind of noise and in this section we shall draw on several im-portant results from Zhou (1996) Bandi and Russell (2005)and Zhang et al (2005) Although we have already dismissedthis form of noise as an accurate description of the noise inour data there are several good arguments for analyzing theproperties of the RV and related quantities under this assump-tion The independent noise assumption makes the analysistractable and provides valuable insight into the issues related tomarket microstructure noise Furthermore although the inde-pendent noise assumption is inaccurate at ultra-high samplingfrequencies the implications of this assumptions may be validat lower sampling frequencies For example it may be reason-able to assume that the noise is independent when prices aresampled every minute On the other hand for some purposesthe independent noise assumption can be quite misleading aswe discuss in Section 5
We focus on a kernel estimator originally proposed by Zhou(1996) that incorporates the first-order autocovariance A sim-ilar estimator was applied to daily return series by FrenchSchwert and Stambaugh (1987) Our use of this estimator hasthree purposes First we compare this simple bias-correctedversion of the realized variance to the standard measure ofthe realized variance and find that these results are gener-ally quite favorable to the bias-corrected estimator Secondour analysis makes it possible to quantify the accuracy of re-sults based on no-noise assumptions such as the asymptoticresults by Jacod (1994) Jacod and Protter (1998) Barndorff-Nielsen and Shephard (2002) and Mykland and Zhang (2006)and to evaluate whether the bias-corrected estimator is less sen-sitive to market microstructure noise Finally we use the bias-corrected estimator to analyze the validity of the independentnoise assumption
Assumption 3 The noise process satisfies the following
(a) plowast perpperp u u(s) perpperp u(t) for all s = t and E[u(t)] = 0 forall t
Hansen and Lunde Realized Variance and Market Microstructure Noise 133
Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA
(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t
The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes
referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise
134 Journal of Business amp Economic Statistics April 2006
Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1
The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)
Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen
var(RV(m)
)= κ12ω4m+ 8ω2msum
i=1
σ 2im
minus (6κ minus 2)ω4 + 2msum
i=1
σ 4im (2)
and
RV(m) minus 2mω2
radicκ12ω4m
=radic
m
3κ
(RV(m)
2mω2minus 1
)drarr N(01)
as m rarrinfin
Here we ldquodrarrrdquo to denote convergence in distribution Thus
unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)
is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim
In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that
var(RV(m)
)= 2msum
i=1
σ 4im = 2
bminus a
m
int b
aσ 4(s)ds+ o
(1
m
)
whereint b
a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)
Next we consider the estimator of Zhou (1996) given by
RV(m)AC1
equivmsum
i=1
y2im +
msumi=1
yimyiminus1m +msum
i=1
yimyi+1m (3)
This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in
much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1
i=2 y2im +
summi=2 yimyiminus1m +summminus1
i=1 yimyi+1m that estimatesint bminusδmma+δ1m
σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)
and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5
Next we formulate results for RV(m)AC1
that are similar to thosefor RV(m) in Lemma 2
Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)
AC1)= IV If Assumption 3(b) also holds then
var(RV(m)
AC1
)= 8ω4m+ 8ω2msum
i=1
σ 2im
minus 6ω4 + 6msum
i=1
σ 4im +O(mminus2)
under CTS and BTS and
RV(m)AC1
minus IVradic8ω4m
drarr N(01) as m rarrinfin
An important result of Lemma 3 is that RV(m)AC1
is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)
AC1can be rewrit-
ten in a way that does not involve squared noise terms u2im
i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)
AC1 has a smaller asymptotic vari-
ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4
vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)
AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges
in distribution to N(01) and simply write RV(m)AC1
radic
8ω4mdrarr
N(01) In other words whereas RV(m)AC1
is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 133
Figure 2 Bid and Ask Quotes (defined by the shaded area) and Actual Transaction Prices ( ) Over Three 20-Minute Subperiods on April 242004 for AA
(b) ω2 equiv E|u(t)|2 ltinfin for all t(c) micro4 equiv E|u(t)|4 ltinfin for all t
The independent noise u induces an MA(1) structure on thereturn noise eim which is why this type of noise is sometimes
referred to as MA(1) noise However eim has a very particularMA(1) structure because it has a unit root Thus the MA(1)label does not fully characterize the properties of the noiseThis is why we prefer to call this type of noise independentnoise
134 Journal of Business amp Economic Statistics April 2006
Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1
The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)
Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen
var(RV(m)
)= κ12ω4m+ 8ω2msum
i=1
σ 2im
minus (6κ minus 2)ω4 + 2msum
i=1
σ 4im (2)
and
RV(m) minus 2mω2
radicκ12ω4m
=radic
m
3κ
(RV(m)
2mω2minus 1
)drarr N(01)
as m rarrinfin
Here we ldquodrarrrdquo to denote convergence in distribution Thus
unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)
is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim
In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that
var(RV(m)
)= 2msum
i=1
σ 4im = 2
bminus a
m
int b
aσ 4(s)ds+ o
(1
m
)
whereint b
a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)
Next we consider the estimator of Zhou (1996) given by
RV(m)AC1
equivmsum
i=1
y2im +
msumi=1
yimyiminus1m +msum
i=1
yimyi+1m (3)
This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in
much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1
i=2 y2im +
summi=2 yimyiminus1m +summminus1
i=1 yimyi+1m that estimatesint bminusδmma+δ1m
σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)
and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5
Next we formulate results for RV(m)AC1
that are similar to thosefor RV(m) in Lemma 2
Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)
AC1)= IV If Assumption 3(b) also holds then
var(RV(m)
AC1
)= 8ω4m+ 8ω2msum
i=1
σ 2im
minus 6ω4 + 6msum
i=1
σ 4im +O(mminus2)
under CTS and BTS and
RV(m)AC1
minus IVradic8ω4m
drarr N(01) as m rarrinfin
An important result of Lemma 3 is that RV(m)AC1
is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)
AC1can be rewrit-
ten in a way that does not involve squared noise terms u2im
i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)
AC1 has a smaller asymptotic vari-
ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4
vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)
AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges
in distribution to N(01) and simply write RV(m)AC1
radic
8ω4mdrarr
N(01) In other words whereas RV(m)AC1
is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
134 Journal of Business amp Economic Statistics April 2006
Some of the results that we formulate in this section onlyrely on Assumption 3(a) so we require only (b) and (c) to holdwhen necessary Note that ω2 which is defined in (b) corre-sponds to π(0) in our previous notation To simplify some ofour subsequent expressions we define the ldquoexcess kurtosis ra-tiordquo κ equiv micro4(3ω4) and note that Assumption 3 is satisfiedif u is a Gaussian ldquowhite noiserdquo process u(t) sim N(0ω2) inwhich case κ = 1
The existence of a noise process u that satisfies Assump-tion 3 follows directly from Kolmogorovrsquos existence theo-rem (see Billingsley 1995 chap 7) It is worthwhile to notethat ldquowhite noise processes in continuous timerdquo are very er-ratic processes In fact the quadratic variation of a white noiseprocess is unbounded (as is the r-tic variation for any other in-teger) Thus the ldquorealized variancerdquo of a white noise processdiverges to infinity in probability as the sampling frequencym is increased This is in stark contrast to the situation forBrownian-type processes that have finite r-tic variation forr ge 2 (see Barndorff-Nielsen and Shephard 2003)
Lemma 2 Given Assumptions 1 and 3(a) and (b) we havethat E(RV(m)) = IV + 2mω2 if Assumption 3(c) also holdsthen
var(RV(m)
)= κ12ω4m+ 8ω2msum
i=1
σ 2im
minus (6κ minus 2)ω4 + 2msum
i=1
σ 4im (2)
and
RV(m) minus 2mω2
radicκ12ω4m
=radic
m
3κ
(RV(m)
2mω2minus 1
)drarr N(01)
as m rarrinfin
Here we ldquodrarrrdquo to denote convergence in distribution Thus
unlike the situation in Corollary 1 where the noise is time-dependent and the asymptotic bias is finite [whenever π prime(0)
is finite] this situation with independent market microstructurenoise leads to a bias that diverges to infinity This result was firstderived in an unpublished thesis by Fang (1996) The expres-sion for the variance [see (2)] is due to Bandi and Russell (2005)and Zhang et al (2005) the former expressed (2) in terms of themoments of the return noise eim
In the absence of market microstructure noise and underCTS [ω2 = 0 and δim = (b minus a)m] we recognize a result ofBarndorff-Nielsen and Shephard (2002) that
var(RV(m)
)= 2msum
i=1
σ 4im = 2
bminus a
m
int b
aσ 4(s)ds+ o
(1
m
)
whereint b
a σ 4(s)ds is known as the integrated quarticity intro-duced by Barndorff-Nielsen and Shephard (2002)
Next we consider the estimator of Zhou (1996) given by
RV(m)AC1
equivmsum
i=1
y2im +
msumi=1
yimyiminus1m +msum
i=1
yimyi+1m (3)
This estimator incorporates the empirical first-order autoco-variance which amounts to a bias correction that ldquoworksrdquo in
much the same way that robust covariance estimators suchas that of Newey and West (1987) achieve their consistencyNote that (3) involves y0m and ym+1m which are intradayreturns outside the interval [ab] If these two intraday re-turns are unavailable then one could simply use the estimatorsummminus1
i=2 y2im +
summi=2 yimyiminus1m +summminus1
i=1 yimyi+1m that estimatesint bminusδmma+δ1m
σ 2(s)ds = IV + O( 1m ) Here we follow Zhou (1996)
and use the formulation in (3) because it simplifies the analysisand several expressions Our empirical implementation is basedon a version that does not rely on intraday returns outside the[ab] interval We describe the exact implementation in Sec-tion 5
Next we formulate results for RV(m)AC1
that are similar to thosefor RV(m) in Lemma 2
Lemma 3 Given Assumptions 1 and 3(a) we have thatE(RV(m)
AC1)= IV If Assumption 3(b) also holds then
var(RV(m)
AC1
)= 8ω4m+ 8ω2msum
i=1
σ 2im
minus 6ω4 + 6msum
i=1
σ 4im +O(mminus2)
under CTS and BTS and
RV(m)AC1
minus IVradic8ω4m
drarr N(01) as m rarrinfin
An important result of Lemma 3 is that RV(m)AC1
is unbiased forthe IV at any sampling frequency m Also note that Lemma 3requires slightly weaker assumptions than those needed forRV(m) in Lemma 2 The first result relies on only Assump-tion 3(a) (c) is not needed for the variance expression Thisis achieved because the expression for RV(m)
AC1can be rewrit-
ten in a way that does not involve squared noise terms u2im
i = 1 m as does the expression for RV(m) where uim equivu(tim) A somewhat remarkable result of Lemma 3 is that thebias-corrected estimator RV(m)
AC1 has a smaller asymptotic vari-
ance (as m rarrinfin) than the unadjusted estimator RV(m) (8mω4
vs 12κmω4) Usually bias correction is accompanied by alarger asymptotic variance Also note that the asymptotic re-sults of Lemma 3 are somewhat more useful than those ofLemma 2 (in terms of estimating IV) because the results ofLemma 2 do not involve the object of interest IV but shedlight only on aspects of the noise process This property wasused by Bandi and Russell (2005) and Zhang et al (2005)to estimate ω2 we discuss this aspect in more detail in ourempirical analysis in Section 5 It is important to note thatthe asymptotic results of Lemma 3 do not suggest that RV(m)
AC1should be based on intraday returns sampled at the highest pos-sible frequency because the asymptotic variance is increasingin m Thus we could drop IV from the quantity that converges
in distribution to N(01) and simply write RV(m)AC1
radic
8ω4mdrarr
N(01) In other words whereas RV(m)AC1
is ldquocenteredrdquo aboutthe object of interest IV it is unlikely to be close to IV asm rarrinfin
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 135
In the absence of market microstructure noise (ω2 = 0) wenote that
var[RV(m)
AC1
]asymp 6msum
i=1
σ 4im
which shows that the variance of RV(m)AC1
is about three timeslarger than that of RV(m) when ω2 = 0 Thus in the absence ofnoise we see an increase in the asymptotic variance as a resultof the bias correction Interestingly this increase in the varianceis identical to that of the maximum likelihood estimator in aGaussian specification where σ 2(s) is constant and ω2 = 0 (seeAiumlt-Sahalia et al 2005a)
It is easy to show that τ lowasti = cm i = 1 m solves thefollowing constrained minimization problem
minτ1τm
msumi=1
τ 2i subject to
msumi=1
τi = c
Thus if we set τi = σ 2im and c = IV then we see that
summi=1 σ 4
im(for fixed m) is minimized under BTS This highlights one ofthe advantages of BTS over CTS This result was shown to holdin a related (pure jump) framework by Oomen (2005) In thepresent context we have that under BTS
summi=1 σ 4
im = IV2mand specifically it holds that
IV2
mle
int b
aσ 4(s)ds
bminus a
m
The variance expression under CTS [δim = (b minus a)m] is ap-proximately given by
var[RV(m)
AC1
]asymp 8ω4m+ 8ω2int b
aσ 2(s)ds
minus 6ω4 + 6bminus a
m
int b
aσ 4(s)ds
Next we compare RV(m)AC1
and RV(m) in terms of their MSEsand their respective optimal sampling frequencies for a specialcase that reveals key features of the two estimators
Corollary 2 Define λ equiv ω2IV suppose that κ = 1 and lett0m tmm be such that σ 2
im = IVm (BTS) The MSEs aregiven by
MSE(RV(m)
)= IV2[
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 21
m
](4)
and
MSE(RV(m)
AC1
)= IV2[
8λ2m+ 8λminus 6λ2 + 61
mminus2
1
m2
] (5)
The optimal sampling frequencies for RV(m) and RV(m)AC1
are
given implicitly as the real (positive) solutions to 4λ2m3 +6λ2m2 minus 1 = 0 and 4λ2m3 minus 3m+ 2 = 0
We denote the optimal sampling frequencies for RV(m) andRV(m)
AC1by mlowast
0 and mlowast1 These are approximately given by
mlowast0 asymp (2λ)minus23 and mlowast
1 asympradic
3(2λ)minus1
The expression for mlowast0 was derived in Bandi and Russell (2005)
and Zhang et al (2005) under more general conditions than
those used in Corollary 2 whereas the expression for mlowast1 was
derived earlier by Zhou (1996)In our empirical analysis we often find that λ le 10minus3 such
that
mlowast1mlowast
0 asymp 3122minus13(λminus1)13 ge 10
which shows that mlowast1 is several times larger than mlowast
0 when thenoise-to-signal is as small as we find it to be in practice Inother words RV(m)
AC1permits more frequent sampling than does
the ldquooptimalrdquo RV This is quite intuitive because RV(m)AC1
canuse more information in the data without being affected by asevere bias Naturally when TTS is used the number of in-traday returns m cannot exceed the total number of trans-actionsquotations so in practice it might not be possible tosample as frequently as prescribed by mlowast
1 Furthermore theseresults rely on the independent noise assumption which maynot hold at the highest sampling frequencies
Corollary 2 captures the salient features of this problem andcharacterizes the MSE properties of RV(m) and RV(m)
AC1in terms
of a single parameter λ (noise-to-signal) Thus the simplifyingassumptions of Corollary 2 yield an attractive framework forcomparing RV(m) and RV(m)
AC1and for analyzing their (lack of )
robustness to market microstructure noiseFrom Corollary 2 we note that the RMSEs of RV(m) and
RV(m)AC1
are proportional to the IV and given by r0(λm)IV andr1(λm)IV where
r0(λm)equivradic
4λ2m2 + 12λ2m+ 8λminus 4λ2 + 2
m
and
r1(λm)equivradic
8λ2m+ 8λminus 6λ2 + 6
mminus 2
m2
Figure 3 plots r0(λm) and r1(λm) using empirical estimatesof λ The estimates are based on high-frequency stock returnsof Alcoa (left panels) and Microsoft (right panels) in year the2000 The details about the estimation of λ are deferred to Sec-tion 5 The upper panels present r0(λm) and r1(λm) wherethe x-axis is δim = (b minus a)m in units of seconds For both eq-uities we note that the RV(m)
AC1dominates the RV(m) except at
the very lowest frequencies The minimums of r0(λm) andr1(λm) identify their respective optimal sampling frequen-cies mlowast
0 and mlowast1 For the AA returns we find that the opti-
mal sampling frequencies are mlowast0AA = 44 and mlowast
1AA = 511(corresponding to intraday returns spanning 9 minutes and46 seconds) and that the theoretical reduction of the RMSE is331 The curvatures of r0(λm) and r1(λm) in the neigh-borhood of mlowast
0 and mlowast1 show that RV(m)
AC1is less sensitive than
RV(m) to the choice of mThe middle panels of Figure 3 display the relative RMSE of
RV(m)AC1
to that of (the optimal) RV(mlowast0) and the relative RMSE of
RV(m) to that of (the optimal) RV(mlowast
1)
AC1 These panels show that
the RV(m)AC1
continues to dominate the ldquooptimalrdquo RV(mlowast0) for a
wide ranges of frequencies not just in a small neighborhood ofthe optimal value mlowast
1 This robustness of RVAC1 is quite use-ful in practice where λ and (hence) mlowast
1 are not known withcertainty The result shows that a reasonably precise estimate
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
136 Journal of Business amp Economic Statistics April 2006
Figure 3 RMSE Properties of RV and RV AC1Under Independent Market Microstructure Noise Using Empirical Estimates of λ for 2000 The up-
per panels display the RMSEs for RV and RV AC1using estimates of λ r0(λ m) and r1(λm) and the corresponding RMSEs in the absence of noise
r0(0 m) and r1(0 m) The middle panels are the relative RMSEs of RV(m) and RV(m)AC1
to RV (mlowast0 ) and RV
(mlowast1)
AC1 as defined by r0(λ m)r1(λ mlowast
1) and
r1(λ m)r0(λ mlowast0) The lower panels show the percentage increase in the RMSE for different sampling frequencies caused by market microstructure
noise The x-axis gives the sampling frequency of intraday returns as defined by δi m = (b minus a)m in units of seconds where b minus a = 65 hours(a trading day)
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 137
of λ (and hence mlowast1) will lead to a RVAC1 that dominates RV
This result is not surprising because recent developments inthis literature have shown that it is possible to construct kernel-based estimators that are even more accurate than RVAC1 (seeBarndorff-Nielsen et al 2004 Zhang 2004)
A second very interesting aspect that can be analyzed basedon the results of Corollary 2 is the accuracy of theoretical re-sults derived under the assumption that λ = 0 (no market mi-crostructure noise) For example the accuracy of a confidenceinterval for IV which is based on asymptotic results that ignorethe presence of noise will depend on λ and m The expres-sions of Corollary 2 provide a simple way to quantify the the-oretical accuracy of such confidence intervals including thoseof Barndorff-Nielsen and Shephard (2002) Figure 3 providesvaluable information on this question The upper panels of Fig-ure 3 present the RMSEs of RV(m) and RV(m)
AC1 using both
λ gt 0 (the case with noise) and λ= 0 (the case without noise)For small values of m we see that r0(λm) asymp r0(0m) andr1(λm) asymp r1(0m) whereas the effects of market microstruc-ture noise are pronounced at the higher sampling frequenciesThe lower panels of Figure 3 quantify the discrepancy betweenthe two ldquotypesrdquo of RMSEs as a function of the sampling fre-quency These plots present 100[r0(λm) minus r0(0m)]r0(0m)
and 100[r1(λm)minus r1(0m)]r1(0m) as a function of m Thusthe former reveals the percentage increase of the RVrsquos RMSEdue to market microstructure noise and the second line simi-larly shows the increase of the RVAC1 rsquos RMSE due to noiseThe increase in the RMSE may be translated into a widening ofa confidence intervals for IV (about RV(m) or RV(m)
AC1) The ver-
tical lines in the right panels mark the sampling frequency cor-responding to 5-minute sampling under CTS and show that theldquoactualrdquo confidence interval (based on RV(m)) is 10594 largerthan the ldquono-noiserdquo confidence interval for AA whereas the en-largement is 2237 for MSFT At 20-minute sampling the dis-crepancy is less than a couple of percent so in this case thesize distortion from being oblivious to market microstructurenoise is quite small The corresponding increases in the RMSEof RV(m)
AC1are 941 and 307 Thus a ldquono-noiserdquo confidence
interval about RV(m)AC1
is more reliable than that about RV(m) atmoderate sampling frequencies Here we have used an estima-tor of λ based on data from the year 2000 before the tick sizewas reduced to 1 cent In our empirical analysis we find thenoise to be much smaller in recent years such that ldquono-noiseapproximationsrdquo are likely to be more accurate after decimal-ization of the tick size
Figure 4 presents the volatility signature plots for RV(m)AC1
where we have used the same scale as in Figure 1 When sam-pling in calender time (the four upper panels) we see a pro-nounced bias in RV(m)
AC1when intraday returns are sampled more
frequently than every 30 seconds The main explanation for thisis that CTS will sample the same price multiple times when m islarge which induces (artificial) autocorrelation in intraday re-turns Thus when intraday returns are based on CTS it is nec-essary to incorporate higher-order autocovariances of yim whenm becomes large The plots in rows 3 and 4 are signature plotswhen intraday returns are sampled in tick time These also re-veal a bias in RV(x tick)
AC1at the highest frequencies which shows
that the noise is time dependent in tick time For example the
MSFT 2000 plot suggests that the time dependence lasts for30 ticks perhaps longer
Fact II The noise is autocorrelated
We provide additional evidence of this fact based on otherempirical quantities in the following sections
4 THE CASE WITH DEPENDENT NOISE
In this section we consider the case where the noise istime-dependent and possibly correlated with the efficient re-turns ylowastim Following earlier versions of the present articleissues related to time dependence and noisendashprice correlationhave been addressed by others including Aiumlt-Sahalia et al(2005b) Frijns and Lehnert (2004) and Zhang (2004) The timescale of the dependence in the noise plays a role in the asymp-totic analysis Although the ldquoclockrdquo at which the memory in thenoise decays can follow any time scale it seems reasonable forit to be tied to calendar time tick time or a combination of thetwo We first consider a situation where the time dependenceis specific to calendar time then consider the case with timedependence in tick time
41 Dependence in Calender Time
To bias-correct the RV under the general time-dependenttype of noise we make the following assumption about the timedependence in the noise process
Assumption 4 The noise process has finite dependence inthe sense that π(s)= 0 for all s gt θ0 for some finite θ0 ge 0 andE[u(t)|plowast(s)] = 0 for all |t minus s|gt θ0
The assumption is trivially satisfied under the independentnoise assumption used in Section 3 A more interesting class ofnoise processes with finite dependence are those of the mov-ing average type u(t) = int t
tminusθ0ψ(t minus s)dB(s) where B(s) rep-
resents a standard Brownian motion and ψ(s) is a bounded(nonrandom) function on [0 θ0] The autocorrelation functionfor a process of this kind is given by π(s)= int θ0
s ψ(t)ψ(tminus s)dtfor s isin [0 θ0]
Theorem 2 Suppose that Assumptions 1 2 and 4 hold andlet qm be such that qmm gt θ0 Then (under CTS)
E(RV(m)
ACqmminus IV
)= 0
where
RV(m)ACqm
equivmsum
i=1
y2im +
qmsumh=1
msumi=1
(yiminushmyim + yimyi+hm)
A drawback of RV(m)ACqm
is that it may produce a negativeestimate of volatility because the covariances are not scaleddownward in a way that would guarantee positivity This is par-ticularly relevant in the situation where intraday returns havea ldquosharp negative autocorrelationrdquo (see West 1997) which hasbeen observed in high-frequency intraday returns constructedfrom transaction prices To rule out the possibility of a negativeestimate one could use a different kernel such as the Bartlettkernel Although a different kernel may not be entirely unbi-
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
138 Journal of Business amp Economic Statistics April 2006
Figure 4 Volatility Signature Plots for RV AC1Based on Ask Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction
Prices ( ) The left column is for AA and the right column is for MSFT The two top rows are based on calendar time sampling the bot-tom rows are based on tick time sampling The results for 2000 are the panels in rows 1 and 3 and those for 2004 are in rows 2 and 4 Thehorizontal line represents an estimate of the average IV σ 2 equiv RV
(1 tick)ACNW30
that is defined in Section 42 The shaded area about σ 2 represents anapproximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 139
ased it may result is a smaller MSE than that of RVAC Inter-estingly Barndorff-Nielsen et al (2004) have shown that thesubsample estimator of Zhang et al (2005) is almost identicalto the Bartlett kernel estimator
In the time series literature the lag length qm is typicallychosen such that qmm rarr 0 as m rarr infin for example qm =4(m100)29 where x denotes the smallest integer that isgreater than or equal to x But if the noise were dependentin calendar time then this would be inappropriate because itwould lead to qm = 3 when a typical trading day (390 minutes)were divided into 78 intraday returns (5-minute returns) andto qm = 6 if the day were divided into 780 intraday returns(30-second returns) So the former q would cover 15 minuteswhereas the latter would cover 3 minutes (6times 30 seconds) andin fact the period would shrink to 0 as m rarr infin Under As-sumptions 2 and 4 the autocorrelation in intraday returns isspecific to a period in calendar time which does not dependon m thus it is more appropriate to keep the width of the ldquoau-tocorrelation windowrdquo qmm constant This also makes RV(m)
ACmore comparable across different frequencies m Thus we setqm = w
(bminusa)m where w is the desired width of the lag win-dow and b minus a is the length of the sampling period (both inunits of time) such that (b minus a)m is the period covered byeach intraday return In this case we write RV(m)
ACwin place of
RV(m)ACqm
Therefore if we were to sample in calendar time andset w = 15 min and b minus a = 390 min then we would includeqm = m26 autocovariance terms
When qm is such that qmm gt θ0 ge 0 this implies thatRV(m)
ACqmcannot be consistent for IV This property is common
for estimators of the long-run variance in the time series liter-ature whenever qmm does not converge to 0 sufficiently fast(see eg Kiefer Vogelsang and Bunzel 2000 and Jansson2004) The lack of consistency in the present context can be un-derstood without consideration of market microstructure noiseIn the absence of noise we have that var(y2
im) = 2σ 4im and
var(yimyi+hm)= σ 2imσ 2
i+hm asymp σ 4im such that
var[RV(m)
ACqm
] asymp 2msum
i=1
σ 4im +
qmsumh=1
(2)2msum
i=1
σ 4im
= 2(1+ 2qm)
msumi=1
σ 4im
which approximately equals
2(1+ 2qm)bminus a
m
int 1
0σ 4(s)ds
under CTS This shows that the variance does not vanish whenqm is such that qmm gt θ0 gt 0
The upper four panels of Figure 5 represent a new type ofsignature plots for RV(1 sec)
ACq Here we sample intraday returns
every second using the previous-tick method and now plot qalong the x-axis Thus these signature plots provide informa-tion on time dependence in the noise process The fact that theRV(1 sec)
ACqof the four price series differ and have not leveled off
is evidence of time dependence Thus in the upper four panelswhere we sample in calendar time it appears that the depen-dence lasts for as long as 2 minutes (AA year 2000) or as short
as 15 seconds (MSFT year 2004) We comment on the lowerfour panels in the next section where we discuss intraday re-turns sampled in tick time
42 Time Dependence in Tick Time
When sampling at ultra-high frequencies we find it more nat-ural to sample in tick time such that the same observation is notsampled multiple times Furthermore the time dependence inthe noise process may be in tick time rather than calender timeSeveral results of Bandi and Russell (2005) allow for time de-pendence in tick time (while the pricendashnoise correlation is as-sumed away)
The following example gives a situation with market mi-crostructure noise that is time-dependent in tick time and corre-lated with efficient returns
Example 1 Let t0 lt t1 lt middot middot middot lt tm be the times at whichprices are observed and consider the case where we sample in-traday returns at the highest possible frequency in tick time Wesuppress the subscript m to simplify the notation Suppose thatthe noise is given by ui = αylowasti + εi where εi is a sequence of iidrandom variables with mean 0 and variance var(εi)= ω2 Thusα = 0 corresponds to the case with independent noise assump-tion and α = ω2 = 0 corresponds to the case without noise Itnow follows that
ei = α(ylowasti minus ylowastiminus1)+ εi minus εiminus1
such that
E(e2i )= α2(σ 2
i + σ 2iminus1)+ 2ω2
and
E(eiylowasti )= ασ 2
i
wheresumm
i=1 σ 2i = IV Thus
E[RV(1 tick)
]= IV + 2α(1+ α)IV + 2mω2
with a bias given by 2α(1 + α)IV + 2mω2 This bias may benegative if α lt 0 (the case where ui and ylowasti are negatively cor-related) Now we also have
E(eieiminus1)=minusα2σ 2iminus1 minusω2
and
E(eiylowastiminus1)=minusασ 2
iminus1
such thatmsum
i=1
E[yiyi+1] = minusα2IV minus 2mω2 minus αIV
which shows that RV(1 tick)AC1
is almost unbiased for the IV
In this simple example ui is only contemporaneously corre-lated with ylowasti In practice it is plausible that ui could also becorrelated with lagged values of ylowasti which would yield a morecomplicated time dependence in tick time In this situation wecould use RV(1 tick)
ACq with a q sufficiently large to capture the
time dependenceAssumption 4 and Theorem 2 are formulated for the case
with CTS but a similar estimator can be defined under depen-dence in tick time The lower four panels of Figure 5 are the
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
140 Journal of Business amp Economic Statistics April 2006
Figure 5 Volatility Signature Plots of RV (1 sec)ACq
(four upper panels) and RV (1 tick)ACq
(four lower panels) for Each of the Price Series Ask
Quotes ( ) Bid Quotes ( ) Mid-Quotes ( ) and Transaction Prices ( ) The x-axis is the number of autocovariance terms
q included in RV (1 sec)ACq
The left column is for AA and the right column is for MSFT The results for 2000 are the panels in rows 1 and 3 and those
for 2004 are in rows 2 and 4 The horizontal line represents an estimate of the average IV σ 2 equiv RV(1 tick)ACNW30
that is defined in Section 42 Theshaded area about σ 2 represents an approximate 95 confidence interval for the average volatility
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 141
signature plots for RV(1 tick)ACq
where q is the number of auto-covariances used to bias correct the standard RV From theseplots we see that a correction for the first couple of autoco-variances has a substantial impact on the estimator but higher-order autocovariances are also important because the volatilitysignature plots do not stabilize until q ge 30 in some cases (egMSFT in 2000) This time dependence was longer than we hadanticipated thus we examined whether a few ldquounusualrdquo dayswere responsible for this result However the upward-slopingvolatility signature (until q is about 30) is actually found in mostdaily plots of RV(1 tick)
ACqagainst q for MSFT in the year 2000
In Section 3 we analyzed the simple kernel estimator thatincorporates only the first-order autocovariance of intraday re-turns which we now generalize by including higher-order auto-covariances We did this to make the estimator RV(1 tick)
ACq robust
to both time dependence in the noise and correlation betweennoise and efficient returns Interestingly Zhou (1996) also pro-posed a subsample version of this estimator although he did notrefer to it as a subsample estimator As is the case for RV(1 tick)
ACq
this estimator is robust to time dependence that is finite in ticktime Next we describe the subsample-based version of Zhoursquosestimator
Let t0 lt t1 lt middot middot middot lt tN be the times at which prices are ob-served in the interval [ab] where a = t0 and b = tN Herem need not be equal to N (unlike the situation in the previous ex-ample) because we use intraday returns that span several priceobservations We use the following notation for such (skip-k)intraday returns
ytiti+k equiv pti+k minus pti
Thus ytiti+k is the intraday return over the time interval [ti ti+k]This leads to the identity
RV(k tick)AC1
=sum
iisin0k2kNminusk
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
)
(assuming that Nk is an integer) which is a sum involvingm = Nk terms The subsample version RV(m)
AC1 proposed by
Zhou (1996) can be expressed as
1
k
Nminus1sumi=0
(y2
titi+k+ ytiminuskti ytiti+k + ytiti+k yti+kti+2k
) (6)
Thus for k = 2 we have
1
2
Nsumi=1
(yi + yi+1)2 + (yiminus1 + yiminus2)(yi + yi+1)
+ (yi + yi+1)(yi+2 + yi+3)
where yi equiv ytiminus1ti Rearranging the terms we see that this sumis approximately given by
1
2
Nsumi=1
2y2i + 4yiyi+1 + 4yiyi+2 + 2yiyi+3
= γ0 + (γminus1 + γ1)+ (γminus2 + γ2)+ 1
2(γminus3 + γ3)
where γj equiv sumNi=1 yiyi+j For the general case k ge 2 we can
show that this subsample estimator is approximately given by
RV(1 tick)ACNWk
equiv γ0 +ksum
j=1
(γminusj + γj)+ksum
j=1
k minus j
k(γminusjminusk + γj+k)
Thus this estimator equals the RV(1 tick)ACk
plus an additional terma Bartlett-type weighted sum of higher-order covariances In-terestingly Zhou (1996) showed that the subsample version ofhis estimator [see (6)] has a variance that is (at most) of orderO( k
N )+O( 1k )+O( N
k2 ) (assuming constant volatility) This term
is of order O(Nminus13) if k is chosen to be proportional to N23as was done by Zhang et al (2005) It appears that Zhou mayhave considered k as fixed in his asymptotic analysis becausehe referred to this estimator as being inconsistent (see Zhou1998 p 114) Therefore the great virtues of subsample-basedestimators in this context were first recognized by Zhang et al(2005)
5 EMPIRICAL ANALYSIS
We now analyze stock returns for the 30 equities of the DJIAThe sample period spans 5 years from January 3 2000 to De-cember 31 2004 We report results for each of the years indi-vidually but give some of the more detailed results only for theyears 2000 and 2004 to conserve space The tick size was re-duced from 116 of 1 dollar to 1 cent on January 29 2001 andto avoid mixing mix days with different tick sizes we drop mostof the days during January 2001 from our sample The data aretransaction prices and quotations from NYSE and NASDAQand all data were extracted from the Trade and Quote (TAQ)database
We filtered the raw data for outliers and discarded transac-tions outside the period 930 AMndash400 PM and removed dayswith less than 5 hours of trading from the sample This reducedthe sample to the number of days reported in the last column ofTable 1 The filtering procedure removed obvious data errorssuch as zero prices We also removed transaction prices thatwere more than one spread away from the bid and ask quotes(Details of the filtering procedure are described in a techni-cal appendix available at our website) The average number oftransactionsquotations per day are given for each year in oursample these reveal a steady increase in the number of trans-actions and quotations over the 5-year period The numbers inparentheses are the percentages of transaction prices that dif-fer from the proceeding transaction price and similarly for thequoted prices The same price is often observed in several con-secutive transactionsquotations because a large trade may bedivided into smaller transactions and a ldquonewrdquo quote may sim-ply reflect a revision of the ldquodepthrdquo while the bid and ask pricesremain unchanged We use all price observations in our analy-sis Censoring all of the zero intraday returns does not affectthe RV but has an impact on the autocorrelation of intradayreturns
Our analysis of quotation data is based on bid and askprices and the average of these (mid-quotes) The RVs are cal-culated for the hours that the market is open approximately390 minutes per day (65 hours for most days) Our tablespresent results for all 30 equities whereas our figures present
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
142 Journal of Business amp Economic Statistics April 2006
Tabl
e1
Equ
ityD
ata
Sum
mar
yS
tatis
tics
Tran
sact
ions
day
Quo
tes
day
No
ofda
ysS
ymbo
lN
ame
Exc
hang
e20
0020
0120
0220
0320
0420
0020
0120
0220
0320
04
AA
Alc
oaN
YS
E99
7 (41
)1
479 (
50)
189
8 (50
)2
153 (
45)
315
0 (43
)1
384 (
33)
187
5 (44
)2
775 (
38)
530
9 (32
)7
560 (
34)
122
3A
XP
Am
eric
anE
xpre
ssN
YS
E1
584 (
45)
244
9 (54
)2
791 (
53)
337
1 (52
)2
752 (
44)
200
4 (42
)3
123 (
46)
374
5 (41
)7
293 (
43)
746
4 (34
)1
221
BA
Boe
ing
Com
pany
NY
SE
105
2 (39
)1
730 (
49)
230
6 (52
)2
961 (
48)
303
7 (47
)1
587 (
30)
249
2 (46
)3
552 (
43)
647
5 (42
)8
022 (
39)
122
1C
Citi
grou
pN
YS
E2
597 (
44)
312
9 (51
)3
997 (
49)
442
4 (46
)4
803 (
47)
307
6 (27
)3
994 (
42)
503
9 (40
)7
993 (
37)
931
6 (37
)1
221
CAT
Cat
erpi
llar
Inc
NY
SE
747 (
40)
126
0 (50
)1
673 (
53)
241
4 (54
)3
154 (
56)
103
9 (33
)2
002 (
43)
287
9 (40
)5
852 (
46)
818
5 (51
)1
221
DD
Du
Pon
tDe
Nem
ours
NY
SE
125
7 (48
)1
853 (
54)
210
0 (53
)2
963 (
51)
307
7 (49
)1
858 (
30)
291
5 (43
)3
418 (
39)
666
3 (41
)8
069 (
38)
122
0D
ISW
altD
isne
yN
YS
E1
304 (
39)
201
3 (53
)2
882 (
54)
344
8 (48
)3
564 (
46)
152
5 (25
)2
893 (
43)
475
0 (36
)7
768 (
33)
848
0 (34
)1
221
EK
Eas
tman
Kod
akN
YS
E77
5 (42
)1
128 (
50)
139
2 (45
)2
008 (
45)
194
1 (44
)1
181 (
35)
194
1 (44
)2
322 (
37)
491
0 (35
)5
715 (
31)
122
0G
EG
ener
alE
lect
ricN
YS
E3
191 (
41)
315
7 (52
)4
712 (
54)
482
8 (48
)4
771 (
44)
288
8 (34
)3
646 (
48)
555
9 (42
)8
369 (
31)
100
51(2
4)1
221
GM
Gen
eral
Mot
ors
NY
SE
988 (
37)
135
7 (50
)2
448 (
49)
287
4 (49
)2
855 (
47)
157
2 (35
)2
009 (
44)
424
6 (37
)6
262 (
35)
688
9 (34
)1
221
HD
Hom
eD
epot
Inc
NY
SE
196
1 (42
)2
648 (
51)
353
3 (52
)3
843 (
49)
364
2 (46
)2
161 (
31)
294
6 (51
)4
181 (
42)
724
6 (36
)8
005 (
31)
122
1H
ON
Hon
eyw
ell
NY
SE
112
2 (38
)1
453 (
52)
187
2 (47
)2
482 (
47)
266
8 (47
)1
378 (
33)
236
4 (47
)3
120 (
40)
538
5 (40
)6
542 (
39)
122
1H
PQ
Hew
lettndash
Pac
kard
NY
SE
190
0 (46
)2
321 (
51)
254
3 (46
)3
263 (
43)
370
8 (43
)2
394 (
45)
317
0 (43
)3
226 (
36)
653
6 (31
)8
700 (
27)
121
8IB
MIn
tB
usin
ess
Mac
hine
sN
YS
E2
322 (
55)
363
7 (63
)3
492 (
61)
411
1 (60
)4
553 (
55)
331
9 (53
)4
729 (
55)
668
8 (37
)7
790 (
49)
944
7 (51
)1
220
INT
CIn
tel
Cor
pN
AS
D14
982
(73)
156
37(8
3)15
194
(80)
109
18(7
1)9
078 (
67)
105
99(1
7)12
512
(32)
176
22(4
5)19
554
(39)
198
28(4
2)1
230
IPIn
tern
atio
nalP
aper
NY
SE
100
2 (40
)1
509 (
50)
197
1 (50
)2
569 (
47)
228
3 (44
)1
368 (
31)
234
6 (41
)3
134 (
37)
599
7 (40
)6
417 (
34)
122
1JN
JJo
hnso
namp
John
son
NY
SE
149
2 (49
)2
059 (
57)
254
1 (60
)3
272 (
53)
385
3 (48
)1
739 (
36)
232
2 (47
)3
091 (
42)
652
9 (39
)8
687 (
36)
122
1JP
MJ
PM
orga
nN
YS
E1
317 (
52)
255
5 (51
)2
973 (
52)
383
6 (49
)3
939 (
46)
183
9 (52
)3
384 (
46)
407
9 (39
)7
387 (
36)
880
8 (30
)1
221
KO
Coc
a-C
ola
NY
SE
137
6 (45
)1
662 (
51)
234
9 (52
)2
854 (
50)
332
0 (48
)1
635 (
32)
206
3 (48
)3
221 (
42)
624
0 (39
)7
498 (
35)
122
1M
CD
McD
onal
dsN
YS
E1
109 (
39)
177
9 (50
)2
226 (
49)
263
8 (45
)2
790 (
43)
157
8 (21
)2
334 (
42)
306
5 (37
)5
763 (
33)
733
1 (31
)1
221
MM
MM
inne
sota
Mng
Mfg
N
YS
E86
8 (44
)1
563 (
56)
205
4 (57
)3
092 (
58)
337
0 (48
)1
157 (
45)
246
2 (50
)3
253 (
48)
710
0 (52
)8
228 (
46)
122
0M
OP
hilip
Mor
risN
YS
E1
283 (
38)
194
2 (53
)3
047 (
54)
347
3 (50
)3
404 (
48)
236
5 (13
)3
266 (
34)
417
4 (40
)6
776 (
38)
772
1 (38
)1
220
MR
KM
erck
NY
SE
183
2 (41
)1
943 (
51)
257
4 (52
)3
639 (
50)
366
3 (45
)2
021 (
35)
238
1 (48
)3
262 (
41)
692
7 (43
)7
658 (
34)
122
1M
SF
TM
icro
soft
NA
SD
139
00(6
8)14
479
(86)
152
57(8
5)12
013
(73)
864
3 (66
)8
275 (
14)
133
41(4
0)18
234
(66)
198
33(4
5)19
669
(41)
122
9P
GP
roct
eramp
Gam
ble
NY
SE
151
8 (44
)1
881 (
54)
263
1 (52
)3
314 (
52)
366
8 (50
)2
584 (
27)
313
7 (43
)4
555 (
42)
753
5 (47
)8
397 (
45)
122
1S
BC
Sbc
Com
mun
icat
ions
NY
SE
160
3 (37
)2
267 (
52)
303
8 (52
)3
060 (
46)
336
4 (43
)2
042 (
24)
279
1 (48
)3
909 (
39)
647
8 (31
)8
346 (
26)
122
0T
ATamp
TC
orp
NY
SE
223
3 (29
)1
851 (
40)
222
4 (43
)2
353 (
44)
241
2 (39
)1
657 (
26)
223
8 (40
)2
931 (
32)
530
7 (30
)7
091 (
20)
122
1U
TX
Uni
ted
Tech
nolo
gies
NY
SE
767 (
41)
135
4 (51
)1
986 (
53)
275
8 (55
)3
107 (
55)
111
1 (46
)2
183 (
46)
307
5 (45
)6
657 (
49)
799
6 (53
)1
220
WM
TW
al-M
artS
tore
sN
YS
E1
946 (
43)
236
8 (54
)2
847 (
58)
286
0 (51
)4
394 (
50)
260
9 (27
)2
675 (
47)
397
1 (44
)5
610 (
39)
874
4 (40
)1
221
XO
ME
xxon
Mob
ilN
YS
E1
720 (
41)
222
7 (53
)3
467 (
54)
419
8 (48
)4
489 (
47)
197
9 (33
)2
712 (
43)
467
7 (37
)8
340 (
32)
995
0 (29
)1
219
NO
TE
S
ymbo
lsan
dna
mes
ofth
eeq
uitie
sus
edin
our
empi
rical
anal
ysis
The
third
colu
mn
isth
eex
chan
geth
atw
eex
trac
ted
data
from
and
the
follo
win
gco
lum
nsgi
veth
eav
erag
enu
mbe
rof
tran
sact
ions
quo
tatio
nspe
rda
yfo
rea
chof
the
five
year
sin
our
sam
ple
The
perc
enta
geof
obse
rvat
ions
for
whi
chth
etr
ansa
ctio
npr
ice
oron
eof
the
quot
edpr
ices
was
diffe
rent
from
the
prev
ious
one
isgi
ven
inpa
rent
hese
sT
hela
stco
lum
nis
the
tota
lnum
ber
ofda
tain
our
sam
ple
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 143
results for two equities Alcoa (AA) and Microsoft (MSFT)which represent DJIA equities with low and high trading activ-ities The corresponding figures for the other 28 DJIA equitiesare available on request
51 Empirical Implementation of Estimators
In practice we do not rely on the intraday returns outside the[ab] interval Thus in our empirical analysis we substitute (forh gt 0)
γh equiv m
mminus h
mminushsumi=1
yimyi+hm
for the theoretical quantity
γh equivmsum
i=1
yimyi+hm
because the latter relies on ym+1m ym+hm (For h lt 0 wedefine γh equiv γ|h|) In the expression for γh we use an upwardscaling m(m minus h) of the ldquoautocovariancesrdquo to compensatefor the ldquomissingrdquo autocovariance terms Thus our empiricalimplementation of the simplest kernel estimator is given byRV(m)
AC1=summ
i=1 y2im + 2 m
mminus1
summminus1i=1 yimyi+1m
52 Estimation of Market MicrostructureNoise Parameters
Under the independent noise assumption used in Section 3we have from Lemma 2 that
ω2 equiv RV(m)
2m
prarr ω2 as m rarrinfin
This estimator was proposed by Bandi and Russell (2005)and Zhang et al (2005) for different purposes Bandi andRussell (2005) used ω2
t to determine the optimal sampling fre-quency mlowast
0 whereas Zhang et al (2005) used it to select thenumber of subsamples and to serve as a bias-correction deviceof their ldquosecond-bestrdquo one-scale subsample estimator
Under the independent noise assumption we have fromLemma 2 that E(RV(m)) = IV + 2mω2 Whereas ω2 is as-ymptotically justified our empirical results reveal that ω2 isvery small in practice so small that 2mω2 is small relative tothe IV with the exception of the most liquid assets such asINTC and MSFT Whenever the IV2m is nonnegligible ω2
will overestimate ω2 Thus better estimators are given by
ω2 equiv RV(m) minus RV(13)
2(mminus 13)
and
ω2 equiv RV(m) minus IV
2m
where RV(13) is based on intraday returns that span about30 minutes each and IV is some unbiased estimator of IV FromLemmas 2 and 3 it follows that ω2 ω2 and ω2 are asymptoti-cally equivalent in the sense that they have the same probabilitylimit as m rarrinfin But ω2 and ω2 are unbiased for ω2 for anyfinite m and we show that ω2 is quite biased in many casesAnother problem is that the independent noise assumption need
not hold at ultra-high frequencies in which case the asymptoticbias is not given by 2mω2 Clearly this is problematic for allthree estimators
Table 2 presents annual sample averages of ω2 ω2 and ω2
for the 5 years of our sample Here we use RV(1 tick)AC1
as ourchoice for IV which is unbiased under the independent noiseassumption The first estimator ω2 assumes that IV2m is neg-ligible in which case all three estimators should be similar Be-cause ω2 and ω2 generally agree whereas ω2 is typically muchlarger it is evident that ω2 overestimates ω2 A related obser-vation was made by Engle and Sun (2005)
Fact III The noise is smaller than one might think
Here by small we mean that the bias of the RV due to noise issmall This is particularly so in the more recent years (see egthe 2004 volatility signature plot for AA in Fig 1) By smallwe do not mean that the noise is unimportant but rather that ithas a less dramatic impact than suggested by the independentnoise assumption
The fact that the noise in each of the intraday returns is rela-tively small (even when sampling occurs at the highest possiblefrequency) will likely affect the properties of the suggested im-plementation of the two-scale estimator by Zhang et al (2005)In fact the one-scale estimator proposed by Zhang et al (2005)may be more accurate when the noise is small as Table 2 sug-gests Similarly when determining the optimal sampling fre-quency for RV (as in Bandi and Russell 2005) one should becareful not to overestimate ω2 which would lead to a lower-than-optimal sampling frequency The important message isthat one should incorporate RVAC1 or some other unbiased es-timator of IV when estimating ω2 from RV
Another observation from Table 2 is that the noise haschanged For example our 2001 estimates of ω2 (using ω2)
are on average lt20 of those of 2000 and are even smallerin subsequent years A large portion of this reduction is due todecimalization We discuss the changed empirical properties ofthe noise in more detail in our discussion of the results in Fig-ures 6 and 7
Now if we were to believe the independent noise assumption(which is less of a stretch in 2000 than in subsequent years)then we could construct the following estimate of λ
λequiv ω2IV
where ω2 equiv nminus1 sumnt=1 ω2
t and IV equiv nminus1 sumnt=1 RV(1 tick)
AC1t The
latter is an estimate of the average daily IV over the samplet = 1 n because RVAC1 is unbiased for IV under the inde-pendent noise assumption Similarly ω2 is an estimate of theaverage daily noise If both ω2 and IV are constant across dayssuch that λ is the same for all days then λ is consistent for λ
whenever a law of large numbers applies to ω2t and RV(1 tick)
AC1t
t = 1 n In practice both ω2 and IV are likely to vary acrossdays so λ should be viewed only as a proxy for the noise-to-signal ratio
Table 3 contains empirical results for all 30 equities usingboth transaction and quotation data from 2000 Several inter-esting observations can be made based on these results For thetransaction data we note that λ is typically found to be lt1and the theoretical reduction of the RMSE 100[r0(λmlowast
0) minus
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
144 Journal of Business amp Economic Statistics April 2006
Tabl
e2
Est
imat
esof
the
Noi
seV
aria
nce
for
Tran
sact
ion
Dat
a(a
nnua
lave
rage
s)
2000
2001
2002
2003
2004
Ass
etω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2ω
2
AA
178
69
951
040
447
098
117
409
150
100
201
040
055
090
011
024
AX
P6
181
892
612
710
540
562
570
750
600
840
230
230
330
060
10B
A1
174
610
681
292
026
045
324
118
069
162
070
044
060
010
014
C6
334
074
382
220
850
282
560
350
300
620
160
140
340
160
13C
AT1
672
724
834
263
minus03
20
282
07minus
025
029
080
minus00
40
080
410
010
03D
D1
130
662
676
231
050
058
228
068
048
087
035
024
053
020
016
DIS
163
81
179
120
55
632
731
945
393
442
582
311
511
131
190
800
62E
K9
122
654
133
82minus
084
020
361
minus03
10
371
640
100
381
24minus
003
028
GE
541
390
361
196
062
031
186
084
050
089
052
049
051
031
033
GM
639
018
225
222
minus01
40
361
78minus
001
043
092
013
025
052
001
014
HD
971
592
613
256
058
054
245
091
083
114
050
053
058
017
024
HO
N1
374
458
599
537
089
090
451
079
080
217
088
058
098
030
024
HP
Q8
212
322
467
163
201
937
113
502
542
481
081
011
420
890
78IB
M3
421
341
491
120
310
211
180
290
200
400
130
090
240
110
06IN
TC
232
186
208
209
171
186
077
042
054
055
034
039
046
030
032
IP2
015
104
91
156
431
167
092
234
068
051
117
040
026
067
007
016
JNJ
373
196
212
132
048
049
161
066
065
067
018
021
029
008
011
JPM
426
minus00
40
213
681
870
975
231
941
281
250
560
520
440
160
19K
O7
764
354
981
880
350
351
460
460
330
730
260
190
440
200
16M
CD
196
61
517
146
63
361
721
173
511
741
262
461
201
150
990
440
46M
MM
492
minus03
30
711
90minus
007
minus00
21
190
140
100
320
040
030
300
010
02M
O2
891
237
62
487
167
028
044
130
060
038
097
028
025
043
002
011
MR
K4
992
422
881
590
190
151
570
290
230
560
090
110
500
130
19M
SF
T2
251
942
060
730
520
600
250
070
130
360
250
270
370
290
29P
G6
303
283
151
850
720
450
850
150
120
310
090
040
270
070
06S
BC
992
596
662
199
031
049
361
131
087
176
064
061
094
052
052
T2
135
170
41
756
467
157
138
544
198
222
227
058
086
203
103
111
UT
X9
77minus
049
120
246
minus04
4minus
006
189
minus00
30
170
640
050
060
380
080
05W
MT
892
462
545
184
029
030
131
025
025
054
002
009
032
008
012
XO
M3
481
482
051
430
460
311
520
840
370
630
370
250
310
120
15
NO
TE
A
nnua
lave
rage
sof
estim
ates
ofω
2fo
rtr
ansa
ctio
nda
taus
ing
the
thre
edi
ffere
ntes
timat
ors
ω2equiv
RV
(m)
(2m
)ω
2equiv
(RV
(m)minus
RV
(13)
)(2
(mminus
13))
and
ω2equiv
(RV
(m)minus
IV)
(2m
)A
llnu
mbe
rsha
vebe
enm
ultip
lied
by10
0
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 145
Figure 6 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2000
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
146 Journal of Business amp Economic Statistics April 2006
Figure 7 The ACF ( ) and PACF ( ) for Four Different Tick-by-Tick Price Series Transaction Prices Bid Quotes Ask Quotes andMid-Quotes The left column is for AA and the right column is for MSFT The plotted series are the annual averages over the days in the year 2004
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 147
Table 3 Estimated Noise-to-Signal Ratios and ldquoOptimalrdquo Sampling Frequenciesfor the Year 2000
Trades QuotesAsset ω2 middot 100 IV λ middot 100 mlowast
0 mlowast1 ∆RMSE ω2 middot 100
AA 10400 6141 1693 44 511 331 0026AXP 2605 5244 0497 100 1743 436 minus0351BA 6807 4181 1628 45 531 335 minus0207C 4383 4610 0951 65 910 382 minus0367CAT 8344 5239 1593 46 543 337 minus0192DD 6756 5769 1171 56 739 364 minus0274DIS 12050 4322 2789 31 310 287 minus0292EK 4133 3495 1183 56 732 363 minus0153GE 3609 4740 0762 75 1137 401 minus0221GM 2249 3242 0694 80 1248 409 minus0338HD 6125 5883 1041 61 831 374 minus0513HON 5991 6669 0898 67 964 387 minus0671HPQ 2457 1031 0238 163 3632 494 minus0643IBM 1493 5120 0292 143 2969 478 minus0042INTC 2077 5884 0353 126 2453 463 minus0446IP 11560 7520 1538 47 563 340 minus0286JNJ 2116 2445 0866 69 1000 390 minus0254JPM 0212 5726 0037 566 23350 620 minus0387KO 4978 3656 1361 51 636 351 minus0346MCD 14660 4557 3218 28 269 274 0098MMM 0707 3377 0209 178 4134 503 minus0549MO 24870 4092 6078 18 142 216 0276MRK 2883 3287 0877 68 987 389 minus0143MSFT 2063 3557 0580 90 1493 423 minus0256PG 3145 4719 0667 82 1299 412 minus0104SBC 6617 3912 1691 44 512 331 minus0415T 17560 4748 3698 26 234 261 minus0504UTX 1204 5691 0212 177 4094 503 minus0743WMT 5454 5857 0931 66 929 384 minus0535XOM 2046 2160 0947 65 914 382 minus0259
NOTE Estimates of ω2 IV and the noise-to-signal ratio λ (annual averages for 2000) Columns five and six are the op-timal sampling frequencies for RV (m) and RV (m)
AC1based on the listed value of λ The corresponding reduction in the RMSE
100[r 0(λmlowast0 ) minus r 1(λmlowast
1 )]r 0(λ mlowast0 ) is listed in column seven The last column contains estimates of ω2 (annual average) using
mid-quotes where negative values are indicative of negative correlation between noise and efficient returns
r1(λmlowast1)]r0(λmlowast
0) is typically in the 25ndash50 range For ex-ample for Alcoa that ω2
AA = 104 and λAA = 104614 =1693 which leads to the optimal sampling frequenciesmlowast
0 = 44 and mlowast1 = 511 For a typical trading day spanning
65 hours this corresponds to intraday returns that (on average)span 9 minutes and 46 seconds Bandi and Russell (2005) andOomen (2005 2006) reported ldquooptimalrdquo sampling frequenciesfor RV(m) that are similar to our estimates of mlowast
0 By pluggingthese numbers into the formulas of Corollary 2 we find the re-
duction of the RMSE (from using RV(mlowast
1)
AC1rather than RV(mlowast
0))to be 33
The noise-to-signal ratio λ is likely to differ across daysin which case the optimal sampling frequencies mlowast
0 and mlowast1
will also differ across days Thus λ should be viewed as aproxy for λ on a typical trading day and our estimates shouldbe viewed as approximations for ldquodaily average valuesrdquo in thesense that m0 = 90 and m1 = 1493 appear to be sensible sam-pling frequencies to use with the MSFT transaction data For-tunately Figure 1 shows that RV(m)
AC1is relatively insensitive
to small deviations from mlowast1 such that a reasonable estimate
for λ does produce a more accurate estimator based on RVAC1
than one based on RVFor the quotation data almost all of our estimates of
ω2 are negative which occurs whenever the sample average ofRV(1 tick) is smaller than that of RV(1 tick)
AC1 This is obviously in
conflict with the results of Lemmas 2 and 3 which dictate thatthe population difference E[RV(1 tick)minusRV(1 tick)
AC1] be positive
The expected difference is 2ω2 times a number that is propor-tional to the average number of transactionsquotations per dayOne explanation for the observation that ω2 lt 0 is that ω2 0such that the ldquowrongrdquo sign occurs simply by chance But this ishighly improbable because all but two of the estimates of ω2
(not just about half of them) are found to be negative for thequotation data Thus these negative estimates provide additionalevidence against the independent noise assumption
The autocorrelation function (ACF) and the partial autocor-relation function (PACF) for intraday returns provide a sim-ple eyeball test of the independent noise assumption Figures6 and 7 present annual averages of the ACF and PACF estimatedfor each day using one-tick sampling The results for 2000 aregiven in Figure 6 those for 2004 in Figure 7 The upper pan-els are those for transaction prices and the subsequent panelscorrespond to ask mid and bid quotes
The results for AA in 2000 suggest that time dependencein the noise process may be specific to tick time and that thememory in transaction prices lasts only a few ticks slightlylonger for quoted prices In 2004 the time dependence in trans-action prices appears to be longermdashat least 10 transactionsmdashand because we are looking at an annual average it may beeven longer for some days The duration between transactions
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
148 Journal of Business amp Economic Statistics April 2006
in 2004 was about a third of what it was in 2000 Thus when thetime dependence in tick time is converted into calendar timethe time dependence in 2000 and 2004 is about the same Theresults for MSFT are in some respects very different from thosefor AA The average ACF and PACF for the transaction datamay suggest that the independent noise assumption is appro-priate for this price series however many of the higher-orderautocovariances are nontrivial which explains why RV(1 tick)
AC1is
often very different from RV(1 tick)AC30
For the quoted price seriesthe time dependence is slightly more involved particularly formid-quotes
Comparing the results in Figures 6 and 7 shows that thenoise properties have changed after decimalization of the ticksize For transaction data the change in the noise properties ismost evident for AA whereas in quoted prices the change ismost pronounced for MSFT For example the first-order au-tocovariances for bid and ask quotes have opposite signs in2000 and 2004 Thus based on the results in Table 2 and Fig-ures 6 and 7 we are led to the following fact
Fact IV The properties of the noise have changed over time
Because the ACFs and PACFs plotted in Figures 6 and 7 areaverages over daily estimates there may be a great deal of vari-ation across days although we did not find this to be the casein the daily ACFs and PACFs (For formal hypothesis tests thataddress properties of market microstructure noise [day-by-day]see Awartani Corradi and Distaso 2004)
6 PRICE DECOMPOSITION BYCOINTEGRATION METHODS
Empirical studies on volatility estimation from contaminatedhigh-frequency returns have typically used univariate price se-ries such as transaction prices ask quotes bid quotes or mid-quotes All of these series are proxies for the same efficientprice and thus it is natural to incorporate the information fromall series when estimating the volatility of the latent efficientprice
One way to combine the information from multiple series isto compute an RV-type measure for each series and take theaverage of these as was done by Hansen and Lunde (2005b)who took the average of estimators based on bid and ask quotesHere we take a different approach and use cointegration meth-ods to extract the efficient price (and noise) from a vectorconsisting of bid and ask quotes and transaction prices Thisapproach can be extended to include additional price series forexample limit order books can be used to define additional bidand ask price series that depend on the volume offered at var-ious prices and prices from multiple exchanges could also beincluded in the analysis
The purpose of this analysis is threefold First the methodmakes it possible to decompose the different prices into a com-mon stochastic trend (efficient price) and transitory components(one noise process for each price series) This makes it possi-ble to compare the properties of these series with those that weobserve in our volatility signature plots Second the decom-position reveals how the efficient price is tied to innovationsin the different price series This shows which price series aremost informative about the efficient price Third we can study
the dynamic impact on quotes and transaction prices as a re-sponse to a change in the efficient price This can be done withstandard impulse response analysis (For alternative ways of de-composing the observed price series see eg Engle and Sun2005 who used a ACDndashGARCH specification where the noiseprocess has a two-component ARMA structure also see Frijnsand Lehnert 2004)
We use the vector autoregressive framework that was usedto analyze price discovery (from multiple markets) by HarrisMcInish Shoesmith and Wood (1995) Hasbrouck (1995) andHarris McInish and Wood (2002) among others Our analysisdiffers from this literature because we apply cointegration tech-niques to quotes and transactions in conjunction not to transac-tion prices from different exchanges Because it is possible toobtain the prevailing bid and ask prices at the time at which atransaction occurs we avoid issues related to nonsynchronoustrading Our impulse response analysis which we believe to benovel shows how bid ask and transaction prices dynamicallyrespond to a change in the efficient price
We let ti i = 01 m denote the times when transactionsoccur during some trading day and drop the subscript m to sim-plify the notation We define the vector of ldquopricesrdquo by
pti = transaction price at time ti
prevailing ask price at time tiprevailing bid price at time ti
where we use log-prices as in the previous sectionsSuppose that the dynamics of the vector of prices pti can
be approximated by the vector autoregressive error correctionmodel
pti = αβ primeptiminus1 +kminus1sumj=1
jptiminusj +micro+ εti (7)
where εti i = 1 m is a sequence of uncorrelated errorterms It is reasonable to assume that each of the three observedprice series shares the same stochastic trend such that any pairof prices cointegrate because the spread is presumed to be sta-tionary This implies that α and β are 3 times 2 matrices with fullcolumn rank and that we may impose the two natural cointegra-tion vectors
β = (β1β2)=
1 0minus 1
2 1
minus 12 minus1
(8)
The space spanned by the two columns of β defines the setof cointegration relations Thus any two linearly independentvectors in this subspace can be used to define β We choosethese two vectors because they are simple to interpret β prime
1pti isthe difference between the transaction price and the mid-quoteand β prime
2pti is the bidndashask spreadKey to this analysis are the 3times 1 vectors αperp and βperp which
are orthogonal to α and β [Thus αprimeperpα = β primeperpβ = (00)] As isthe case for α and β these vectors are not fully identified sowe impose the normalizations
βperp = ι and αprimeperpι= 1
where ιequiv (111)prime
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 149
It follows from the Granger representation theorem that
pt = βperp(αprimeperpβperp)minus1αprimeperptsum
s=1
εs +C(L)(εt +micro)+A0
where equiv I minus 1 minus middot middot middot minus kminus1 and A0 is a that depending oninitial values of pt The Granger representation theorem is dueto Johansen (1988) and Hansen (2005) who obtained a recur-sive formula for the coefficients of C(L) (and details about A0)which we use in our analysis
61 Common Stochastic Trend
There are a number of ways to define the common stochastictrend from the Granger representation (see Johansen 1996) Themost natural definition in this framework is
plowastti equiv (αprimeperpβperp)minus1isum
j=1
αprimeperpεtj + initial value
This definition has the desired martingale property because thecorresponding intraday returns
ylowastti equivαprimeperpεti
αprimeperpβperp
are uncorrelated Thus the Granger representation can be ex-pressed as
pti =
plowasttiplowasttiplowastti
+C(L)εti +A0
This definition of the common stochastic trend plowastti correspondsto that of Hasbrouck (1995 2002) Johansen (1996) also dis-cussed alternative definitions of the common stochastic trendsuch as β primeperppti and αprimeperppti which are linear combinations ofthe observed price vector these are known as GrangerndashGonzalostochastic trends in the literature (see Gonzalo and Granger1995) The connection between plowastti and a linear combinationof pti follows directly from the Granger representation be-cause the latter involves a component that is proportionalto plowastti and a second component that relates to the station-ary part of the Granger representation This relation was dis-cussed by Johansen (1996) (see also Hansen and Johansen1998 pp 27ndash28) and similar arguments were given by de Jong(2002) and Baillie Booth Tse and Zabotina (2002) (see alsoHarris et al 2002 Lehmann 2002)
It is the martingale property of plowastti that makes plowastti the mostnatural definition of the efficient price and the RV of plowastti is givenby
RVplowast equivmsum
i=1
(ylowastti)2 =
summi=1(α
primeperpεti)2
(αprimeperpβperp)2
Naturally the parameters need to be estimated in practice sowe rely on
plowastti = (αprimeperpβperp)minus1
isumj=1
αprimeperpεtj
where εti are the residuals from (7) The estimation is describedin Appendix C Given plowastti we can now define
RVplowast equivmsum
i=1
(ylowasttj
)2 =summ
i=1(αprimeperpεti)
2
(αprimeperpβperp)2
62 Empirical Results
We estimate the parameters in (7) for each of the days indi-vidually with the number of lags k chosen to be the smallestnumber (le10) that made LjungndashBox tests at lags 5 10 15 and30 insignificant at the 5 significance level Some additionaldetails about the estimation are given in Appendix C
Assuming that the ultra-highndashfrequency price observationsare well described by (7) is problematic for several reasonsThe duration between observations is an important aspect of thedata ignored by model (7) and the discreteness of the data hasimplications for the properties of the ldquoerrorsrdquo εti i = 1 mand the interdependence with the vector of prices The readershould be aware that we have ignored all such issues and thusthe results that we draw from this analysis should be viewed asa rough approximation
A key parameter in our analysis is αperp (3 times 1 vector) thatshows how the efficient price is related to ldquoinnovationsrdquo in thedifferent price series In our empirical analysis the elementsof αperp correspond to transaction prices ask quotes and bidquotes and so we use the following notation
αprimeperp = (αperptr αperpask αperpbid)
Table 4 reports a summary of our empirical estimates whichare averages of the daily estimates αperp and RVplowast For com-
parison we also report the average RVquantities RV(1 tick)
ACNW15
RV(1 tick)
ACNW30 and RV
(13) To get a sense of the dispersion of the
daily estimates we also report the 5 and 95 quantiles inbrackets below each of the averages
It is striking how similar the average αperp is across equitieslisted on the NYSE the average point estimates are generallyquite close to αperp = ( 1
2 14 1
4 )prime In contrast the two NASDAQstocks INTC and MSFT produce average estimates that standout by being closer to αperp = ( 1
4 38 3
8 )prime Thus the instantaneouscorrelation between innovations in the transaction price and theefficient price is larger for NYSE stocks and consequently thequoted prices are more closely related to the efficient price forNASDAQ stocks than to that for NYSE stocks We find thisto be a very interesting empirical observation that is likely tiedto the differences by which the two exchanges operate Specifi-cally quotes on the NASDAQ are competitive and binding suchthat movements in these are more likely to be tied to movementsin the efficient price than are movements in the specialist quoteson the NYSE
The estimated model allows us to compute the daily esti-mates
msumi=1
e2ti and
msumi=1
ylowastti eti
where eti is an element of eti equiv uti minus utiminus1 and uti = pti minus plowastti (De-meaning the elements of uti is not required because it does
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
150 Journal of Business amp Economic Statistics April 2006
Table 4 Cointegration Results for the Year 2004
Lags αperp tr αperp ask αperp bid RV plowast RV(1 tick)ACNW 15
RV (1 tick)ACNW 30
RV (13)
AA 559 51 26 22 230 252 250 221[200 10] [35 68] [09 41] [04 37] [100 461] [103 470] [90 489] [50 530]
AXP 409 46 29 26 67 71 70 69[100 100] [33 63] [13 45] [08 40] [29 133] [28 146] [24 153] [13 191]
BA 506 46 29 24 146 152 153 149[200 100] [33 61] [17 44] [07 39] [63 284] [66 301] [58 335] [34 360]
C 418 48 27 25 101 99 91 83[100 100] [33 63] [16 38] [14 35] [43 206] [40 205] [35 210] [19 180]
CAT 500 45 29 26 161 164 159 149[200 100] [33 57] [16 42] [13 42] [72 308] [73 329] [67 327] [42 351]
DD 428 44 29 27 112 111 103 102[140 100] [32 56] [13 43] [15 39] [52 206] [49 202] [42 200] [22 250]
DIS 421 41 31 29 168 164 151 136[100 100] [30 53] [18 44] [12 41] [70 307] [68 323] [55 307] [31 331]
EK 537 47 29 24 230 241 244 246[200 100] [28 69] [07 48] [minus04 43] [79 472] [84 476] [69 473] [40 627]
GE 390 46 28 26 87 96 97 87[100 800] [26 62] [15 43] [13 40] [39 180] [39 194] [36 201] [26 193]
GM 533 48 26 26 128 139 142 145[200 100] [33 67] [10 41] [10 42] [54 236] [57 283] [50 327] [33 353]
HD 493 47 27 26 137 147 148 137[200 100] [31 63] [11 40] [10 40] [60 267] [65 280] [61 294] [32 306]
HON 515 47 28 25 203 202 192 182[100 100] [33 64] [12 44] [09 40] [85 394] [83 406] [77 419] [48 355]
HPQ 436 40 31 29 207 208 197 184[100 100] [28 54] [17 42] [15 42] [80 407] [72 460] [66 447] [37 435]
IBM 492 43 29 27 90 88 81 71[200 100] [33 56] [18 42] [16 39] [36 177] [32 169] [30 164] [18 153]
INTC 460 25 37 38 236 234 230 201[200 900] [18 34] [21 50] [24 52] [107 421] [102 403] [95 394] [59 417]
IP 469 45 28 27 119 127 126 127[100 100] [30 62] [12 42] [11 44] [46 250] [51 258] [42 244] [32 311]
JNJ 445 45 29 26 81 82 80 79[200 100] [32 58] [14 43] [08 39] [33 166] [34 162] [29 171] [15 196]
JPM 387 46 28 26 108 113 111 108[100 960] [30 62] [12 42] [13 38] [46 243] [47 264] [44 293] [28 267]
KO 419 41 29 30 95 97 92 81[100 100] [28 55] [16 40] [15 42] [42 185] [43 184] [36 189] [22 195]
MCD 455 42 31 27 139 143 145 138[100 100] [27 60] [16 46] [09 41] [60 314] [52 319] [49 326] [32 345]
MMM 486 45 28 26 109 111 107 96[200 100] [33 57] [14 44] [13 40] [43 211] [44 217] [39 237] [23 210]
MO 504 48 28 25 148 151 151 152[200 100] [33 67] [09 42] [09 39] [34 317] [29 335] [25 331] [17 355]
MRK 460 48 27 25 130 138 136 130[200 100] [33 63] [12 40] [08 40] [42 384] [45 379] [40 355] [28 394]
MSFT 443 24 40 36 116 116 112 89[200 900] [16 32] [21 59] [16 54] [50 219] [50 222] [48 218] [21 218]
PG 506 49 28 23 87 87 83 75[200 100] [36 64] [15 45] [10 34] [34 164] [33 167] [29 167] [18 165]
SBC 400 38 31 30 136 144 138 128[100 100] [21 55] [15 47] [14 45] [56 264] [52 283] [44 287] [26 312]
T 412 34 34 32 165 177 186 200[100 100] [21 49] [21 46] [17 47] [62 332] [69 387] [67 443] [48 481]
UTX 516 46 28 26 119 117 111 106[200 100] [33 62] [15 42] [12 38] [45 247] [45 251] [38 239] [23 270]
WMT 498 55 23 21 111 114 110 104[200 100] [42 72] [09 36] [08 34] [53 222] [57 221] [50 205] [30 230]
XOM 409 47 27 26 78 85 85 84[100 965] [29 62] [16 40] [13 39] [34 160] [34 161] [30 168] [23 171]
NOTE Annual averages of the selected lag length αperp realized variance of plowastt two measures of RV (1 tick)
ACNWq and the standard RV based on 30-minute sampling where RV (1 tick)
ACNWqequiv
1nsumn
t = 1 (RV (1 tick) trACNWqt
+RV (1 tick) bidACNWqt
+RV (1 tick) askACNWqt
)3 and similarly for RV (13) The numbers in the squared bracket are the 5 and 95 quantiles for the different statistics
not affect eti ) Table 5 reports daily averages for the differentprice series
Figure 8 plots our estimate of the efficient price plowastti for thesame window of time plotted in Figure 2 It is comforting to seethat our estimate of the efficient price is in line with the quotedbid and ask prices Rarely is plowastti outside the bidndashask bounds so
there is no immediate evidence that a profit can be made fromthe discrepancies between plowastti and the observed prices
63 Impulse Response Function
Define the two matrices
= αβ prime and C = βperp(αprimeperpβperp)minus1αprimeperp
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 151
Tabl
e5
Var
iatio
nsan
dC
ovar
iatio
nfo
rN
oise
and
Effi
cien
tPric
efo
rth
eYe
ar20
04
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
AA
230
79
147
89
173
minus30
minus75
minus81
minus86
[10
04
61]
[39
158
][6
62
92]
[31
210
][7
73
41]
[minus1
01
13]
[minus2
13minus
05]
[minus2
05minus
09]
[minus2
30minus
12]
AX
P6
73
14
12
04
6minus
08minus
16minus
17minus
19[2
91
33]
[17
54]
[19
76]
[07
41]
[21
89]
[minus2
80
4][minus
43
00]
[minus4
5minus
01]
[minus5
0minus
01]
BA
146
63
92
46
110
minus17
minus35
minus40
minus45
[63
284
][2
81
19]
[42
180
][1
89
6][5
12
26]
[minus7
00
9][minus
93
minus03
][minus
100
minus
08]
[minus1
20minus
05]
C1
014
97
33
57
80
4minus
20minus
22minus
23[4
32
06]
[26
72]
[33
133
][1
27
4][3
71
57]
[minus1
91
7][minus
57
minus01
][minus
62
minus03
][minus
66
minus02
]C
AT1
616
39
44
51
12minus
36minus
37minus
42minus
46[7
23
08]
[27
116
][3
51
92]
[15
84]
[45
218
][minus
88
minus07
][minus
86
minus03
][minus
93
minus08
][minus
104
minus
05]
DD
112
57
81
34
87
minus04
minus22
minus22
minus22
[52
206
][3
49
2][3
71
54]
[14
74]
[42
157
][minus
28
11]
[minus6
40
1][minus
63
minus02
][minus
61
minus01
]D
IS1
681
651
576
21
663
5minus
19minus
23minus
26[7
03
07]
[88
270
][6
13
11]
[23
121
][7
03
13]
[07
74]
[minus6
61
0][minus
69
minus01
][minus
82
04]
EK
230
89
128
74
152
minus50
minus67
minus73
minus79
[79
472
][3
81
72]
[51
277
][2
11
80]
[56
342
][minus
166
03
][minus
196
minus
02]
[minus2
05minus
04]
[minus2
18minus
03]
GE
87
87
80
40
83
22
minus24
minus25
minus26
[39
180
][5
21
28]
[33
155
][1
18
3][3
81
52]
[10
38]
[minus6
20
0][minus
66
minus03
][minus
68
minus02
]G
M1
284
57
53
98
1minus
15minus
35minus
37minus
39[5
42
36]
[23
80]
[32
152
][1
39
1][3
81
52]
[minus5
30
8][minus
96
minus04
][minus
94
minus06
][minus
103
minus
07]
HD
137
70
93
48
98
minus04
minus38
minus39
minus40
[60
267
][3
61
28]
[38
178
][1
71
01]
[43
203
][minus
33
15]
[minus9
5minus
03]
[minus9
5minus
05]
[minus1
00minus
02]
HO
N2
038
51
416
71
59minus
17minus
45minus
49minus
53[8
53
94]
[43
143
][6
12
86]
[21
147
][6
93
07]
[minus6
91
9][minus
145
02
][minus
142
minus
04]
[minus1
52
00]
HP
Q2
072
021
697
51
792
7minus
33minus
36minus
40[8
04
07]
[12
82
96]
[79
317
][2
71
42]
[81
310
][minus
03
59]
[minus1
28
08]
[minus1
11
04]
[minus1
26
07]
IBM
90
40
63
26
69
minus03
minus19
minus22
minus25
[36
177
][1
76
7][2
61
23]
[09
54]
[31
129
][minus
23
10]
[minus5
1minus
01]
[minus5
9minus
03]
[minus6
9minus
04]
INT
C2
363
558
26
68
0minus
13minus
83minus
83minus
82[1
07
421
][2
08
524
][3
41
43]
[23
125
][3
41
35]
[minus9
64
5][minus
160
minus
30]
[minus1
59minus
30]
[minus1
60minus
29]
IP1
195
17
93
58
5minus
14minus
26minus
28minus
31[4
62
50]
[21
107
][3
01
65]
[11
70]
[34
170
][minus
47
09]
[minus7
60
2][minus
73
minus01
][minus
77
minus01
]
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
152 Journal of Business amp Economic Statistics April 2006
Tabl
e5
(con
tinue
d)
RV
plowastsum ie
2 itr
sum ie2 i
ask
sum ie2 i
mid
sum ie2 i
bid
sum ieiylowast i
trsum ie
iylowast i
ask
sum ieiylowast i
mid
sum ieiylowast i
bid
JNJ
81
40
50
27
57
minus06
minus21
minus23
minus24
[33
166
][2
56
3][2
29
8][0
95
3][2
69
9][minus
24
05]
[minus5
5minus
01]
[minus5
7minus
03]
[minus5
6minus
02]
JPM
108
60
74
36
82
0minus
23minus
23minus
24[4
62
43]
[33
98]
[31
164
][1
19
2][3
71
71]
[minus2
41
3][minus
79
01]
[minus7
7minus
01]
[minus8
30
0]K
O9
55
36
32
76
3minus
02minus
20minus
20minus
20[4
21
85]
[31
87]
[32
113
][0
95
3][3
11
11]
[minus2
61
3][minus
59
00]
[minus5
8minus
01]
[minus5
60
0]M
CD
139
94
105
46
113
04
minus26
minus29
minus31
[60
314
][5
41
49]
[46
209
][1
51
23]
[52
220
][minus
30
26]
[minus1
16
05]
[minus1
07
00]
[minus1
07
06]
MM
M1
093
25
62
75
9minus
20minus
28minus
30minus
32[4
32
11]
[14
62]
[23
111
][1
05
9][2
61
14]
[minus5
2minus
01]
[minus7
1minus
05]
[minus7
5minus
08]
[minus7
7minus
07]
MO
148
49
84
48
89
minus23
minus48
minus49
minus49
[34
317
][2
08
1][2
71
90]
[10
116
][2
81
85]
[minus7
30
7][minus
131
minus
01]
[minus1
24minus
02]
[minus1
28
00]
MR
K1
306
19
04
79
7minus
06minus
35minus
37minus
40[4
23
84]
[21
152
][3
02
50]
[12
140
][3
12
36]
[minus4
31
8][minus
136
00
][minus
140
minus
02]
[minus1
42minus
01]
MS
FT
116
279
41
33
41
08
minus37
minus37
minus37
[50
219
][1
97
395
][1
77
7][1
36
3][1
77
9][minus
22
26]
[minus8
1minus
11]
[minus8
2minus
12]
[minus8
4minus
13]
PG
87
32
58
29
67
minus10
minus22
minus24
minus27
[34
164
][1
35
9][2
31
10]
[10
60]
[31
127
][minus
30
04]
[minus5
2minus
02]
[minus5
2minus
04]
[minus6
4minus
04]
SB
C1
361
249
64
31
020
9minus
26minus
27minus
27[5
62
64]
[74
174
][3
61
77]
[10
95]
[41
199
][minus
27
34]
[minus9
60
5][minus
91
00]
[minus9
10
3]T
165
194
129
50
142
10
minus21
minus23
minus25
[62
332
][1
05
314
][5
72
39]
[15
109
][5
82
74]
[minus2
63
8][minus
84
15]
[minus8
40
8][minus
101
13
]U
TX
119
46
76
33
84
minus16
minus24
minus26
minus29
[45
247
][1
79
0][3
11
56]
[11
66]
[35
158
][minus
52
04]
[minus6
8minus
01]
[minus6
5minus
04]
[minus7
7minus
02]
WM
T1
113
77
34
38
1minus
04minus
33minus
36minus
38[5
32
22]
[18
67]
[38
132
][2
08
3][4
11
43]
[minus3
31
0][minus
74
minus08
][minus
81
minus11
][minus
83
minus11
]X
OM
78
45
55
28
58
05
minus20
minus20
minus21
[34
160
][2
66
9][2
21
11]
[08
63]
[23
120
][minus
09
18]
[minus5
4minus
01]
[minus6
0minus
02]
[minus6
2minus
02]
NO
TE
A
nnua
lave
rage
sof
the
real
ized
varia
nce
ofef
ficie
ntpr
ice
and
nois
epr
oces
ses
corr
espo
ndin
gto
diffe
rent
pric
ese
ries
The
effic
ient
pric
ean
dth
eno
ise
proc
esse
sw
ere
dedu
ced
from
daily
estim
ates
ofa
coin
tegr
atio
nve
ctor
auto
regr
essi
onm
odel
T
henu
m-
bers
inth
esq
uare
dbr
acke
tsar
eth
e5
and
95
quan
tiles
for
the
diffe
rent
stat
istic
s
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 153
Figure 8 The Estimated Efficient Price Over Three 20-Minute Subperiods on April 24 2004 for AA The shaded area shows the best bid andask quotes and the star ( ) represent actual transaction prices
As shown by Hansen (2005) the coefficients of C(L) = C0 +C1L+C2L2+middot middot middot are given recursively by the YulendashWalker typeequation
Ci =Ciminus1 +1Ciminus1 + middot middot middot + kminus1Ciminusk+1 for i ge 1
(9)
where Ci = Ci minus Ciminus1 C0 = I minus C and Cminus1 = Cminus2 =middot middot middot = minusC An interesting question is how transaction prices andquotes are affected by a change in the efficient price This ques-tion can be addressed with reference to the Granger representa-tion
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
154 Journal of Business amp Economic Statistics April 2006
Although the Granger representation theorem is an algebraicresult that does not rely on distributional assumptions inter-pretations drawn from it do rely on additional assumptions(see Hansen 2005) Under the restrictive Gaussian assumptionswith homoscedastic innovations equiv var(εti) it follows that
dpti+h
dplowastti= dpti+h
dεtitimes dεti
d(αprimeperpεti)times d(αprimeperpεti)
dplowastti
= (C+Ch)timesαperp1
αprimeperpαperptimes αprimeperpβperp
= δ(C+Ch)αperp = ι+ δChαperp
where
δ equiv αprimeperpβperpαprimeperpαperp
Given the lack of results that hold under weaker (and more ap-propriate) assumptions we compute estimates of ι+ δChαperph = 01 and interpret these as approximate impulse re-sponse functions (IRFs) Thus these ldquoprojectionsrdquo should beviewed as only indicative of the dynamic effect on bid askand transaction prices as a response to a change in the efficientprice The long-run effect is unity for all prices as it should bebecause dpti+hdplowastti rarr ι as h rarrinfin This follows from Ch rarr 0as h rarrinfin which holds under the standard I(1) assumptions(see eg Hansen 2005)
Figure 9 displays the estimated IRFs for transaction pricesbid and ask quotes as a response to a change in the efficientprice These results are based on the daily estimates for 2004The solid lines correspond to the average IRFs the dashed linesare the median IRF and the shaded area is bounded by the5 and 95 quantiles across the days in 2004 Although mostof the price change occurs instantaneously the estimated IRFssuggest that it takes about 10 transactions before the full ef-fect is absorbed into transaction and quoted prices for AA andonly 3ndash5 transactions for MSFT Also note that the instanta-neous effect on quotes is larger for MSFT than for AA (asymp9compared with asymp8) and the total effect on quoted prices isabsorbed quicker This is likely explained by the differencesbetween specialist quotes on the NYSE and the competitivequotes on NASDAQ
7 SUMMARY AND CONCLUDING REMARKS
We have analyzed the properties of market microstructurenoise and its effect on empirical measures of volatility Our useof kernel-based estimators revealed several important proper-ties about market microstructure noise and we have shown thatkernel-based estimators are very useful in this context
More importantly our empirical analysis uncovered severalcharacteristics of market microstructure noise the most notableof which are that the noise process is time-dependent and thatthe noise process is correlated with the latent efficient returnsThese results were established for both transaction data andquotation data and were found to hold for intraday returns basedon both calendar time and tick time sampling
We also evaluated the accuracy of distributional results basedon an assumption that there is no market microstructure noiseWe showed that a ldquono-noiserdquo RMSE based on the results of
Barndorff-Nielsen and Shephard (2002) provides a reasonablyaccurate approximation when intraday returns are sampled atlow frequencies such as 20-minute sampling At low sam-pling frequencies small-sample issues can arise such that cov-erage probabilities may be inaccurate (see eg Gonccedilalves andMeddahi 2005) but market microstructure noise is not to beblamed for such issues When intraday returns are sampled athigher frequencies the accuracy of ldquono-noiserdquo approximationsis likely to deteriorate The analogous ldquono-noiserdquo confidence in-terval about RV(m)
AC1yields a more accurate approximation than
that of RV(m) but both are very misleading when intraday re-turns are sampled at high frequencies
Several results in the literature (including our theoretical re-sults given in Sec 3) that analyze volatility estimation fromhigh-frequency data contaminated with market microstructurenoise have assumed that the noise process is independent ofthe efficient price and uncorrelated in time Our empirical re-sults suggest that the implications of these assumptions mayhold (at least approximately) when intraday returns are sam-pled at relatively low frequencies Thus the conclusions of thesearticles may hold as long as intraday returns are not sampledmore frequently than say every 30 ticks On the other handour empirical results have also shown that sampling at ultra-high frequencies such as every few ticks necessitates moregeneral assumptions about the dependence structure of marketmicrostructure noise We established these results in Section 4where we used a general specification for the noise processthat can accommodate both types of dependency Our cointe-gration analysis produced results consistent with both formsof dependencies Volatility signature plots revealed a negativenoisendashprice correlation in quoted price series and the cointe-gration analysis showed that the negative correlation is foundin all price series including transaction prices
Although the main focus of our analysis has been on theproperties of market microstructure noise our analysis also pro-vides some insight into the problem of volatility estimation inthe presence of market microstructure noise Under the inde-pendent noise assumption our comparison of RV(m)
AC1to the
standard measure of realized variance revealed a substantial im-provement in the precision because the theoretical reduction ofthe RMSE is about 25ndash50 These gains were achieved witha simple bias correction that incorporates the first-order autoco-variance and additional improvements are possible with moresophisticated corrections of the realized variance For examplethe kernel estimators of Barndorff-Nielsen et al (2004) and sub-sample estimators of Zhang et al (2005) and Zhang (2004) havebetter asymptotic properties than RV(m)
AC1 Among the estimators
that we have analyzed in this article RV(1 tick)ACNW30
appears to cap-ture the time dependence found in the noise However there arepotential gains from allowing for some bias in exchange for areduction of the variance This may particularly be the case inrecent years where we find the noise (and hence the bias) tobe very small Thus correcting for a smaller number of auto-covariances may be better in terms of the RMSE so RV(1 tick)
ACNW10
may be a better estimator than RV(1 tick)ACNW30
although the formeris more likely to be biased
Although the literature has made great progress on the prob-lem of estimating the quadratic variation in the presence of
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 155
Figure 9 The Estimated IRFs (average over days in the year 2004) for Transaction Prices and Bid and Ask Quotes That Show the Dynamic Effectof an Increase in the Efficient Price The left panels are for AA and the right panels are for MSFT The solid lines correspond to the average IRFsthe dashed lines are the median IRF and the shaded area is bounded by the 5 and 95 quantiles
noise many aspects of this problem are not yet fully under-stood mainly because market microstructure noise appears tobe more complex than can be accommodated by a simple spec-
ification for the noise Furthermore the existing estimators ofvolatility may be improved on by incorporating additional in-formation such as volume number of transactions per day and
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
156 Journal of Business amp Economic Statistics April 2006
the information from limit order books Exploiting the discrete-ness of the data (as in Large 2005 Oomen 2006) is anotherpossible avenue for improving existing estimators
ACKNOWLEDGMENTS
The authors thank seminar participants at University ofCopenhagen University of Aarhus Nuffield College CarnegieMellon University the Econometric Forecasting and High-Frequency Data Analysis Symposium in Singapore the con-ference Innovations in Financial Econometrics in Celebrationof the 2003 Nobel and the 2005 PrincetonndashChicago Confer-ence on the Econometrics of High-Frequency Financial Datafor many valuable comments They are particularly grateful toFrederico Bandi Albert Chun Eric Ghysels Joel HasbrouckJeremy Large Bruce Lehmann Per Mykland Neil Shephardtwo anonymous referees and Torben G Andersen (the editor)for their suggestions that improved this manuscript All errorsremain the responsibility of the authors Financial support fromthe Danish Research Agency (grant no 24-00-0363) is grate-fully acknowledged
APPENDIX A PROOFS
As stated earlier we condition on σ 2(t) in our analysisThus without loss of generality we treat σ 2(t) as a determinis-tic function in our derivations
Proof of Lemma 1
First we note that p(τ ) is continuous and piecewise linearon [ab] Thus p(τ ) satisfies the Lipschitz condition exist δ gt 0such that |p(τ ) minus p(τ + ε)| le δε for all ε gt 0 and all τ Withε = (bminus a)m we have that
msumi=1
y2im le
msumi=1
δ2(bminus a)2mminus2 = δ2(bminus a)2
m
which demonstrates that RV(m) prarr 0 as m rarrinfin and that theQV of p(τ ) is 0 with probability 1
Proof of Theorem 1
From the identities RV(m) = summi=1[ylowast2
im + 2ylowastimeim + e2im]
and E(summ
i=1 ylowastimeim) = ρm it follows that the bias is given bysummi=1 2E(ylowastimeim) + E(e2
im) = 2ρm + mE(e2im) The result of
the theorem now follows from the identity
E(e2im)= E
[u(iδm)minus u((iminus 1)δm)
]2 = 2[π(0)minus π(δm)]given Assumption 2
Proof of Corollary 1
Because m = (bminus a)δm we have that
limmrarrinfinm[π(0)minus π(δm)] = lim
mrarrinfin(bminus a)π(0)minus π(δm)
δm
= minus(bminus a)π prime(0)
under the assumption that π prime(0) is well defined
Proof of Lemma 2
The bias follows directly from the decomposition y2im =
ylowast2im + e2
im + 2ylowastimeim because E(e2im) = E(uim minus uiminus1m)2 =
E(u2im) + E(u2
iminus1m) minus 2E(uimuiminus1m) = 2ω2 where we haveused Assumption 3(a) and (b) Similarly we see that
var(RV(m)
)= var
(msum
i=1
ylowast2im
)+ var
(msum
i=1
e2im
)
+ 4 var
(msum
i=1
ylowastimeim
)
because the three sums are uncorrelated The first sum involvesuncorrelated terms such that var(
summi=1 ylowast2
im)=summi=1 var(ylowast2
im)=2summ
i=1 σ 4im where the last equality follows from the Gaussian
assumption For the second sum we find that
E(e4im) = E(uim minus uiminus1m)4 = E(u2
im + u2iminus1m minus 2uimuiminus1m)2
= E(u4im + u4
iminus1m + 4u2imu2
iminus1m + 2u2imu2
iminus1m)+ 0
= 2micro4 + 6ω4
and
E(e2ime2
i+1m) = E(uim minus uiminus1m)2(ui+1m minus uim)2
= E(u2im + u2
iminus1m minus 2uimuiminus1m)
times (u2i+1m + u2
im minus 2ui+1muim)
= E(u2im + u2
iminus1m)(u2i+1m + u2
im)+ 0
= micro4 + 3ω4
where we have used Assumption 3(a)ndash(c) Thus var(e2im) =
2micro4 + 6ω4 minus [E(e2im)]2 = 2micro4 + 2ω4 and cov(e2
im e2i+1m) =
micro4 minus ω4 Because cov(e2im e2
i+hm) = 0 for |h| ge 2 it followsthat
var
(msum
i=1
e2im
)=
msumi=1
var(e2im)+
msumij=1i =j
cov(e2im e2
jm)
= m(2micro4 + 2ω4)+ 2(mminus 1)(micro4 minusω4)
= 4mmicro4 minus 2(micro4 minusω4)
The last sum involves uncorrelated terms such that
var
(msum
i=1
eimylowastim
)=
msumi=1
var(eimylowastim)= 2ω2msum
i=1
σ 2im
By the substitution 3κω4 = micro4 we obtain the expression for thevariance The asymptotic normality has been proven by Zhanget al (2005) using with an argument similar to one that we usefor RV(m)
AC1in our proof of Lemma 3 and that 2
summi=1 σ 4
im +2ω2 summ
i=1 σ 2im minus 4ω4 = O(1)
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 157
Proof of Lemma 3
First note that RV(m)AC1
= summi=1 Yim + Uim + Vim + Wim
where
Yim equiv ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
Uim equiv (uim minus uiminus1m)(ui+1m minus uiminus2m)
Vim equiv ylowastim(ui+1m minus uiminus2m)
and
Wim equiv (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m)
because yim(yiminus1m + yim + yi+1m)= (ylowastim + uim minus uiminus1m)times(ylowastiminus1m + ylowastim + ylowasti+1m + ui+1m minus uiminus2m) = Yim + Uim +Vim +Wim Thus the properties of RV(m)
AC1are given from those
of Yim Uim Vim and Wim Given Assumptions 1 and 3(a)it follows directly that E(Yim)= σ 2
im and E(Uim)= E(Vim)=E(Wim) = 0 demonstrating E[RV(m)
AC1] =summ
i=1 σ 2im Note that
E(Uim) consists of terms E(uimujm) where i = j so Assump-tion 3(a) suffices to establish that the expected value is 0 GivenAssumptions 1 and 3(a) and (b) the variance of RV(m)
AC1is given
by
var[RV(m)
AC1
] = var
[msum
i=1
Yim +Uim + Vim +Wim
]
= (1)+ (2)+ (3)+ (4)+ (5)
Because all other sums are uncorrelated the five parts aregiven by (1) = var(
summi=1 Yim) (2) = var(
summi=1 Uim) (3) =
var(summ
i=1 Vim) (4) = var(summ
i=1 Wim) and (5) = 2 timescov(
summi=1 Vim
summi=1 Wim) We derive the expressions of these
five terms as follow
1 Yim = ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m) and given Assump-
tion 1 it follows that E[ylowast2imylowast2
jm] = σ 2imσ 2
jm for i = j and
E[ylowast2imylowast2
jm] = E[ylowast4im] = 3σ 4
im for i = j such that
var(Yim) = 3σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m minus [σ 2
im]2
= 2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m
The first-order autocorrelation of Yim is
E[YimYi+1m]= E
[ylowastim(ylowastiminus1m + ylowastim + ylowasti+1m)
times ylowasti+1m(ylowastim + ylowasti+1m + ylowasti+2m)]
= E[ylowastim(ylowastim + ylowasti+1m)ylowasti+1m(ylowastim + ylowasti+1m)] + 0
= 2E[ylowast2imylowast2
i+1m] = 2σ 2imσ 2
i+1m
such that cov(YimYi+1m) = σ 2imσ 2
i+1m whereas cov(Yim
Yi+hm)= 0 for |h| ge 2 Thus
(1) =msum
i=1
(2σ 4im + σ 2
imσ 2iminus1m + σ 2
imσ 2i+1m)
+msum
i=2
σ 2imσ 2
iminus1m +mminus1sumi=1
σ 2imσ 2
i+1m
= 2msum
i=1
σ 4im + 2
msumi=1
σ 2imσ 2
iminus1m + 2msum
i=1
σ 2imσ 2
i+1m
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m
= 6msum
i=1
σ 4im minus 2
msumi=1
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2msum
i=1
σ 2im(σ 2
i+1m minus σ 2im)minus σ 2
1mσ 20m minus σ 2
mmσ 2m+1m
= 6msum
i=1
σ 4im minus 2
msumi=2
σ 2im(σ 2
im minus σ 2iminus1m)
+ 2mminus1sumi=1
σ 2im(σ 2
i+1m minus σ 2im)
minus σ 21mσ 2
0m minus σ 2mmσ 2
m+1m minus 2σ 21m(σ 2
1m minus σ 20m)
+ 2σ 2mm(σ 2
m+1m minus σ 2mm)
= 6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
2 Uim = (uim minus uiminus1m)(ui+1m minus uiminus2m) and fromE(U2
im)= E(uim minusuiminus1m)2E(ui+1m minusuiminus2m)2 it follows thatvar(Uim) = 4ω4 The first- and second-order autocovariancesare given by
E(UimUi+1m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+1m minus uim)(ui+2m minus uiminus1m)]
= E[uiminus1mui+1mui+1muiminus1m] + 0
= ω4
and
E(UimUi+2m) = E[(uim minus uiminus1m)(ui+1m minus uiminus2m)
times (ui+2m minus ui+1m)(ui+3m minus uim)]
= E[uimui+1mui+1muim] + 0
= ω4
whereas E(UimUi+hm) = 0 for |h| ge 3 Thus (2) = m4ω4 +2(mminus 1)ω4 + 2(mminus 2)ω4 = 8ω4mminus 6ω4
3 Vim = ylowastim(ui+1m minus uiminus2m) such that E(V2im) = σ 2
im times2ω2 and E[VimVi+hm] = 0 for all h = 0 Thus (3) =var(
summi=1 Vim)= 2ω2 summ
i=1 σ 2im
4 Wim = (uim minus uiminus1m)(ylowastiminus1m + ylowastim + ylowasti+1m) such that
E(W2im)= 2ω2(σ 2
iminus1m +σ 2im +σ 2
i+1m) The first-order autoco-variance equals
cov(WimWi+1m) = E[minusu2im(ylowast2
im + ylowast2i+1m)]
= minusω2(σ 2im + σ 2
i+1m)
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
158 Journal of Business amp Economic Statistics April 2006
whereas cov(WimWi+hm)= 0 for |h| ge 2 Thus
(4) =msum
i=1
[2ω2(σ 2
iminus1m + σ 2im + σ 2
i+1m)
minusmsum
i=2
ω2(σ 2im + σ 2
iminus1m)minusmminus1sumi=1
ω2(σ 2im + σ 2
i+1m)
]
= ω2msum
i=1
(σ 2iminus1m + σ 2
i+1m)
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im +ω2[σ 2
0m minus σ 2mm + σ 2
m+1m minus σ 21m]
+ω2[σ 21m + σ 2
0m + σ 2mm + σ 2
m+1m]
= 2ω2msum
i=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
5 The autocovariances between the last two terms are givenby
E[VimWi+hm] = E[ylowastim(ui+1m minus uiminus2m)(ui+hm minus uiminus1+hm)
times (ylowastiminus1+hm + ylowasti+hm + ylowasti+1+hm)]
showing that cov(VimWiplusmn1m)= ω2σ 2im whereas all other co-
variances are 0 From this we conclude that
(5) = 2
[2
msumi=1
ω2σ 2im minusω2(σ 2
1m + σ 2mm)
]
= 4ω2msum
i=1
σ 2im minus 2ω2(σ 2
1m + σ 2mm)
Adding up the five terms we find that
6msum
i=1
σ 4im minus 2
mminus1sumi=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m + 8ω4mminus 6ω4
+ 2ω2msum
i=1
σ 2im + 2ω2
msumi=1
σ 2im + 2ω2[σ 2
0m + σ 2m+1m]
+ 4ω2msum
i=1
σ 2im minus 2ω2[σ 2
1m + σ 2mm]
= 8ω4m+ 8ω2msum
i=1
σ 2im minus 6ω4 + 6
msumi=1
σ 4im + Rm
where
Rm equiv minus2msum
i=1
(σ 2i+1m minus σ 2
im)2 minus 2(σ 41m + σ 4
mm)
+ σ 21mσ 2
0m + σ 2mmσ 2
m+1m
+ 2ω2(σ 20m minus σ 2
1m + σ 2m+1m minus σ 2
mm)
Under BTS it follows immediately that Rm = O(mminus2) Un-der CTS we use the Lipschitz condition which states thatexist ε gt 0 such that |σ 2(t)minusσ 2(t+h)| le εh for all t and all h Thisshows that |σ 2
im| = | int timtiminus1m
σ 2(s)ds| le δ suptiminus1mlesletim σ 2(s)=O(mminus1) because δ = δim = (b minus a)m = O(mminus1) under CTSand
|σ 2im minus σ 2
iminus1m| =∣∣∣∣int tim
timinus1m
σ 2(s)minus σ 2(sminus δ)ds
∣∣∣∣
leint tim
timinus1m
|σ 2(s)minus σ 2(sminus δ)|ds
le δ suptiminus1mlesletim
|σ 2(s)minus σ 2(sminus δ)|
le δ2ε = O(mminus2)
Thussumm
i=1(σ2i+1m minus σ 2
im)2 le m middot (δ εm )2 = O(mminus3) which
proves that Rm = O(mminus2) under CTSThe asymptotic normality is established by expressing RV(m)
AC1as a sum of a martingale difference sequence Let uim = u(tim)and define the sigma algebra Fim = σ(ylowastim ylowastiminus1m uim
uiminus1m ) First note that yim(yiminus1m + yim + yi+1m) =σ 2
im + ξ(1)iminus1m + ξ
(2)im + ξ
(3)i+1m where
ξ(1)iminus1m equiv minusuiminus1mylowastiminus1m + uiminus1muiminus2m
ξ(2)im equiv ylowastimylowastim minus σ 2
im + ylowastimylowastiminus1m minus ylowastimuiminus2m
+ uimylowastiminus1m + uimylowastim minus uimuiminus2m minus uiminus1mylowastim
and
ξ(3)i+1m equiv ylowastimylowasti+1m + ylowastimui+1m + uimylowasti+1m
+ uimui+1m minus uiminus1mylowasti+1m minus uiminus1mui+1m
Thus if we define ξim equiv (ξ(1)im + ξ
(2)im + ξ
(3)im) (and use the con-
ventions ξ(2)0m = ξ
(3)0m = ξ
(3)1m = ξ
(1)mm = ξ
(1)m+1m = ξ
(2)m+1m = 0)
then it follows that [RV(m)AC1
minus IV] = summ+1i=0 ξim where ξim
Fim m+1i=0 is a martingale difference sequence that is squared-
integrable because
E(ξ2im)=
ω2σ 20 +ω4 ltinfin for i = 0
2ω4 + σ 20 σ 2
1 +ω2σ 20 + 2ω2σ 2
1 + σ 41 ltinfin
for i = 1
2σ 4im + 4σ 2
imσ 2iminus1m + 4σ 2
imω2
+ 4σ 2iminus1mω2 + 8ω4 ltinfin for 1 lt i lt m
σ 4m + 4σ 2
mσ 2mminus1 + 6ω2σ 2
m + 4ω2σ 2mminus1 + 5ω4 ltinfin
for i = m
σ 2mσ 2
m+1 + 2ω2σ 2m+1 + 2ω4 ltinfin
for i = m+ 1
Because mminus12[RV(m)AC1
minus IV] = mminus12 summ+1i=0 ξim we can ap-
ply the central limit theorem for squared-integrable martingales(see Shiryaev 1995 p 543 thm 4) where the only remain-ing condition to be verified is the conditional Lindeberg con-
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 159
dition
m+1sumi=0
E[mminus1ξ2
im1|mminus12ξim|gtε∣∣Fiminus1m
] prarr 0 as m rarrinfin
Because E[ξ2im1|mminus12ξim|gtε] is bounded by E[ξ2
im] lt infinand supi P(1|ξim|gtε
radicm = 0) rarr 0 for all ε gt 0 it follows
that
E
∣∣∣∣∣mminus1m+1sumi=0
ξ2im1|mminus12ξim|gtε minus 0
∣∣∣∣∣
le mminus1m+1sumi=0
∣∣E[ξ2
im1|mminus12ξim|gtε]minus 0
∣∣
rarr 0 as m rarrinfin
The Lindeberg condition now follows because convergencein L1 implies convergence in probability
Proof of Corollary 2
The MSEs are given from Lemmas 2 and 3 because BTSimplies that the O(mminus2) term in Lemma 3 (see the proof ofLemma 3) is given by
Rm = 0minus 2
(IV2
m2+ IV2
m2
)+ IV2
m2+ IV2
m2+ 0 =minus2IV2
m2
Equating part MSE(RV(m))partm prop 4λ2m + 6λ2 minus mminus2 with 0yields the first-order condition of the corollary and the sec-ond result follows similarly from part MSE(RV(m)
AC1)partm prop 4λ2 minus
3mminus2 + 2mminus3
Proof of Theorem 2
Under CTS we can define δm = (bminusa)m = tim minus timinus1m forall i = 1 m To simplify our notation let uim equiv u(tim) andnote that
E(eimei+hm) = E[uim minus uiminus1m][ui+hm minus ui+hminus1m]= 2π(hδm)minus π((hminus 1)δm)minus π((h+ 1)δm)
= [π(hδm)minus π((h+ 1)δm)
]minus [
π((hminus 1)δm)minus π(hδm)]
such thatqmsum
h=1
E(eimei+hm)
= [π(qmδm)minus π((qm + 1)δm)
]minus [π(0)minus π(δm)]where the first term equals 0 given Assumption 4 Further byAssumption 4 we have that ylowastim is uncorrelated with the noiseprocess when separated in time by at least θ0 Therefore
0 = E[ylowastim
(ui+qmm minus uiminusqmminus1m
)]= E
[ylowastim
(ei+qmm + middot middot middot + eiminusqmm
)]and similarly
0 = E[eim
(ylowasti+qmm + middot middot middot + ylowastiminusqmm
)]
which implies that
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[ylowastimei+hm + ylowastimeiminushm]
and
ρm =msum
i=1
E[ylowastimeim] = minusmsum
i=1
qmsumh=1
E[eimylowasti+hm + eimylowastiminushm]
For h = 0 we have that E(yimyi+hm)= (E[ylowastimei+hm + eim timesylowasti+hm]+E(eimei+hm) because E(ylowastimylowasti+hm)= 0 by Assump-tion 4 It now follows that
E
[ qmsumh=1
msumi=1
yimyiminushm + yimyi+hm
]
=minus2ρm minus 2m[π(0)minus π(δm)]which proves that RV(m)
ACqmis unbiased
APPENDIX B CONFIDENCE INTERVAL FORVOLATILITY SIGNATURE PLOTS
In our empirical analysis we seek a measure that is informa-tive about the precision of our bias-corrected RVs A minimumrequirement is that the sample average of RVAC is in a neigh-borhood of average integrated variance
σ 2 equiv nminus1nsum
t=1
IVt
which may be estimated by
σ 2 = nminus1nsum
t=1
IVt
where we use RV(1 tick)ACNW30t
equiv (RV(1 tick) trACNW30t
+ RV(1 tick) bidACNW30t
+RV(1 tick) ask
ACNW30t)3 as our choice for IVt because it is likely to
be conditionally unbiased for IVt Next we consider the prob-lem of constructing a confidence interval for σ 2 AndersenBollerslev Diebold and Labys (2000a 2003) have shown thatthe realized variance is approximately log-normally distributedHere we follow this idea and assume that log IVt sim N(ξω2)such that the unconditional expected value is given by E[IVt] =σ 2 = exp(ξ + ω22) and var(IVt) = [exp(ω2) minus 1] exp(ω2 +2ξ)
Now ξ plusmn σξc1minusα2 is a (1 minus α)-confidence interval for ξ
where ξ = nminus1 sumnt=1 log IVt σ 2
ξequiv var(ξ ) and c1minusα2 is the ap-
propriate quantile from the standard normal distribution Be-cause ξ = log(micro)minusω2 it follows that
P[l le ξ le u] = P[l+ω2 le log(σ 2)le u+ω2]= P
[exp(l+ω2)le σ 2 le exp(u+ω2)
]
such that the confidence interval for ξ can be converted into onefor σ 2 These calculations require that ω2 be known whereas inpractice we must estimate ω2 Thus we define ηt equiv log IVt minus ξ
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
160 Journal of Business amp Economic Statistics April 2006
and use the NeweyndashWest estimator
ω2 equiv 1
nminus 1
nsumt=1
η2t + 2
qsumh=1
(1minus h
q+ 1
)1
nminus h
nminushsumt=1
ηtηt+h
where q = int[4(n100)29]
and subsequently set σ 2ξequiv ω2n An approximate confidence
interval for σ 2 is now given by
CI(σ 2)equiv exp(log(σ 2
)plusmn σξc1minusα2
)
which we recenter about log(σ 2) (rather than ξ+ ω2) because
this ensures that the sample average of IVt is inside the confi-
dence interval σ 2 isin CI(σ 2)
APPENDIX C ESTIMATION OF COINTEGRATIONVECTOR AUTOREGRESSION
To avoid a linear deterministic trend in the price series weimpose the constraint micro = αρ where ρ = (ρ1 ρ2)
prime (Althougha linear deterministic trend may be quite sensible over a verylong period an estimate of such would be mostly spuriouswithin a single day) On the other hand we do not want toentirely exclude the constant because we want to allow for abidndashask spread to have a nontrivial expected value and this isfully accommodated by the restriction micro= αρ
Although the model is simple to estimate by least squareswhen the constant in unrestricted or set to 0 the estimation un-der a restricted constant is slightly more complicated
1 Regress pti on (pprimetiminus1β ιpprimetiminus1
pprimetiminusk+1)prime by least
squares and obtain (α micro 1 kminus1)2 Now set ρ = (αprimeα)minus1αprimemicro (minusρ1 captures average dif-
ference between transaction prices and the mid-quotesand minusρ2 captures ldquoaveragerdquo spread between ask and bidquotes)
3 Regress pti on (pprimetiminus1β + ρprimepprimetiminus1
pprimetiminusk+1)prime by
least squares and obtain (α 1 kminus1)
Define αperp equiv 1ς[ι minus α(αprimeα)minus1αprimeι] where ς equiv ιprimeι minus ιprimeα times
(αprimeα)minus1αprimeι Then it follows that αprimeαperp = 1ς[αprimeιminus αprimeα(αprimeα)minus1times
αprimeι] = 1ς[αprime
ιminus αprimeι] = 0 and ιprimeαperp = 1
ς[ιprimeιminus ιprimeα(αprimeα)minus1αprimeι] = 1
In the impulse response analysis we use the estimator equiv1m
summi=1(εti minus ε)(εti minus ε)prime
[Received February 2004 Revised September 2005]
REFERENCES
Aiumlt-Sahalia Y Mykland P A and Zhang L (2005a) ldquoHow Often to Samplea Continuous-Time Process in the Presence of Market Microstructure NoiserdquoReview of Financial Studies 18 351ndash416
(2005b) ldquoUltra-HighndashFrequency Volatility Estimation With Depen-dent Microstructure Noiserdquo Working Paper 11380 National Bureau of Eco-nomic Research
Amihud Y and Mendelson H (1987) ldquoTrading Mechanisms and Stock Re-turns An Empirical Investigationrdquo Journal of Finance 42 533ndash553
Andersen T G and Bollerslev T (1997) ldquoIntraday Periodicity and VolatilityPersistence in Financial Marketsrdquo Journal of Empirical Finance 4 115ndash158
(1998) ldquoAnswering the Skeptics Yes Standard Volatility Models DoProvide Accurate Forecastsrdquo International Economic Review 39 885ndash905
Andersen T G Bollerslev T and Diebold F X (2003) ldquoSome Like ItSmooth and Some Like It Rough Untangling Continuous and Jump Com-ponents in Measuring Modeling and Forecasting Asset Return Volatilityrdquoworking paper Northwestern University Duke University and University ofPennsylvania
Andersen T G Bollerslev T Diebold F X and Ebens H (2001) ldquoTheDistribution of Realized Stock Return Volatilityrdquo Journal of Financial Eco-nomics 61 43ndash76
Andersen T G Bollerslev T Diebold F X and Labys P (2000a)ldquoExchange Rate Return Standardized by Realized Volatility Are (Nearly)Gaussianrdquo Multinational Finance Journal 4 159ndash179
(2000b) ldquoGreat Realizationsrdquo Risk 13 105ndash108(2003) ldquoModeling and Forecasting Realized Volatilityrdquo Economet-
rica 71 579ndash625Andersen T G Bollerslev T and Meddahi N (2004) ldquoAnalytic Evaluation
of Volatility Forecastsrdquo International Economic Review 45 1079ndash1110Andreou E and Ghysels E (2002) ldquoRolling-Sample Volatility Estimators
Some New Theoretical Simulation and Empirical Resultsrdquo Journal of Busi-ness amp Economic Statistics 20 363ndash376
Andrews D W K (1991) ldquoHeteroskedasticity- and Autocorrelation-Con-sistent Covariance Matrix Estimationrdquo Econometrica 59 817ndash858
Awartani B Corradi V and Distaso W (2004) ldquoTesting and Modelling Mar-ket Microstructure Effects With an Application to the Dow Jones IndustrialAveragerdquo unpublished manuscript Queen Mary University of London andUniversity of Exeter
Bai X Russell J R and Tiao G C (2004) ldquoEffects of Non-Normality andDependence on the Precision of Variance Estimates Using High-FrequencyFinancial Datardquo working paper University of Chicago Graduate School ofBusiness
Baillie R T Booth G G Tse Y and Zabotina T (2002) ldquoPrice Discoveryand Common Factor Modelsrdquo Journal of Financial Markets 5 309ndash321
Bandi F M and Phillips P C (2004) ldquoA Simple Approach to the Paramet-ric Estimation of Potentially Nonstationary Diffusionsrdquo unpublished manu-script University of Chicago and Yale University
Bandi F M and Russell J R (2005) ldquoMicrostructure Noise Realized Volatil-ity and Optimal Samplingrdquo working paper University of Chicago GraduateSchool of Business
Barndorff-Nielsen O E Hansen P R Lunde A and Shephard N (2004)ldquoRegular and Modified Kernel-Based Estimators of Integrated Variance TheCase With Independent Noiserdquo working paper Stanford University Dept ofEconomics httpwwwstanfordedupeoplepeterhansen
Barndorff-Nielsen O E Nielsen B Shephard N and Ysusi C (1996)ldquoMeasuring and Forecasting Financial Variability Using Realised VarianceWith and Without a Modelrdquo in State Space and Unobserved ComponentsModels Theory and Application eds A C Harvey S J Koopman andN Shephard Cambridge UK Cambridge University Press pp 205ndash235
Barndorff-Nielsen O E and Shephard N (2002) ldquoEconometric Analysis ofRealised Volatility and Its Use in Estimating Stochastic Volatility ModelsrdquoJournal of the Royal Statistical Society Ser B 64 253ndash280
(2003) ldquoRealized Power Variation and Stochastic Volatilityrdquo Ber-noulli 9 243ndash265
(2004) ldquoPower and Bipower Variation With Stochastic Volatility andJumpsrdquo (with discussion) Journal of Financial Econometrics 2 1ndash48
(2006a) ldquoEconometrics of Testing for Jumps in Financial EconomicsUsing Bipower Variationrdquo Journal of Financial Econometrics 4 41ndash30
(2006b) ldquoImpact of Jumps on Returns and Realised Variances Econo-metric Analysis of Time-Deformed Levy Processesrdquo Journal of Economet-rics 131 217ndash252
(2007) ldquoVariation Jumps Market Frictions and High-Frequency Datain Financial Econometricsrdquo in Advances in Economics and Economet-rics Theory and Applications Ninth World Congress eds R BlundellT Persson and W K Newey Econometric Society Monographs Cam-bridge UK Cambridge University Press
Billingsley P (1995) Probability and Measure (3nd ed) New York WileyBlack F (1976) ldquoNoiserdquo Journal of Finance 41 529ndash543Bollen B and Inder B (2002) ldquoEstimating Daily Volatility in Financial Mar-
kets Utilizing Intraday Datardquo Journal of Empirical Finance 9 551ndash562Bollerslev T Kretschmer U Pigorsch C and Tauchen G (2005) ldquoThe Dy-
namics of Bipower Variation Realized Volatility and Returnsrdquo unpublishedmanuscript Duke University Dept of Economics
Christensen K and Podolskij M (2005) ldquoAsymptotic Theory for Range-Based Estimation of Integrated Variance of a Continuous Semi-Martingalerdquoworking paper Aarhus School of Business and Ruhr University of Bochum
Corsi F Zumbach G Muumlller U and Dacorogna M (2001) ldquoConsistentHigh-Precision Volatility From High-Frequency Datardquo Economic Notes 30183ndash204
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123
Hansen and Lunde Realized Variance and Market Microstructure Noise 161
Curci G and Corsi F (2004) ldquoDiscrete Sine Transform Approach for Real-ized Volatility Measurementrdquo unpublished manuscript University of South-ern Switzerland
Dacorogna M M Gencay R Muumlller U Olsen R B and Pictet O V(2001) An Introduction to High-Frequency Finance London AcademicPress
de Jong F (2002) ldquoMeasures of Contributions to Price Discovery A Compar-isonrdquo Journal of Financial Markets 5 323ndash327
Easley D and OrsquoHara M (1987) ldquoPrice Trade Size and Information in Se-curities Marketsrdquo Journal of Financial Economics 16 69ndash90
(1992) ldquoTime and the Process of Security Price Adjustmentrdquo Journalof Finance 47 576ndash605
Ebens H (1999) ldquoRealized Stock Volatilityrdquo Working Paper 420 Johns Hop-kins University Dept of Economics
Engle R and Sun Z (2005) ldquoForecasting Volatility Using Tick by Tick Datardquounpublished manuscript New York University Stern School of Business
Fang Y (1996) ldquoVolatility Modeling and Estimation of High-Frequency DataWith Gaussian Noiserdquo unpublished doctoral thesis MIT Sloan School ofManagement
French K R Schwert G W and Stambaugh R F (1987) ldquoExpected StockReturns and Volatilityrdquo Journal of Financial Economics 19 3ndash29
Frijns B and Lehnert T (2004) ldquoRealized Variance in the Presence of Non-iid Microstructure Noise A Structural Approachrdquo Working Paper 04-008Limburg Institute of Financial Economics
Ghysels E Santa-Clara P and Valkanov R (2006) ldquoPredicting VolatilityGetting the Most Out of Return Data Sampled at Different FrequenciesrdquoJournal of Econometrics 131 59ndash95
Glosten L and Milgrom P (1985) ldquoBid Ask and Transaction Prices in aSpecialist Market With Heterogeneously Informed Tradersrdquo Journal of Fi-nancial Economics 13 71ndash100
Gonccedilalves S and Meddahi N (2005) ldquoBootstrapping Realized Volatilityrdquounpublished manuscript Deacutepartement de Sciences Eacuteconomiques CIREQand CIRANO Universiteacute de Montreacuteal
Gonzalo J and Granger C W J (1995) ldquoEstimation of Common Long-Memory Components in Cointegrated Systemsrdquo Journal of Business amp Eco-nomic Statistics 13 27ndash36
Hansen P R (2005) ldquoGrangerrsquos Representation Theorem A Closed-Form Ex-pression for I(1) Processesrdquo Econometrics Journal 8 23ndash38
Hansen P R and Johansen S (1998) Workbook on Cointegration OxfordUK Oxford University Press
Hansen P R and Lunde A (2003) ldquoAn Optimal and Unbiased Measure ofRealized Variance Based on Intermittent High-Frequency Datardquo mimeo pre-pared for the CIREQndashCIRANO Conference Realized Volatility MontrealNovember 2003
(2005a) ldquoA Forecast Comparison of Volatility Models Does AnythingBeat a GARCH(11)rdquo Journal of Applied Econometrics 20 873ndash889
(2005b) ldquoA Realized Variance for the Whole Day Based on In-termittent High-Frequency Datardquo Journal of Financial Econometrics 13525ndash544
(2006) ldquoConsistent Ranking of Volatility Modelsrdquo Journal of Econo-metrics 131 97ndash121
Hansen P R Lunde A and Nason J M (2003) ldquoChoosing the Best Volatil-ity Models The Model Confidence Set Approachrdquo Oxford Bulletin of Eco-nomics and Statistics 65 839ndash861
Harris F H D McInish T H Shoesmith G L and Wood R A (1995)ldquoCointegration Error Correction and Price Discovery on InformationallyLinked Security Marketsrdquo Journal of Financial and Quantitative Analysis30 563ndash579
Harris F H D McInish T H and Wood R A (2002) ldquoSecurity Price Ad-justment Across Exchanges An Investigation of Common Factor Compo-nents for Dow Stocksrdquo Journal of Financial Markets 5 277ndash308
Harris L (1990) ldquoEstimation of Stock Variance and Serial Covariance FromDiscrete Observationsrdquo Journal of Financial and Quantitative Analysis 25291ndash306
(1991) ldquoStock Price Clustering and Discretenessrdquo Review of FinancialStudies 4 389ndash415
Hasbrouck J (1995) ldquoOne Security Many Markets Determining the Contri-butions to Price Discoveryrdquo Journal of Finance 50 1175ndash1198
(2002) ldquoStalking the lsquoEfficient Pricersquo in Market Microstructure Spec-ifications An Overviewrdquo Journal of Financial Markets 5 329ndash339
(2004) ldquoEmpirical Market Microstructure Economic and StatisticalPerspectives on the Dynamics of Trade in Securities Marketsrdquo lecture notesNew York University Stern School of Business
Huang X and Tauchen G (2005) ldquoThe Relative Contribution of Jumps toTotal Price Variancerdquo Journal of Financial Econometrics 3 456ndash499
Jacod J (1994) ldquoLimit of Random Measures Associated With the Incrementsof a Brownian Semimartingalerdquo unpublished manuscript Laboratorie deProbabilities Universite P and M Curie Paris
Jacod J and Protter P (1998) ldquoAsymptotic Error Distributions for the EulerMethod for Stochastic Differential Equationsrdquo The Annals of Probability26 267ndash307
Jansson M (2004) ldquoThe Error in Rejection Probability of Simple Autocorre-lation Robust Testsrdquo Econometrica 72 937ndash946
Johansen S (1988) ldquoStatistical Analysis of Cointegration Vectorsrdquo Journal ofEconomic Dynamics and Control 12 231ndash254
(1996) Likelihood Based Inference in Cointegrated Vector Autoregres-sive Models (2nd ed) Oxford UK Oxford University Press
Kiefer N M Vogelsang T J and Bunzel H (2000) ldquoSimple Robust Testingof Regression Hypothesesrdquo Econometrica 68 695ndash714
Koopman S J Jungbacker B and Hol E (2005) ldquoForecasting Daily Vari-ability of the SampP100 Stock Index Using Historical Realised and ImpliedVolatility Measurementsrdquo Journal of Empirical Finance 12 445ndash475
Large J (2005) ldquoEstimating Quadratic Variation When Quoted Prices Jumpby a Constant Incrementrdquo Economics Group Working Paper W05 NuffieldCollege
Lehmann B (2002) ldquoSome Desiderata for the Measurement of Price Discov-ery Across Marketsrdquo Journal of Financial Markets 5 259ndash276
Maheu J M and McCurdy T H (2002) ldquoNonlinear Features of Realized FXVolatilityrdquo Review of Economics amp Statistics 84 668ndash681
Meddahi N (2002) ldquoA Theoretical Comparison Between Integrated and Real-ized Volatilityrdquo Journal of Applied Econometrics 17 479ndash508
Merton R C (1980) ldquoOn Estimating the Expected Return on the Market AnExploratory Investigationrdquo Journal of Financial Economics 8 323ndash361
Muumlller U A (1993) ldquoStatistics of Variables Observed Over Overlapping In-tervalsrdquo discussion paper OampA Research Group
Muumlller U A Dacorogna M M Olsen R B Pictet O V Schwarz Mand Morgenegg C (1990) ldquoStatistical Study of Foreign Exchange RatesEmpirical Evidence of a Price Change Scaling Law and Intraday AnalysisrdquoJournal of Banking and Finance 14 1189ndash1208
Mykland P A and Zhang L (2006) ldquoANOVA for Diffusions and ItoProcessesrdquo The Annals of Statistics 34 forthcoming
Newey W and West K (1987) ldquoA Simple Positive Semi-Definite Hetero-skedasticity- and Autocorrelation-Consistent Covariance Matrixrdquo Econo-metrica 55 703ndash708
OrsquoHara M (1995) Market Microstructure Theory London BlackwellOomen R A C (2002) ldquoModelling Realized Variance When Returns Are Se-
rially Correlatedrdquo unpublished manuscript University of Warwick WarwickBusiness School
(2005) ldquoProperties of Bias-Corrected Realized Variance Under Alter-native Sampling Schemesrdquo Journal of Financial Econometrics 3 555ndash577
(2006) ldquoProperties of Realized Variance Under Alternative SamplingSchemesrdquo Journal of Business amp Economic Statistics 24 219ndash237
Owens J P and Steigerwald D G (2005) ldquoNoise-Reduced Realized Volatil-ity A Kalman Filter Approachrdquo unpublished manuscript Victoria Universityof Wellington and University of California Santa Barbara
Patton A (2005) ldquoVolatility Forecast Evaluation and Comparison Using Im-perfect Volatility Proxiesrdquo unpublished manuscript London School of Eco-nomics
Protter P (2005) Stochastic Integration and Differential Equations New YorkSpringer-Verlag
Roll R (1984) ldquoA Simple Implicit Measure of the Effective BidndashAsk Spreadin an Efficient Marketrdquo Journal of Finance 39 1127ndash1139
Shiryaev A N (1995) Probability (2nd ed) New York Springer-VerlagStein M L (1987) ldquoMinimum Norm Quadratic Estimation of Spatial Vari-
ogramsrdquo Journal of the American Statistical Association 82 765ndash772Tauchen G and Zhou H (2004) ldquoIdentifying Realized Jumps on Financial
Marketsrdquo unpublished manuscript Duke University Dept of EconomicsWasserfallen W and Zimmermann H (1985) ldquoThe Behavior of Intraday Ex-
change Ratesrdquo Journal of Banking and Finance 9 55ndash72West K D (1997) ldquoAnother Heteroskedasticity- and Autocorrelation-Con-
sistent Covariance Matrix Estimatorrdquo Journal of Econometrics 76 171ndash191Zhang L (2004) ldquoEfficient Estimation of Stochastic Volatility Using Noisy
Observations A Multi-Scale Approachrdquo research paper Carnegie MellonUniversity Dept of Statistics
Zhang L Mykland P A and Aiumlt-Sahalia Y (2005) ldquoA Tale of Two TimeScales Determining Integrated Volatility With Noisy High-Frequency DatardquoJournal of the American Statistical Association 100 1394ndash1411
Zhou B (1996) ldquoHigh-Frequency Data and Volatility in Foreign-ExchangeRatesrdquo Journal of Business amp Economic Statistics 14 45ndash52
(1998) ldquoParametric and Nonparametric Volatility Measurementin Nonlinear Modelling of High-Frequency Financial Time Series edsC L Dunis and B Zhou New York Wiley pp 109ndash123