
RiskMetrics Journal, Volume 8, Number 1, Winter 2008

› Volatility Forecasts and At-the-Money Implied Volatility

› Inflation Risk Across the Board

› Extensions of the Merger Arbitrage Risk Model

› Measuring the Quality of Hedge Fund Data

› Capturing Risks of Non-transparent Hedge Funds


A RiskMetrics Group Publication

On the Cover:

Weights wk(ΔT) as a function of the forecast horizon ΔT (Figure 1).


Editor’s Note

Christopher C. Finger, RiskMetrics Group

[email protected]

In this, the 2008 issue of the RiskMetrics Journal, we present five papers treating types of risk that are inadequately served by what we might call the “standard risk model”. By the standard model, we refer to a model that takes the natural observed market risk factors for each instrument, and models these factors with some continuous process. Applying this framework to less traditional asset classes presents a variety of challenges: infrequent (or non-existent) market observations, inconsistencies in conventions across instrument types and positions with risks not represented in the natural data choices. The papers herein represent a number of our efforts to tackle these challenges and to extend the standard model.

In the first paper, Gilles Zumbach considers volatility risk. Though this is a subject we have treated in the past, we are still faced with the problem of modeling implied volatility as a risk factor when consistent observations of implied volatility are not available. Gilles examines a number of processes for option underlyings, and studies the processes for volatility that these underlying processes induce. He shows empirically that these induced volatility processes do a nice job of capturing the dynamics of market implied volatility. Further, he compares the induced volatility to some standard pricing models and shows that while the induced processes do have a similar structure to these models, they ultimately describe a different evolution of volatility—a result that may have greater implications on how we treat volatility as a risk factor in the future.

In the next article, Fabien Couderc presents a framework for assessing inflation risk. Inflation is another topic we have examined previously, but the development of a variety of inflation products has meant that what at one point was a natural choice of risk factors is no longer adequate. To achieve a consistent and well behaved risk factor, Fabien proposes the use of break-even inflation, with adjustments to filter out both predictable effects (such as seasonality) and specific market conventions.

Our third article, by Stéphane Daul, is an extension of his article on merger arbitrage from last year. Merger arbitrage positions would seem to be nothing more than equity pairs, but the nature of the participants in a proposed merger—in particular, that the equities are no longer well represented by their historical price moves—requires a specific model. Stéphane extends the validation of his model from last year, and presents an empirical model for forecasting the probability of merger success. Interestingly, this simple model appears to outperform the market-implied probability, confirming the opportunities in this area of trading.

The final two articles both analyze hedge fund returns. Treated as a set of positions, of course, we may treat hedge fund portfolios with a standard risk model as long as the model is sophisticated enough for all of the fund’s positions. But in the absence of position information, hedge fund returns themselves present a new risk modeling challenge. One aspect of the challenge is the quality of the hedge fund return data. Daniel Straumann examines this issue in our fourth paper, introducing a scoring mechanism for hedge fund data quality. Daniel then investigates which aspects of hedge funds most influence the quality of the return data, and quantifies the performance bias induced by funds with poor data quality.

Finally, in our last paper, Stéphane Daul introduces a model for hedge fund returns. The unique aspect of this model is to treat hedge fund returns properly as time series, in order to capture the dynamics that we observe empirically. Through a backtesting exercise, Stéphane demonstrates the efficacy of the model, in particular the improvements it brings over the typical approaches in the hedge fund investment literature.


Volatility Forecasts and At-the-Money Implied Volatility

Gilles Zumbach, RiskMetrics Group

[email protected]

This article explores the relationship between realized volatility, implied volatility and several forecasts for the volatility built from multiscale linear ARCH processes. The forecasts are derived from the process equations, and the parameters set according to different risk methodologies (RM1994, RM2006). An empirical analysis across multiple time horizons shows that a forecast provided by an I-GARCH(1) process (one time scale) does not capture correctly the dynamic of the realized volatility. An I-GARCH(2) process (two time scales, similar to GARCH(1,1)) is better, but only a long memory LM-ARCH process (multiple time scales) replicates correctly the dynamic of the realized volatility. Moreover, the forecasts provided by the LM-ARCH process are close to the implied volatility. The relationship between market models for the forward variance and the volatility forecasts provided by ARCH processes is investigated. The structure of the forecast equations is identical, but with different coefficients. Yet the process equations for the variance induced by the process equations for an ARCH model are very different from those postulated for a market model, and not of any usual diffusive type when derived from ARCH.

1 Introduction

The intuition behind volatility is to measure price fluctuations, or equivalently, the typical magnitude for the price changes. Yet beyond the first intuition, volatility is a fairly complex concept for various reasons. First, turning this intuition into formulas and numbers is partly arbitrary, and many meaningful and useful definitions of volatilities can be given. Second, the volatility is not directly “observed” or traded, but rather computed from time series (although this situation is changing indirectly through the ever increasing and sophisticated option market and the volatility indexes). For trading strategies, options and risk evaluations, the valuable quantity is the realized volatility, namely the volatility that will occur between the current time t and some time in the future t + ΔT. As this quantity is not available at time t, a forecast needs to be constructed. Clearly, a better forecast of the realized volatility improves option pricing, volatility trading and portfolio risk management.

At a time t, a forecast for the realized volatility can be constructed from the (underlying) price time series. In this paper, multiscale ARCH processes are used. On the other hand, a liquid option market furnishes the implied volatility, corresponding to the “market” forecast for the realized volatility. On the theoretical side, an “instantaneous”, or effective, volatility σeff is needed to define processes, and the forward variance. Therefore, at a given time t, we have mainly one theoretical instantaneous volatility and three notions of “observable” volatility (forecasted, implied and realized). This paper studies the empirical relationship between these three time series, as a function of the forecast horizon ΔT.¹

The main line of this work is to model the underlying time series by multicomponent ARCH processes, and to derive an implied volatility forecast. This forecast is close to the implied volatility for the at-the-money option. Such an approach produces a volatility surface based only on the underlying time series, and therefore a surface can be inferred even when option data is poor or not available. This article does not address the issue of the full surface, but only the implied volatility for the at-the-money options, called the backbone.

A vast literature on implied volatility and its dynamic already exists. In this article, we will review some recent developments on market models for the forward variance. These models focus on the volatility as a process, and many process equations can be set that are compatible with a martingale condition for the volatility. On the other side, the volatility forecast as induced by a multicomponent ARCH process also leads to process equations for the volatility itself. These two approaches leading to the volatility process are contrasted, showing the formal similarity in the structure of the forecasts, but the very sharp difference in the processes for the volatility. If the price time series behave according to some ARCH process, then the implications for volatility modeling are far reaching, as the usual structure based on Wiener processes cannot be used.

This paper is organized as follows. The required definitions for the volatilities and forward variance are given in the next section. The various multicomponent ARCH processes are introduced in Section 3, and the induced volatility forecasts and processes are given in Sections 4 and 5. The market models and the associated volatility dynamics are presented in Section 6. The relationship between market models, options and the ARCH forecasts is discussed in Section 7. Section 8 presents an empirical investigation of the relationship between the forecasted, implied and realized volatilities. Section 9 concludes.

¹ There exists already an abundant literature on this topic, and Poon (2005) published a book nicely summarizing the available publications (approximately 100 articles on volatility forecasting alone!).


2 Definitions and setup of the problem

2.1 General

We assume to be at time t, with the corresponding information set Ω(t). The time increment for the processes and the granularity of the data is denoted by δt, and is one day in the present work. We assume that there exists an instantaneous volatility, denoted by σeff(t), which corresponds to the annualized expected standard deviation of the price in the next time step δt. This is a useful quantity for the definitions, but this volatility is essentially unobserved. In a process, σeff gives the magnitude of the returns, as in (9) below.

2.2 Realized volatility

The realized volatility corresponds to the annualized standard deviation of the returns in the interval between t and t + ΔT,

σ²(t, t+ΔT) = (1 year / (n δt)) ∑_{t < t′ ≤ t+ΔT} r²(t′),   (1)

where r(t) are the (unannualized) returns measured over the time interval δt, and the ratio 1 year/δt annualizes the volatility. The empirical section is done with daily data and the returns are evaluated over a one day interval. If the returns do not overlap in the sum, then ΔT = n δt. At the time t, the realized volatility cannot be evaluated from the information set Ω(t). The realized volatility is the useful quantity we would like to forecast and to relate to the implied volatility.
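The realized volatility (1) is straightforward to evaluate once the daily returns are at hand. The short Python sketch below is only an illustration of the formula, not code from the paper; the choice of 260 business days per year stands in for the ratio 1 year/δt and is an assumption.

import numpy as np

def realized_volatility(returns, days_per_year=260):
    """Annualized realized volatility, following equation (1).

    returns       : unannualized daily returns r(t') over (t, t + Delta T].
    days_per_year : plays the role of 1 year / delta t for daily data (assumed 260).
    """
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    variance = days_per_year / n * np.sum(returns ** 2)
    return np.sqrt(variance)

# Example with synthetic data: 21 daily returns (about one month) at 1% daily volatility.
rng = np.random.default_rng(0)
r = 0.01 * rng.standard_normal(21)
print(f"realized volatility over one month: {realized_volatility(r):.2%}")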

2.3 Forward variance

The expected cumulative variance is defined by

V(t, t+ΔT) = ∫_t^{t+ΔT} dt′ E[ σ²eff(t′) | Ω(t) ]   (2)

and the forward variance by

v(t, t+ΔT) = ∂V(t, t+ΔT) / ∂ΔT = E[ σ²eff(t+ΔT) | Ω(t) ].   (3)

The cumulative variance is an extensive quantity as it is proportional to ΔT. For empirical investigation, it is simpler to work with an intensive quantity as this removes a trivial dependency on the time horizon. For this reason, the cumulative variance is used only in the theoretical part (hence also the continuum definition with an integral), whereas the forecasted volatility is used in the empirical part.

The variance enters into the variable leg of a variance swap, and as such, it is tradable. Related tradable instruments are the volatility indexes like the VIX (but the relation is indirect as the index is defined through implied volatility of a basket of options). Because volatility is tradable, the forward variance should be a martingale

E[ v(t′, T) | Ω(t) ] = v(t, T).   (4)

For the volatility, this condition is quite weak as it follows also from the chain rule for conditional expectation

E[ E[ σ²eff(T) | Ω(t′) ] | Ω(t) ] = E[ σ²eff(T) | Ω(t) ]   for t < t′ < T   (5)

and from the definition of the forward variance as a conditional expectation. Therefore, any forecast built as a conditional expectation produces a martingale for the forward variance.

At this level, there is a formal analogy with interest rates, with the (zero coupon) interest rate and forward rate being analogous to the cumulative variance and forward variance. Therefore, some ideas and equations can be borrowed from the interest rate field. For example, on the modeling side, one can write a process for the cumulative variance or for the variance swap, the latter being more convenient as the martingale condition gives simpler constraints on the possible equations. In this paper, the ARCH path is followed, using a multiscale process for the underlying. The forward variance is computed as an expectation, and therefore the martingale property follows. In Section 6, this ARCH approach is contrasted with a direct model for the forward volatility, where the martingale condition has to be explicitly enforced.

2.4 The forecasted volatility

The forecasted volatility is defined by

σ²(t, t+ΔT) = (1/n) ∑_{t < t′ ≤ t+ΔT} E[ σ²eff(t′) | Ω(t) ].   (6)

Up to a normalization and the transformation of the integral into a discrete sum, this definition is similar to the expected cumulative variance.


2.5 The implied volatility

As usual, the implied volatility is defined as the volatility to insert into the Black-Scholes equation so as to recover the market price for the option. The implied volatility σBS(m, ΔT) is a function of the moneyness m and of the time to maturity ΔT. The moneyness can be defined in various ways, with most definitions similar to m ≃ ln(F/K), with F the forward F = S e^{r ΔT}. The (forward) at-the-money option corresponds to m = 0. The backbone is the implied volatility at the money, σBS(ΔT) = σBS(m = 0, ΔT), as a function of the time to maturity ΔT. For a given time to maturity ΔT, the implied volatility as a function of moneyness is called the smile.

Intuitively, the implied volatility surface can loosely be decomposed as backbone × smile. The rationale for this decomposition is that the two directions depend on different option features. The backbone is related to the expected volatility until the option expiry,

σ(t, t+ΔT) = σBS(m = 0, ΔT)(t).   (7)

In the Black-Scholes formula, the volatility appears only through the combination ΔT σ², corresponding to the cumulative expected variance. In the other direction, the smile is the fudge factor that remedies the incomplete modeling of the underlying by a Gaussian random walk. The Black-Scholes model has the key advantage of being solvable, but it does not include many stylized facts like heteroskedasticity, fat tails, or the leverage effect. These shortcomings translate into various “features” of the smile.
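As a concrete illustration of the backbone defined in (7), the Python sketch below prices a forward at-the-money call (m = ln(F/K) = 0, hence K = F) with the standard Black-Scholes formula and recovers the implied volatility by bisection. It is a generic illustration, not part of the original analysis; the example numbers and the bisection bounds are arbitrary assumptions.

from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    """Black-Scholes price of a European call option."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, r, T, lo=1e-4, hi=5.0, tol=1e-8):
    """Volatility to insert into Black-Scholes to recover 'price' (bisection search)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, mid, T) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Forward at-the-money example: K = F = S * exp(r*T), so that m = ln(F/K) = 0.
S, r, T = 100.0, 0.03, 0.5
K = S * exp(r * T)
price = bs_call(S, K, r, 0.20, T)       # market price generated with sigma = 20%
print(implied_vol(price, S, K, r, T))   # recovers approximately 0.20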

In principle, (7) should be checked using empirical data. Yet this comparison raises a number of issues, on both sides of the equation. On the left-hand side, the variance forecast should be computed using some equations and the time series for the underlying. The forecasting scheme, with its estimated parameters, is subject to errors. On the right-hand side, the option market has its own idiosyncrasies, for example related to demand and supply. Such effects can be clearly observed by computing the implied volatility corresponding to the option bid or ask prices. These points are discussed in more detail in Section 8. Therefore, (7) should be taken only as a first order approximation.

3 Multicomponent ARCH processes

3.1 The general setup

The basic idea of a multicomponent ARCH process is to measure historical volatilities using exponential moving averages on a set of time horizons, and to compute the effective volatility for the next time step as a convex combination of the historical volatilities. A first process along these lines was introduced in (Dacorogna, Muller, Olsen, and Pictet 1998), and this family of processes was thoroughly developed and explored in (Zumbach and Lynch 2001; Lynch and Zumbach 2003; Zumbach 2004). A particularly simple process with long memory is used to build the RM2006 risk methodology (Zumbach 2006). One of the key advantages of these multicomponent processes is that forecasts for the variance can be computed analytically. We will use this property to explore their relations with the option implied volatility.

In order to build the process, the historical volatilities are measured by exponential moving averages (EMA) at time scales τk,

σ²k(t) = µk σ²k(t−δt) + (1−µk) r²(t),   k = 1, …, n,   (8)

with decay coefficients µk = exp(−δt/τk). The process time increment, δt, is one day in this work. Let us emphasize that the σk are computed from historical data, and there is no hidden stochastic process like in a stochastic volatility model.

The “effective” variance σ²eff is a convex combination of the σ²k and of the mean variance σ²∞:

σ²eff(t) = ∑_{k=1}^{n} wk σ²k(t) + w∞ σ²∞ = σ²∞ + ∑_{k=1}^{n} wk ( σ²k(t) − σ²∞ ),

1 = ∑_{k=1}^{n} wk + w∞.

Finally, the price follows a random walk with volatility σeff,

r(t+δt) = σeff(t) ε(t+δt).   (9)

Depending on the number of components n, the time horizons τk and the weights wk, a number of interesting processes can be built. The processes we are using to compare with implied volatility are given in the next subsections.

On general grounds, we make the distinction between affine processes, for which the mean volatility is fixed by σ∞ and w∞ > 0, and linear processes, for which w∞ = 0. The linear and affine terms qualify the equations for the variance. The linear processes are very interesting for forecasting volatility as they have no mean volatility parameter σ∞, which clearly would be time series dependent. However, their asymptotic properties are singular, and affine processes should be used in Monte Carlo simulations. This subtle difference between the two classes of processes is discussed in detail in (Zumbach 2004). As this paper deals with volatility forecasts, only the linear processes are used.
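To make the construction above concrete, here is a minimal Python sketch of a linear multicomponent ARCH simulation following (8) and (9), with the effective variance taken as the convex combination of the component variances. The two time scales and weights (4 and 512 business days, 0.843 and 0.157) are the first I-GARCH(2) parameter set given later in Section 3.3; working in daily (unannualized) variance units and using Gaussian innovations are simplifying assumptions of this sketch.

import numpy as np

def simulate_multicomponent_arch(n_steps, taus, weights, sigma0=0.01, seed=0):
    """Simulate returns from a linear multicomponent ARCH process, eqs (8)-(9).

    taus    : characteristic times tau_k in days (the time step delta t is one day).
    weights : convex weights w_k summing to one (linear case, w_inf = 0).
    sigma0  : initial daily volatility used to seed the component variances.
    """
    taus = np.asarray(taus, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mus = np.exp(-1.0 / taus)                  # decay coefficients mu_k = exp(-dt/tau_k)
    var_k = np.full(len(taus), sigma0 ** 2)    # component variances sigma_k^2
    rng = np.random.default_rng(seed)
    returns = np.empty(n_steps)
    for t in range(n_steps):
        var_eff = weights @ var_k                               # effective variance (daily units)
        returns[t] = np.sqrt(var_eff) * rng.standard_normal()   # eq (9)
        var_k = mus * var_k + (1.0 - mus) * returns[t] ** 2     # EMA update, eq (8)
    return returns

r = simulate_multicomponent_arch(1000, taus=[4.0, 512.0], weights=[0.843, 0.157])
print(f"sample daily volatility: {r.std():.4f}")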


3.2 I-GARCH(1)

The I-GARCH(1) model corresponds to a 1-component linear process:

σ²(t) = µ σ²(t−δt) + (1−µ) r²(t),
σ²eff(t) = σ²(t).

It has one parameter τ (or equivalently µ). This process is equivalent to the integrated GARCH(1,1) process (Engle and Bollerslev 1986), and with a given value for µ it is equivalent to the standard RiskMetrics methodology (RM1994). Its advantage is to be the simplest, but it does not capture mean reversion for the forecast (that is, that long term forecasts should converge to the mean volatility).

For the empirical evaluation, the characteristic time has been fixed a priori to τ = 16 business days, corresponding to µ ≃ 0.94.

3.3 I-GARCH(2) and GARCH(1,1)

The I-GARCH(2) process corresponds to a two-component linear model:

σ²1(t) = µ1 σ²1(t−δt) + (1−µ1) r²(t),
σ²2(t) = µ2 σ²2(t−δt) + (1−µ2) r²(t),
σ²eff(t) = w1 σ²1(t) + w2 σ²2(t).

It has three parameters τ1, τ2 and w1. Even if this process is linear, it has mean reversion for time scales up to τ2, with σ2 playing the role of the mean volatility.

The GARCH(1,1) process (Engle and Bollerslev 1986) corresponds to the one-component affine model:

σ²1(t) = µ1 σ²1(t−δt) + (1−µ1) r²(t),
σ²eff(t) = (1−w∞) σ²1(t) + w∞ σ²∞.

It has three parameters τ1, w∞ and σ∞. In this form, the analogy between the I-GARCH(2) and GARCH(1,1) processes is clear, with the long term volatility σ2 playing a similar role as the mean volatility σ∞.


Given a process, the parameters need to be estimated on a time series. GARCH(1,1) is more problematic in that respect because σ∞ is clearly time series dependent. A good procedure is to estimate the parameters on a moving historical sample, say in a window between t − ΔT′ and t for a fixed span ΔT′. With this setup, the mean variance σ²∞ is essentially the sample variance ∑ r² computed on the estimating window. This is a rectangular moving average, similar to an EMA but for the weights given to the past. This argument shows that I-GARCH(2) and (a continuously re-estimated on a moving window) GARCH(1,1) behave similarly. A detailed analysis of both processes in (Zumbach 2004) shows that they have similar forecasting power, with an advantage to I-GARCH(2).

In this work, we use the I-GARCH(2) process with two parameter sets fixed a priori to some reasonable values. The first set is τ1 = 4 business days, τ2 = 512 business days, w1 = 0.843 and w2 = 0.157. The second set is τ1 = 16 business days, τ2 = 512 business days, w1 = 0.804 and w2 = 0.196. The values for the weights are obtained according to the long memory ARCH process, but with only two given τ components.

3.4 Long Memory ARCH

The idea for a long memory process is to use a multicomponent ARCH model with a large number of components but a simple analytical form for the characteristic times τk and the weights wk. For the long memory ARCH process, the characteristic times τk increase as a geometric series

τk = τ1 ρ^{k−1},   k = 1, …, n,   (10)

while the weights decay logarithmically

wk = (1/C) ( 1 − ln(τk)/ln(τ0) ),   (11)
C = ∑k ( 1 − ln(τk)/ln(τ0) ).

This choice produces lagged correlations for the volatility that decay logarithmically, as observed in the empirical data (Zumbach 2006). The parameters are taken as for the RM2006 methodology, namely τ1 = 4 business days, τn = 512 business days, ρ = √2 and the logarithmic decay factor τ0 = 1560 days = 6 years.
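The geometric time scales (10) and the logarithmically decaying weights (11) are easy to reproduce. The short Python sketch below uses the RM2006 parameter values quoted above (τ1 = 4 and τn = 512 business days, ρ = √2, τ0 = 1560 days), which yield n = 15 components; it is an illustration of the two formulas, not the RM2006 implementation itself.

import numpy as np

def lm_arch_components(tau1=4.0, tau_n=512.0, rho=np.sqrt(2.0), tau0=1560.0):
    """Characteristic times, eq (10), and normalized weights, eq (11), of the LM-ARCH process."""
    n = int(round(np.log(tau_n / tau1) / np.log(rho))) + 1   # number of components
    taus = tau1 * rho ** np.arange(n)                        # tau_k = tau_1 * rho^(k-1)
    raw = 1.0 - np.log(taus) / np.log(tau0)                  # logarithmic decay of the weights
    weights = raw / raw.sum()                                # normalization by C
    return taus, weights

taus, weights = lm_arch_components()
print(len(taus))          # 15 components, from 4 to 512 business days
print(weights.round(4))   # weights decay slowly (logarithmically) with tau_k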


Figure 1: Weights wk(ΔT) as a function of the forecast horizon ΔT
Long memory process with w∞ = 0.1 and τk = 2, 4, 8, 16, …, 256 days. Weight profiles for increasing characteristic times τk have decreasing initial values and maximum values going from left to right. [Plot: weights wk(ΔT) against the forecast horizon ΔT in days, logarithmic horizontal scale.]

4 Forward variance and multicomponent ARCH processes

For multiscale ARCH processes (I-GARCH, GARCH(1,1), long-memory ARCH, etc.), the forward variance can be computed analytically (Zumbach 2004; Zumbach 2006). The idea is to compute the conditional expectation of the process equations, from which iterative relations can be deduced. Then, some algebra and matrix computations produce the following form for the forward variance:

v(t, t+ΔT) = E[ σ²eff(t+ΔT) | Ω(t) ] = σ²∞ + ∑_{k=1}^{n} wk(ΔT) ( σ²k(t) − σ²∞ ).   (12)

The weights wk(ΔT) can be computed by a recursion formula depending on the decay coefficients µk and with initial values given by wk = wk(1). The equation for the forecast of the realized volatility has the same form, but the weights wk(ΔT) are different.

Let us emphasize that this can be done for all processes in this class (linear and affine). Moreover, the σ²k(t) are computed from the underlying time series, namely there is no hidden stochastic volatility to estimate. This makes volatility forecasts particularly easy in this framework.

For a multicomponent ARCH process, the intuition for the forecast can be understood from a graph of the weights wk(ΔT) as a function of the forecast horizon ΔT, as given in Figure 1. For short forecast horizons, the volatilities with the shorter time horizons dominate. As the forecast horizon gets larger, the weights of the short term volatilities decay while the weights of the longer time horizons increase.


Figure 2: Sum of the weights ∑k wk(ΔT) = 1 − w∞
Same parameters as in Figure 1. [Plot: sum of weights against the forecast horizon ΔT in days, logarithmic horizontal scale.]

The weight for a particular horizon τk peaks at a forecast horizon similar to τk; for example the curve corresponding to τ = 32 days has its maximum around this value. Figure 2 shows the sum of the volatility coefficients ∑k wk = 1 − w∞. This shows the increasing weight of the mean volatility as the forecast horizon gets longer. Notice that this behavior corresponds to our general intuition about forecasts: short term forecasts depend mainly on the recent past while long term forecasts need to use more information from the distant past. The nice feature of the multicomponent ARCH process is that the forecast weights are derived from the process equations, and that they have a similar content to the process equations (linear or affine, one or multiple time scales).

5 The induced volatility process

The multicomponent ARCH processes are stochastic processes for the return, in which the volatilities are convenient intermediate quantities. It is important to realize that the volatilities σk and σeff are useful and intuitive in formulating a model, but they can be completely eliminated from the equations. An important advantage of this class of process is that the forward variance v(t, t+ΔT) can be computed analytically. Going in the opposite direction, we want to eliminate the return, namely to derive the equivalent process equations for the dynamic of the forward variance induced by a multicomponent ARCH process. This will allow us to make contact with some models for the forward variance that are available in the literature and presented in the next section.


Equation (8) for σk can be rewritten as

dσ²k(t) = σ²k(t) − σ²k(t−δt)   (13)
        = (1−µk) [ −σ²k(t−δt) + ε²(t) σ²eff(t−δt) ]
        = (1−µk) [ σ²eff(t−δt) − σ²k(t−δt) + (ε²(t)−1) σ²eff(t−δt) ].

The equation can be simplified by introducing the annualized variances vk = (1y/δt) σ²k, veff = (1y/δt) σ²eff and a new random variable χ with

χ = ε² − 1   such that   E[χ(t)] = 0,   χ(t) > −1.   (14)

Assuming that the time increment δt is small compared to the time scales τk in the model, the following approximation can be used:

1 − µk = δt/τk + O(δt²).   (15)

In the present derivation, this expansion is used only to make contact with the usual form for processes, but no terms of higher order are neglected. Exact expressions are obtained by replacing δt/τk by 1 − µk in the equations below.

These notations and approximations allow the equivalent equations

dvk = (δt/τk) [ veff − vk + χ veff ],   (16a)
veff = ∑k wk vk + w∞ v∞.   (16b)

The process for the forward variance is given by

dv_ΔT = ∑k wk(ΔT) dvk   (17)

with dv_ΔT(t) = v(t, t+ΔT) − v(t−δt, t−δt+ΔT).

The content of (16a) is the following. The term δt (veff − vk)/τk gives a mean reversion toward the current effective volatility veff at a time scale τk. This structure is fairly standard, except for veff, which is given by a convex combination of all the variances vk. Then, the random term is unusual. All the variances share the same random factor δt χ/τk, which has a standard deviation of order δt instead of the usual √δt appearing in a Gaussian model.

An interesting property of this equation is to enforce positivity for vk through a somewhat unusual mechanism. Equation (16a) can be rewritten as

dvk = (δt/τk) [ −vk + (χ+1) veff ].   (18)


Because χ ≥ −1, the term (χ+1) veff is never negative, and as δt vk(t−δt)/τk is smaller than vk(t−δt), this implies that vk(t) is always positive (even for a finite δt). Another difference with the usual random process is that the distribution for χ is not Gaussian. In particular, if ε has a fat-tailed distribution—as seems required in order to have a data generating process that reproduces the properties of the empirical time series—the distribution for χ also has fat tails.

The continuum limit of the GARCH(1,1) process was already investigated by (Nelson 1990). In this limit, GARCH(1,1) is equivalent to a stochastic volatility process where the variance has its own source of randomness. Yet Nelson constructed a different limit from the one above, because he fixes the GARCH parameters α0, α1 and β1. The decay coefficient is given by α1 + β1 = µ and is therefore fixed. With µ = exp(−δt/τ), fixing µ and taking the limit δt → 0 is equivalent to τ → 0. Because the characteristic time τ of the EMA goes to zero, the volatility process becomes independent of the return process, and the model converges toward a stochastic volatility model. A more interesting limit is to take τ fixed and δt → 0, as in the computation above. Notice that the computation is done with a finite time increment δt; the existence of a proper continuum limit δt → 0 for a process defined by (16b) to (17) is likely not a simple question.

Let us emphasize that the derivation of the volatility process as induced by the ARCH structure involves only elementary algebra. Essentially, if the price follows an ARCH process (one or multiple time scales, with or without mean σ∞), then the volatility follows a process according to (16). The structure of this process involves a random term of order δt and therefore it cannot be reduced to a Wiener or Levy process. This is a key difference from the processes used in finance that were developed to capture the price diffusion.

The implications of (16) are important as they show a key difference between ARCH and stochastic volatility processes. This clearly has implications for option pricing, but also for risk evaluation. In a risk context, the implied volatility is a risk factor for any portfolio that contains options, and it is likely better to model the dynamic of the implied volatility by a process with a similar structure.

6 Market model for the variance

In the literature, the models for the implied volatility are dominated by stochastic volatility processes, essentially assuming that the implied volatility “has its own life”, independent of the underlying. In this vast literature, a recent direction is to write processes directly for the forward variance. Recent work in this direction includes (Buehler 2006; Bergomi 2005; Gatheral 2007). In this direction, we present here simple linear processes for the forward variance, and discuss the relation with multicomponent ARCH in the next section.

The general idea is to write a model for the forward variance,

v(t, t+ΔT) = G(vk(t); ΔT),   (19)

where G is a given function of the (hidden) random factors vk. In principle, the random factors can appear everywhere in the equation, say for example as a random characteristic time like τk. Yet Buehler has shown that strong constraints exist on the possible random factors, for example forbidding random characteristic times. In this paper, only linear models will be discussed, and therefore the random factors appear as variances vk.

The dynamics of the random factors vk are given by processes

dvk = µk(v) dt + ∑_{α=1}^{d} σ^α_k(v) dWα,   k = 1, …, n.   (20)

The processes have d sources of randomness dWα, and the volatility σ^α_k(v) can be any function of the factors.

As such, the model is essentially unconstrained, but the martingale condition (4) for the forward variance still has to be enforced. Through standard Ito calculus, the variance curve model together with the martingale condition leads to a constraint between G(v; ΔT), µ(v) and σ(v):

∂_ΔT G(v; ΔT) = ∑_{i=1}^{n} µi ∂_{vi} G(v; ΔT) + ∑_{i,j=1}^{n} ∑_{α=1}^{d} σ^α_i σ^α_j ∂²_{vi,vj} G(v; ΔT).   (21)

A given function G is said to be compatible with a dynamic for the factors if this condition is valid. The compatibility constraint is fairly weak, and many processes can be written for the forward variance that are martingales. As already mentioned, we consider only functions G that are linear in the risk factors. Therefore, ∂²_{vi,vj} G = 0, leading to first order differential equations that can be solved by elementary techniques. For this class of models, the condition does not involve the volatility σ^α_k(v) of the factor, which therefore can be chosen freely.

6.1 Example: one-factor market model

The forward variance is parameterized by

G(v1; ΔT) = v∞ + w1 e^{−ΔT/τ1} (v1 − v∞),   (22)

which is compatible with the stochastic volatility dynamic

dv1 = −(v1 − v∞) dt/τ1 + γ v1^β dW   for β ∈ [1/2, 1].   (23)

The parameter w1 can be chosen freely, and for identification purposes the choice w1 = 1 is often made. Because G is linear in v1, there is no constraint on β. The value β = 1/2 corresponds to the Heston model, β = 1 to the lognormal model. This model is somewhat similar to the GARCH process, with one characteristic time τ1, a mean volatility v∞, and the volatility of the volatility (vol-of-vol) γ. This model is not rich enough to describe the empirical forward variance dynamic, which involves multiple time scales.
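For the one-factor model, simulating the factor (23) and reading the curve off (22) takes only a few lines. The Python sketch below uses a daily Euler step with β = 1/2 (the Heston-like choice) and w1 = 1; all numerical values are illustrative assumptions, and the truncation at zero is only a guard for the discrete scheme.

import numpy as np

def one_factor_forward_curve(v1_0, v_inf, tau1, gamma, beta, horizons, n_steps=252, seed=2):
    """Euler simulation of dv1 = -(v1 - v_inf) dt/tau1 + gamma * v1^beta dW, eq (23),
    then the forward variance curve G(v1; Delta T) of eq (22) with w1 = 1."""
    dt = 1.0 / 252.0                       # daily step, times measured in years
    rng = np.random.default_rng(seed)
    v1 = v1_0
    for _ in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal()
        v1 += -(v1 - v_inf) * dt / tau1 + gamma * v1 ** beta * dW
        v1 = max(v1, 0.0)                  # truncation guard for the discrete scheme
    horizons = np.asarray(horizons, dtype=float)
    return v_inf + np.exp(-horizons / tau1) * (v1 - v_inf)

curve = one_factor_forward_curve(v1_0=0.04, v_inf=0.03, tau1=0.5, gamma=0.3,
                                 beta=0.5, horizons=[1 / 12, 0.25, 0.5, 1.0, 2.0])
print(np.sqrt(curve))   # forward volatility term structure, converging to sqrt(v_inf) at long horizons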

6.2 Example: two-factor market model

The linear model with two factors,

G(v; ΔT) = v∞ + w1 e^{−ΔT/τ1} (v1 − v∞) + [1/(1 − τ1/τ2)] ( −w1 e^{−ΔT/τ1} + (w1+w2) e^{−ΔT/τ2} ) (v2 − v∞)   (24)
         = v∞ + w1(ΔT) (v1 − v∞) + w2(ΔT) (v2 − v∞),   (25)

is compatible with the dynamic

dv1 = −(v1 − v2) dt/τ1 + γ v1^β dW1,   (26)
dv2 = −(v2 − v∞) dt/τ2 + γ v2^β dW2.

The parameters w1 and w2 can be chosen freely, and for identification purposes the choice w1 = 1 and w2 = 0 is often made. Notice the similarity of (25) with the Svensson parameterization for the yield curve.

The linear model can be solved explicitly for n components, but the ΔT dependency in the coefficients wk(ΔT) becomes increasingly complex. It is therefore not natural in this approach to create the equivalent of a long-memory model with multiple time scales.

7 Market models and options

Assuming a liquid option market, the implied volatility surface can be extracted, and from its backbone, the forward variance v(t, t+ΔT) is computed. At a given time t, given a market model G(vk(t); ΔT), the risk factors vk(t) are estimated by fitting the function G(ΔT) on the forward variance curve. It is therefore important for the function G(ΔT) to have enough possible shapes to accommodate the various forward variance curves. This estimation procedure for the risk factors gives the initial condition vk(t). Then, the postulated dynamics for the risk factors induce a dynamic for G, and hence for the forward variance.

Notice that in this approach, there is no relation with the underlying and its dynamic. For this reason, the possible processes are weakly constrained, and the parameters need to be estimated independently (say for example the characteristic times τk). Another drawback of this approach is to rely on the empirical forward variance curve, and therefore a liquid option market is a prerequisite.
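The estimation of the risk factors described above is a small least-squares problem for a linear model: with w1 = 1 and w2 = 0, the two-factor curve (24)-(25) is linear in (v1 − v∞, v2 − v∞) once τ1, τ2 and v∞ are given. The Python sketch below illustrates this fit; the forward variance quotes, time scales and v∞ in the example are made-up assumptions, not market data.

import numpy as np

def fit_two_factor(horizons, fwd_var, v_inf, tau1, tau2):
    """Estimate the factors (v1, v2) of eqs (24)-(25), with w1 = 1 and w2 = 0,
    by least squares on an observed forward variance curve."""
    dT = np.asarray(horizons, dtype=float)
    e1, e2 = np.exp(-dT / tau1), np.exp(-dT / tau2)
    A = np.column_stack([e1, (-e1 + e2) / (1.0 - tau1 / tau2)])   # the weights w1(dT), w2(dT)
    x, *_ = np.linalg.lstsq(A, np.asarray(fwd_var, dtype=float) - v_inf, rcond=None)
    return v_inf + x[0], v_inf + x[1]

# Hypothetical forward variance quotes (variance units) at a few horizons in years.
horizons = [0.25, 0.5, 1.0, 2.0, 5.0]
fwd_var = [0.050, 0.046, 0.042, 0.038, 0.034]
v1, v2 = fit_two_factor(horizons, fwd_var, v_inf=0.032, tau1=0.5, tau2=4.0)
print(f"v1 = {v1:.4f}, v2 = {v2:.4f}")   # initial conditions for the factor dynamics (26)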

Our choice of notations makes clear the formal analogy of the market model with the forecasts produced by a multicomponent ARCH process. Except for the detailed shapes of the functions wk(ΔT), the equations (12) and (25) have the same structure. They are however quite different in spirit: the vk are computed from the underlying time series in the ARCH approach, whereas in a market model approach, the vk are estimated from the forward variance curve obtained from the option market. In other words, ARCH leads to a genuine forecast based on the underlying, whereas the market model provides for a constrained fit of the empirical forward curve. Beyond this formal analogy, the dynamics for the risk factors are quite different, as the ARCH approach leads to the unusual (16a) whereas market models use the familiar generic Gaussian process in (20).

8 Comparison of the empirical implied, forecasted and realized volatilities

As explained in Section 4, a multicomponent ARCH process provides us with a forecast for the realized volatility, and the forecast is directly related to the underlying process and its properties. At a given time t, there are three volatilities (implied, forecasted and realized) for each forecast horizon ΔT. Essentially, the implied and forecasted volatilities are two forecasts for the realized volatility. In this section, we investigate the relationship between these three volatilities and the forecast horizon ΔT. When analyzing the empirical statistics and comparing these three volatilities, several factors should be kept in mind.

1. For short forecast horizons (ΔT = 1 day and 5 days), the number of returns in ΔT is small and therefore the realized volatility estimator (computed with daily data) has a large variance.

2. The forecastability decreases with increasing ΔT.


3. The forecast and implied volatilities are “computed” using the same information set, namely the history up to t. This is different from the realized volatility, computed using the information in the interval [t, t+ΔT]. Therefore, we expect the distance between the forecast and implied to be the smallest.

4. The implied volatility has some idiosyncrasies related to the option market, for example supply and demand, or the liquidity of the underlying necessary to implement the replication strategy. Similarly, an option bears volatility risk, and a related volatility risk premium can be expected. These particular effects should bias the implied volatility upward.

5. From the raw options and underlying prices, the computations leading to the implied volatility are complex, and therefore error prone. This includes dependencies on the original data providers. An example is given by the time series for CAC 40 implied volatility, where during a given period, the implied volatility above three months jumps randomly between a realistic value and a much higher value. This is likely created by quotes for the one-year option that are quite off the “correct” price (see Figure 3). Yet this data quality problem is inherent to the original data provider and the option market, and reflects the difficulty in computing clean and reliable implied volatility surfaces.

6. The options are traded for fixed maturity times, whereas the convenient volatility surface is given for constant time to maturity. Therefore, some interpolation and extrapolation needs to be done. In particular, the short times to maturity (one day, five days) need most of the time an extrapolation, as the options are traded at best with one expiry for each month. This is clearly a difficult and error prone procedure.

7. The ARCH-based forecasts are dependent on the choice of the process and the associated parameters.

8. As the forecast horizon increases, the dynamic of the volatility gets slower and the actual number of independent volatility points decreases (as 1/ΔT). Therefore, the statistical uncertainty on the statistics increases with ΔT.

Because of the above points, each volatility has some peculiarities, and therefore we do not have a firm anchor point to base our comparison. Given that we are on a floating ground, our goals are fairly modest. Essentially, we want to show that processes with one or two time scales are not good enough, and that the long-memory process provides for a very good forecast with an accuracy comparable to the implied volatility. The processes used in the analysis are I-GARCH(1), I-GARCH(2) with two sets of parameters, and LM-ARCH. The equations for the processes are given in Section 3, along with the values for the parameters.

Figure 3: Volatility time series for USD/EUR (top) and CAC 40 (bottom), six month forecast horizon. [Plots: annualized volatility in %, 2003 through 2006, comparing the realized, implied, I-GARCH(1), I-GARCH(2) and LM-ARCH volatilities.]

The best way to visualize the dynamic of the three volatilities is to use a movie of the σ[ΔT] time evolution. On a movie, the properties of the various volatilities, their dynamics and relationships are very clear. Unfortunately, the present analog paper does not allow for such a medium, and we have to rely on conventional statistics to present their properties.

The statistics are computed for two time series: the USD/EUR foreign exchange rate and the CAC 40 stock index. The ATM implied volatility data originate from JP Morgan Chase for USD/EUR and Egartech for the CAC 40 index; the underlying prices originate from Reuters. The time series for the volatilities are shown on Figure 3 for a six-month forecast horizon. The time series are fairly short (about six years for USD/EUR and four years for CAC 40). This clearly makes statistical inferences difficult, as the effective sample size is fairly small. On the USD/EUR panel, the lagging behavior of the forecast and implied volatility with respect to the realized volatility is clearly observed. For the CAC 40, the data sample contains an abrupt drop in the realized volatility at the beginning of 2003. This pattern was difficult to capture for the models with long term mean reversion. In 2005 and early 2006, the implied volatility data are also not reliable: first there are two “competing” streams of implied volatility at ∼12% and ∼18%, before a period at the end of 2005 where there is likely no update in the data stream. This shows the difficulty of obtaining reliable implied volatility data, even from a major data supplier.

For the statistics, in order to ease the comparison between the graphs, all the horizontal and vertical scales are identical, the color is fixed for a given forecast model, and the line type is fixed for a given volatility pair. The graphs are presented for the mean absolute error (MAE)

MAE(x, y) = (1/n) ∑_t | x(t) − y(t) |,   (27)

where n is the number of terms in the sum. Other measures of distance like root mean square error give very similar figures. The volatility forecast depends on the ARCH process. The parameters for the processes have been selected a priori to some reasonable values, and no optimization was done.
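For completeness, the distance (27) amounts to the following few lines of Python; the two synthetic series in the example are assumptions used only to show the call, not the paper's data.

import numpy as np

def mae(x, y):
    """Mean absolute error between two volatility series, eq (27)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.mean(np.abs(x - y))

# Synthetic example: two annualized volatility series in percent.
rng = np.random.default_rng(3)
forecast = 10.0 + rng.normal(0.0, 1.0, 500)
implied = forecast + rng.normal(0.5, 1.0, 500)   # a biased, noisy "implied" series
print(f"MAE(forecast, implied) = {mae(forecast, implied):.2f}%")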

The overall relationship between the three volatilities can be understood from Figure 4. The pair of volatilities with the closest relationship is the implied and forecasted volatilities, because they are built upon the same information set. The distance with the realized volatility is larger, with similar values for implied-realized and forecast-realized. This shows that it is quite difficult to assert which one of the implied and forecasted volatilities provides for a better forecast of the realized volatility. All the distances have a global U-shape form as a function of ΔT. This originates in points 1 and 2 above, and leads to a minimum around one month for the measures of distance. The distance is larger for shorter ΔT because of the bad estimator for the realized volatility, and larger for longer ΔT because of the decreasing forecastability.

Figure 5 shows the distances for given volatility pairs, depending on the process used to build the forecast. The forecast-implied distance shows clear differences between processes (left panels). The I-GARCH(1) process lacks mean reversion, an important feature of the volatility dynamic. The I-GARCH(2) process with parameter set 1 is handicapped by the short characteristic time for the first EMA (4 days); this leads to a noisy volatility estimator and subsequently to a noisy forecast. The same process with a longer characteristic time for the first EMA (16 days, parameter set 2) shows much improved performance up to a time horizon comparable to the long EMA (260 days). Finally, the LM-ARCH produces the best forecast. As the forecast becomes better (1 time scale → 2 time scales → multiple time scales), the distance between the implied and forecasted volatilities decreases.


Figure 4: MAE distances between volatility pairs for EUR/USD, grouped by forecast method
The vertical axis gives the MAE for the annualized volatility in %, the horizontal axis the forecast time interval ΔT in days. [Four panels: I-GARCH(1), I-GARCH(2) parameter set 1, I-GARCH(2) parameter set 2 and LM-ARCH; each panel shows the forecast-implied, forecast-realized and implied-realized distances.]


Figure 5: MAE distances between volatility pairs, grouped by pairs
The vertical axis gives the MAE for the annualized volatility in %, the horizontal axis the forecast time interval ΔT in days. [Four panels: EUR/USD forecast-implied, EUR/USD forecast-realized, CAC 40 forecast-implied and CAC 40 forecast-realized; each panel compares I-GARCH(1), I-GARCH(2) with parameter sets 1 and 2, and LM-ARCH.]


For EUR/USD, the mean volatility is around 10% (the precise value depending on the volatility and time horizon), and the MAE is in the 1 to 2% range. This shows that in this time to maturity range, we can build a good estimator of the ATM implied volatility based only on the underlying time series. The forecast-realized distance is larger than the forecast-implied distance (right panels), with the long memory process giving the smallest distance. The only exception is the I-GARCH(1) process applied to the CAC 40 time series, due to the particular abrupt drop in the realized volatility in early 2003. This shows the limit of our analysis due to the fairly small data sample. Clearly, to gain statistical power requires longer time series for implied volatility, as well as a cross-sectional study over many time series.

9 Conclusion

The ménage à trois between the forecasted, implied and realized volatilities is quite a complex affair, where each participant has its own peculiarities. The salient outcome is that the forecasted and implied volatilities have the closest relationship, while the realized volatility is more distant as it incorporates a larger information set. This picture is dependent to some extent on the quality of the volatility forecast: the multiscale dynamic of the long memory ARCH process is shown to capture correctly the dynamic of the volatility, while the I-GARCH(1) and I-GARCH(2) processes are not rich enough in their time scale structures. This conclusion falls in line with the RM2006 risk methodology, where the same process is shown to capture correctly the lagged correlation for the volatility.

The connection with the market model for the forward variance shows the parallel structure of the volatility forecasts provided by both approaches. However, their dynamics are very different (postulated for the forward volatility market models, induced by the ARCH structure for the multicomponent ARCH processes). Moreover, the volatility process induced by the ARCH equations is of a different type from the usual price process, because the random term is of order δt instead of the √δt used in diffusive equations. This emphasizes a fundamental difference between price and volatility processes. A clear advantage of the ARCH approach is to deliver a forecast based only on the properties of the underlying time series, with a minimal number of parameters that need to be estimated (none in our case, as all the parameters correspond to the values used in RM1994 and RM2006). This point brings us to a nice and simple common framework to evaluate risks as well as the implied volatilities of at-the-money options.

For the implied volatility surface, the problem is still not completely solved, as the volatility smile needs to be described in order to capture the full implied volatility surface. Any multicomponent ARCH process will capture some (symmetric) smile, due to the heteroskedasticity. Moreover, fat tail innovations will make the smile stronger, as the process becomes increasingly distant from a Gaussian random walk. Yet, adding an asymmetry in the smile, as observed for stocks and stock indexes, requires enlarging the family of processes to capture asymmetry in the distribution of returns. This is left for further work.

References

Bergomi, L. (2005). Smile dynamics II. Risk 18, 67–73.

Buehler, H. (2006). Consistent variance curve models. Finance and Stochastics 10, 178–203.

Dacorogna, M. M., U. A. Muller, R. B. Olsen, and O. V. Pictet (1998). Modelling short-term volatility with GARCH and HARCH models. In C. Dunis and B. Zhou (Eds.), Nonlinear Modelling of High Frequency Financial Time Series, pp. 161–176. John Wiley.

Engle, R. F. and T. Bollerslev (1986). Modelling the persistence of conditional variances. Econometric Reviews 5, 1–50.

Gatheral, J. (2007). Developments in volatility derivatives pricing. Presentation at “Global Derivatives”, Paris, May 23.

Lynch, P. and G. Zumbach (2003, July). Market heterogeneities and the causal structure of volatility. Quantitative Finance 3, 320–331.

Nelson, D. (1990). ARCH models as diffusion approximations. Journal of Econometrics 45, 7–38.

Poon, S.-H. (2005). Forecasting Financial Market Volatility. Wiley Finance.

Zumbach, G. (2004). Volatility processes and volatility forecast with long memory. Quantitative Finance 4, 70–86.

Zumbach, G. (2006). The RiskMetrics 2006 methodology. Technical report, RiskMetrics Group. Available at www.riskmetrics.com.

Zumbach, G. and P. Lynch (2001, September). Heterogeneous volatility cascade in financial markets. Physica A 298(3-4), 521–529.


Inflation Risk Across the Board

Fabien Couderc, RiskMetrics Group

[email protected]

Inflation markets have evolved significantly in recent years. In addition to stronger issuance programs of inflation-linked debt from governments, derivatives have developed, allowing a broader set of market participants to start trading inflation as a new asset class. These changes call for modifications of risk management and pricing models. While the real rate framework allowed us to apply the familiar nominal bond techniques on linkers, it does not provide a consistent view with inflation derivative markets, and limits our ability to report inflation risk. We thus introduce in detail the concept of Break-Even Inflation and develop associated pricing models. We describe various adjustments for taking into account indexation mechanisms and seasonality in realized inflation. The adjusted break-even framework consolidates views across financial products and geography. Inflation risk can now be explicitly defined and monitored as any other risk class.

1 Introduction

Even though inflation is an old topic for academics, interest from the financial community only began recently. This is somewhat due to historical reasons. Inflation was and is a social and macroeconomic matter, and has consequently been a concern for economists, politicians and policy makers, not for financial market participants. High inflation (and of course deflation) being perceived as a bad signal for the health of an economy, efforts concentrated on the understanding of the main inflation drivers rather than on the risks inflation represents, especially for financial markets. One of the most famous sentences of Milton Friedman, “Inflation is always and everywhere a monetary phenomenon,” suggests that monetary fluctuations (thus, money markets) are meaningful indicators of inflation risks we might face in financial markets. While this claim is supported by most economic theories, the monetary explanation cannot be transposed in the same way on financial markets. Persistent shocks on money markets effectively determine a large fraction of long-term inflation moves, but short- and mid-term fluctuations of inflation-linked assets bear another risk which has to be analyzed in its own right.

Quantifying inflation risk on financial markets is today a major concern. The markets developed quickly over the last five to ten years and we expect them to continue to evolve. On the one hand, after several years of low inflation across industrialized countries, signals of rising inflation have appeared. Rising commodity and energy prices are typical examples. On the other hand, more and more players are coming to inflation markets both on the supply and demand sides. Two decades ago, people considered equities as a good hedge against inflation, but equities appear to be little correlated with inflation, and demand for pure inflation hedges has dramatically increased. Pension funds and insurers are the most active and really triggered new attention. Today, they not only face pressure from their rising liabilities but also from regulators.

Even though inflation is not a new topic for us, we thus need to brush the cobwebs off the techniques used to understand and quantify inflation risk, examining the new perspectives and problems offered by evolving inflation markets. In this paper, we first explore how inflation is measured. While everybody can agree on what a rate of interest is just by looking at the evolution of a savings account, inflation hits different people differently and depends on consumption habits. We highlight seasonality effects and their impact on the volatility of short-term inflation. We then survey the structure of the inflation-linked asset class. We are left with the necessity to consider inflation risk in a new way. For that purpose, we come back to the well-known Irving Fisher paradigm linking real and nominal economies, and define in a clean way the concept of break-even inflation. We then describe various adjustments required by the indexation mechanisms of inflation markets so as to make break-even a suitable quantity for risk management. Adjusted break-evens allow us to consistently consider and measure inflation risk across assets and markets. Finally, we illustrate the methodology through actual data over the last years, considering break-evens, nominal and real yields.

2 Measuring economic inflation

The media relay inflation figures on a regular basis. It is important to understand where numbers come from and their implications. An inflation rate is measured off a so-called Consumer Price Index (CPI) which varies significantly across countries and through time. The annual inflation rate is commonly reported. It constitutes the percentage change in the index over the prior year. We underline hereafter that this is a meaningful way to measure economic inflation. Exploiting economic inflation would however be a challenging task for financial markets which require higher frequency data. The monthly change on most CPIs is more suitable but should be considered with care, taking into account seasonality in the index.


2.1 Consumer Price Indices

A consumer price index is a basket containing household expenditures on marketable goods and

services. The total value of this index is usually scaled to 100 for some reference date. As an

example, consider the Harmonized Index of Consumer Prices ex-tobacco (HICPx). This index applies

to the Euro zone, and is defined as a weighted average of local HICPx. The weights across countries

are given by two-year lagged total consumption, and the composition of a local index is revised on a

yearly basis. While the country composition has been generally stable, we notice significant changes

with the entry of Greece in 2001 and with the growth of Ireland. The contribution of Germany has

decreased from 35% to 28%.

The composition of the HICPx across main categories of expenditures has also significantly evolved

over the last ten years.1 We can observe that expenditures for health and insurance have grown by 6%. We would have also expected a large growth in the role of housing expenditures as real estate and energy take a greater share of the budget today, but this category stayed nearly constant (from 16.1% in 1996 to 15.9% in 2007). This is a known issue, but no corrective action has been taken yet for fear of a jump in short-term inflation and an increase in the volatility of the index.

These considerations point out that inflation as measured through CPI baskets is a dynamic concept,

likely to capture individual exposure to changes in purchasing power with a lag.2 To some extent, the

volatility of market-implied inflation contains the volatility in the definition of the index itself.

2.2 Measure of realized inflation

Beyond the issue of how representative any CPI is, measuring an inflation rate off a CPI raises timing issues. We define the annualized realized inflation rate $i(t,T)$ from $t$ to $T$ corresponding to an index $CPI(t)$ as

$$i(t,T) = \sqrt[T-t]{\frac{CPI(T)}{CPI(t)}} - 1. \qquad (1)$$

Subtleties in the way information on the index is gathered and disclosed are problematic for financial

markets. First, CPI values are published monthly or quarterly only while markets trade on a daily

basis. Second, a CPI is officially published with a lag, because of the time needed for collecting prices

of its constituents. Finally, the data gathering process implies that a CPI value can hardly be assigned

1 Data on the index and its composition can be found on the Eurostat web site epp.eurostat.ec.europa.eu.
2 The most famous consequence of this lag is known as the Boskin effect.


to a precise date t; a somewhat arbitrary date must nevertheless be settled on. By convention, this reference date is the first

day of the month for which the collecting process started.

As an illustration, let us consider the HICPx with the information available on 04 October 2007. The

last published value was 104.19 on 17 September 2007. This value corresponds to the reference date

01 August 2007. The previous value, 104.14, published in August, corresponds to 01 July 2007. On

October 4, we could thus compute the annualized realized inflation rate corresponding—by convention—to 01 July 2007 through 01 August 2007: $i = \left(\frac{104.19}{104.14}\right)^{12} - 1 \simeq 0.58\%$, but the inflation rate between August 1 and October 4 was still unknown.
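A quick numerical check of this example—and of the annual figure quoted in the next paragraph—can be sketched from definition (1); the CPI values are those given in the text.

```python
# Check of the monthly (annualized) and year-on-year inflation figures quoted in the text.
monthly = (104.19 / 104.14) ** 12 - 1      # twelfth power annualizes the one-month change: ~0.58%
annual = 104.19 / 102.46 - 1               # change over the prior year: ~1.69%
print(f"{monthly:.2%}  {annual:.2%}")
```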

Such a low inflation figure, 0.58%, sounds unusual. The media commonly report inflation over the

past year. On October 04, we could thus reference an inflation of $\frac{104.19}{102.46} - 1 \simeq 1.69\%$ in the Euro zone, 102.46 being the HICPx value for the reference date 01 August 2006. The top graph in Figure 1

shows the evolution of these two inflation measures. The annual inflation rate has been relatively

stable in the Euro zone, certainly as the result of controlled monetary policies. The high volatility in the monthly inflation rate is striking, with many negative values. Of course, we cannot speak of deflation in June (−0.31%) and inflation in July (0.58%). The repeating peaks and troughs, noticeably pronounced over the last years, and the nature of some constituents of the CPI, point to a systematic pattern (seasonality) in monthly data which has to be filtered out.

2.3 Seasonality

Seasonality in demand for commodities and energy is a well established phenomenon. There exist

several methods which extract seasonality from a time series. Typically, the trend is estimated through autoregressive (AR) processes and seasonal patterns through moving averages (MA). We apply here a proven model developed by the US Bureau of the Census for economic time series, known as the X-11 method.3 We set a multiplicative model on the CPI itself such that

$$CPI(t) = S(t)\,\overline{CPI}(t) \qquad (2)$$

where $\overline{CPI}(t)$ represents the seasonally adjusted CPI, and $S(t)$ the seasonal pattern.
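The X-11 routine itself is too involved to reproduce here; as a rough stand-in, a simple multiplicative decomposition in the spirit of (2) can be sketched as follows. The use of statsmodels and the monthly `cpi` series are assumptions for illustration, not part of the paper's methodology.

```python
# A minimal sketch of a multiplicative seasonal adjustment in the spirit of (2).
# This is NOT the X-11 procedure; it uses the simpler moving-average decomposition
# from statsmodels. `cpi` is assumed to be a monthly pandas Series of index levels.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

def seasonally_adjust(cpi: pd.Series) -> pd.DataFrame:
    """Split CPI(t) into a seasonal factor S(t) and the adjusted series CPI(t)/S(t)."""
    decomp = seasonal_decompose(cpi, model="multiplicative", period=12)
    seasonal = decomp.seasonal            # S(t), averaging to roughly 1 over a year
    adjusted = cpi / seasonal             # seasonally adjusted CPI
    return pd.DataFrame({"S": seasonal, "CPI_adj": adjusted})
```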

Returning to monthly estimates of the inflation rate, the top graph in Figure 1 shows that most of the volatility in monthly inflation measured off the unadjusted CPI time series is due to seasonality. The seasonally adjusted monthly inflation rate for July is about 1.57% and for June about 1.72%, figures

3In practice, the time series is decomposed into three components: the trend, the seasonality and a residual. The X-11

method consists in a sequential and iterative estimation of the trend and of the seasonality by applying moving averages in

time series and in cross section for each month. See for instance (Shishkin, Young, and Musgrave 1967) for further details.


Figure 1
Realized inflation implied by HICP ex-tobacco
Annualized inflation reported on monthly and annual CPI variations, with and without seasonality adjustments. Seasonality component S(t) estimated using the X-11 method.
[Top panel: annualized inflation rate (%), 1997–2007, showing the monthly, monthly adjusted and annual series. Bottom panel: seasonal component (%), 1997–2007.]

in line with the annual estimate. This proves that seasonality has to be an important ingredient when

measuring inflation risk on financial markets, exactly as it is for commodities.

The bottom graph of Figure 1 presents the evolution of the seasonal pattern. The relative impact of

seasonality on the HICP ex-tobacco has doubled over the last ten years. The effect is more pronounced in winter. Even though the pattern is different, it is interesting to notice that the same comments apply to the US CPI (and indeed to all developed countries). We will show in Section 5 that modeling a stochastic seasonal pattern is unnecessary.


3 Structure of the market

Inflation-linked financial instruments are now established as an asset class, but the growth of this

market4 is a new phenomenon mainly driven by an increased demand. Conventional wisdom used to

be that inflation-linked liabilities could be hedged with equities, producing a low demand for pure

inflation products. The past ten years have demonstrated the disconnection between inflation figures

and the evolution of equity markets, and highlighted the necessity to develop inflation markets.

We examined various measures of correlation between the main equity index and the corresponding

CPI in the US, the UK and in France.5 Our results were between -20% and 40%, concentrated around

0% to 20%. An institution with inflation-indexed liabilities could not be satisfied with even the best

case, a 40% hedging ratio. Demand for inflation products has thus exploded, reinforced by new

regulations (the International Financial Reporting Standards, IFRS, and Solvency II in Europe).

Restructuring activities from banks have also created a liquid derivatives market. This evolution of

inflation markets has shifted the way inflation-linked products and inflation risk have to be assessed.

While in the early stages, the analogy with nominal bonds was convenient for the treatment of

inflation-linked bonds, this is no longer true with derivatives.

3.1 The inflation-linked bond market

Governments structured inflation-linked bonds (ILB) and created these markets as alternative issuance

programs to standard treasury notes. This supply is mainly driven by cost-of-funding concerns. The

risk premium that governments should grant to investors on ILB should be slightly lower than on

nominal bonds. In addition, governments have risk diversification incentives on their debt portfolio.

Both outgoing flows (such as wages) and incoming flows (such as taxes) are directly linked to inflation. As a consequence, some proportion of ILB should be issued; importantly for the

development of these markets, governments committed to provide consistent and transparent issuance

plans several years ahead.

Inflation-linked bonds are typically bonds with coupons and principal indexed to inflation. The

coupon is fixed in real terms and inflated by the realized inflation from the issuance of the bond (or

4 Bloomberg reports a tenfold increase in the HICPx-linked market, to over €100 million outstanding, since 2001.
5 We considered monthly versus yearly returns and realized inflation, using expanding, rolling and non-overlapping windows. Rolling windows show a high volatility in the correlation itself, though the statistical significance of these results is pretty low. Introducing a lag does not modify the outputs.


Figure 2

Cash flow indexation mechanism

Inflation protected period for holding a cash flow from t to T. L stands for the indexation lag, and $t_l$ for

the last reference date for which the CPI value has been published.

from a base CPI value).6 The indexation mechanism is designed to overcome the problems associated

with the CPI, namely its publication frequency and its publication lag. This is achieved by introducing

a so-called indexation lag L.7 Typical lag periods range from two to eight months.

Flows are indexed looking at the CPI value corresponding to a reference date L months into the past. With such a scheme, the inflation protected period and the holding period do not exactly match, as illustrated in Figure 2. The nominal coupon $C_N$ to be paid at time $T$ to an investor who acquired this right at time $t$ on a real coupon $C_R$ is then given by $C_N = C_R\,\frac{CPI(T-L)}{CPI(t-L)}$. The lag is set such that the base CPI value $CPI(t-L)$ is known at time $t$, provided that an interpolation method is also specified to compute this value from the CPI values of the closest reference dates. This mechanism is mandatory for the calculation of accrued interest.

The approach yields a simple pricing formula—Section 4.2 justifies it—in terms of real discount rates $R_R(t,\cdot)$. Denoting $t_1,\ldots,t_N$ the coupon payment dates and $CPI_0$ the value of the reference CPI at the issuance of the bond (typically, $CPI_0 = CPI(t_0-L)$ where $t_0$ is the issuance date), we get

$$PV(t) = \frac{CPI(t-L)}{CPI_0}\left(\sum_{i;\,t<t_i}\frac{C_R}{(1+R_R(t,t_i))^{t_i-t}} + \frac{100}{(1+R_R(t,t_N))^{t_N-t}}\right). \qquad (3)$$

This equation implies that in real terms, an ILB can be considered as a plain vanilla nominal bond

6 This structure is now standard, though other structures with capital-only or coupon-only indexation can still be found.
7 Depending on the context, L should be understood as a number of months or as a fraction of a year.


with fixed coupon $C_R$, but substituting real rates for nominal rates. (In fact, this equation effectively

defines the real rates.) Risk is represented by the daily variations in real rates—or real yields—since

the real rate curve is the only unknown from today to tomorrow. Surprisingly, we cannot isolate

inflation risk here even though nominal rates, real rates and inflation risks are connected quantities.

The standard convention is to quote ILB prices in real terms, that is, to quote the real coupon bond which then must be inflated by the above index ratio. For instance, on December 3, the French OAT€i 2.25 7/25/2020 linked to the HICPx is quoted with a real clean price of 103.177. Using the appropriate index ratio of 1.0894 leads to a clean PV of €112.401.8
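A minimal sketch of the real-rate pricing formula (3) and of the quotation convention follows; the dictionary of real zero-coupon rates is an illustrative assumption, and the last line only reproduces the index-ratio step of the OAT€i example above.

```python
# Sketch of equation (3): price of a linker per 100 real notional, quoted in real terms
# and then inflated by the index ratio CPI(t-L)/CPI_0. All inputs are assumptions.
def linker_pv(real_coupon, times, real_zero_rates, index_ratio, redemption=100.0):
    real_price = sum(real_coupon / (1.0 + real_zero_rates[t]) ** t for t in times)
    real_price += redemption / (1.0 + real_zero_rates[times[-1]]) ** times[-1]
    return index_ratio * real_price      # nominal (cash) clean PV

# Quotation convention only: a real clean price of 103.177 times an index ratio
# of 1.0894 gives a clean PV of about 112.40, as in the OAT(euro)i example.
print(103.177 * 1.0894)
```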

3.2 Development of derivatives

Liability hedging sparked off this boom. Pension funds are largely exposed to inflation moves:

indirectly, when retirement pensions are partially linked to inflation through indexation to final wages, and directly, when pensions are explicitly linked to a CPI.9

Inflation swap markets have been developed to meet more precisely the needs of liability-driven investments. Stripping ILB into zero-coupon bonds, financial intermediaries are now able to propose customized inflation-protected cashflow schedules. In its simplest form, a zero-coupon inflation swap is written on the inflation rate from a given CPI. The inflation payer pays at the maturity $T$ of the contract the increase in the index $\frac{CPI(T-L)}{CPI(t-L)}$ times a predefined notional. The protection buyer pays a fixed rate on the same notional. The fixed rate is determined at inception such that the value of the swap is null.10 With inflation swaps—and similar derivatives—this fixed rate is the quantity at risk, and mainly expresses views on expected inflation. By convention, inflation swaps are quoted through this fixed rate for a set of full-year maturities. Thus, the natural risk factors for inflation swaps are the quoted swap rates, rather than the real rates from before. In illiquid markets, we could accept different

risk factor definitions, but today, with liquid markets and significant intermediation activity, we

require a consistent view. Break-even inflation as defined later in this paper fills the gap while

accounting for the intrinsic features of consumer price indices.
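The cash flows of the zero-coupon swap just described can be sketched as follows. The compounding of the fixed leg follows the common zero-coupon market convention, which is an assumption here rather than something stated in the text.

```python
# Sketch of the net amount received at maturity by the inflation receiver (the
# protection buyer's counterparty on the fixed leg pays the compounded quoted rate).
def zc_swap_net_payoff(notional, fixed_rate, cpi_base, cpi_final, years):
    inflation_leg = notional * (cpi_final / cpi_base - 1.0)      # lagged CPI growth
    fixed_leg = notional * ((1.0 + fixed_rate) ** years - 1.0)   # quoted swap rate
    return inflation_leg - fixed_leg
```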

8 This ratio is computed taking into account a three-month lag, a linear interpolation method and a three-day settlement period. The HICPx values can be downloaded from the Eurostat web site.
9 The tendency in Europe is toward the direct linkage of pensions. Previously, most pensions were indirectly linked through the average wage over the last years before retirement.
10 Such a structure is very effective for liability hedging: the whole capital can still be invested in risky assets while ILB lock money in low returns. Among the possible swaps, one would prefer year-on-year inflation swaps, which pay inflation yearly. This strongly limits the amount of cash which has to be paid at maturity.


Figure 3

Inflation Markets

3.3 More players with different incentives

Figure 3 depicts the global structure of inflation markets. Pension funds, insurers, corporates and

retail banks look both for protection and portfolio diversification. Banks stand across the board, as

intermediaries competing with hedge funds on inflation arbitrages, and as buyers looking for money

market investments and diversification. On the supply side, utilities and large corporates issue inflation swaps or ILB so as to reduce their cost of funding. The inflation swap market has indeed consistently granted a small premium (0–50bp) to inflation payers. In addition to standard

factors—credit and liquidity risk—this premium contains restructuring fees and a reward for a small

portion of inflation risk which cannot be transferred from bond markets to swap markets because of

indexation rules.


4 The concept of break-even inflation

Ideally, we would like to define and use expected inflation as a consistent risk factor on both markets.

Given that inflation-linked assets are written in terms of CPI ratios, we might believe that expected

inflation can be extracted from inflation swap quotes, or from ILB prices. Unfortunately, measuring expected inflation is challenging and cannot be done without an explicit dependency on the nominal interest rate curve. Even though prices of inflation products can be observed, we show that no model-independent inflation expectations can be derived.

We previously defined the realized inflation measure $i(t,T)$ in (1) and can thus define the expected inflation $I(t,T)$ as

$$I(t,T) = E_t\left[\sqrt[T-t]{\frac{CPI(T)}{CPI(t)}}\right] - 1, \qquad (4)$$

where the expectation is taken under the physical measure. Given that future inflation is uncertain, a

premium is necessarily embedded into the above expectation.11 But making an assumption about this

premium is not sufficient either. Using standard concepts of asset pricing theory, we demonstrate

hereafter that one can extract forward CPI values—or expectations of CPI values under corresponding

nominal forward measures—only. These forward CPI values are the main inputs entering into the

break-even inflation concept.

From now on, we leave aside the annualization, consider perfect indexation (L = 0) and assume that

the realized CPI is observable on a daily basis. We remark that we would be able to observe the

expected inflation if we could observe the expected CPI:

$$E_t\left[\frac{CPI(T)}{CPI(t)}\right] = \frac{1}{CPI(t)}\,E_t\left[CPI(T)\right]. \qquad (5)$$

4.1 The exchange rate analogy

Let us first call the nominal world our physical “home” world: in this world, any good or amount is expressed in a monetary unit which is a currency (say, US$). We can consider other worlds that we would still describe as nominal worlds but in which the monetary unit is another currency (say, €), the “foreign” worlds. In a complete market with absence of arbitrage opportunities, a unique measure

exists to value all goods and assets in our nominal “home” world, the risk neutral measure. Through a

11There is no consensus in the academic literature about the sign and the magnitude of this risk premium. In the US,

recent papers evaluated the premium up to 50bp while the European premium might be insignificant. See for instance

(Buraschi and Jiltsov 2005) and (Hordahl and Tristani 2007).


change of numeraire, typically given by the exchange rate dynamics, pricing can be done in any

currency using the risk neutral measure in the “foreign” world.

It is common to consider inflation analogously to this setup by defining a CPI basket as a new

monetary unit. We refer to this as a basket unit as opposed to a dollar unit. The world where all goods

and amounts are expressed in basket units is the real world. Because of the completeness argument,

through a change of numeraire, pricing of real assets can be done equivalently in the real economy or in the nominal economy. The change of numeraire is given by the CPI itself. As with exchange rates, $CPI(t)$ is the spot exchange rate to convert one basket unit in the real economy into the nominal economy, that is, into dollars.

4.2 Standard pricing of linkers

As we discussed, linkers are traditionally priced in the real world through real coupon bonds. Let us

consider the simplest linker, a perfectly indexed inflation-linked discount bond (ILDB)—with price

$P(t,T;t)$ in dollars—issued at time $t$, which gives the right to receive a cash flow of $\frac{CPI(T)}{CPI(t)}$ at $T$. A linker can obviously be decomposed into a deterministic linear combination of ILDB matching the coupon payment dates. We further introduce the discount bonds $B_{N,N}(t,T)$ and $B_{R,R}(t,T)$ at date $t$ with maturity date $T$ in, respectively, the nominal world and the real world and in their own monetary units. In other words, these bond prices are obtained under the risk neutral measures $P^*_N$ and $P^*_R$ of the nominal and real worlds respectively:

$$B_{x,x}(t,T) = E^{P^*_x}_t\left[\exp\left(-\int_t^T r_x(s)\,ds\right)\right], \qquad (6)$$

where $r_x(s)$ is the short rate in the corresponding world at time $s$. We can express the price of this real discount bond in nominal terms $B_{N,R}(t,T)$ using the spot CPI as $B_{N,R}(t,T) = CPI(t)\,B_{R,R}(t,T)$. This does not correspond to the price of an investment paying in our “home” world since it settles to one CPI unit at time $T$: clearly, $\frac{1}{CPI(t)}B_{N,R}(t,T) \neq P(t,T;t)$ as they do not provide the payoff in the same units. This is illustrated in Figure 4 through the black and the red arrows.

Though pricing of ILDB can be done using expectations on future CPIs (this is developed in the next section), a straightforward observation yields (3) and a pricing model in terms of real rates. At time $t_0$, paying $P(t_0,T;t_0)$ dollars for one ILDB issued at $t_0$ and maturing at $T$ yields a payoff of $\frac{CPI(T)}{CPI(t_0)}$ dollars at maturity. The same payoff can be locked in by investing $B_{R,R}(t_0,T)$ dollars in the real world—thus paying $\frac{1}{CPI(t_0)}B_{R,R}(t_0,T)$ real units for $\frac{1}{CPI(t_0)}$ real discount bonds maturing at $T$—and converting the payoff back into dollars at maturity. This is shown in Figure 5, where we explicitly mark dollar flows.


Figure 4

ILDB and real discount bond prices in the nominal world

Figure 5

Investment flows for replicating an ILDB


Since this is a self-financing replicating strategy which yields the same payoff in all states of nature at $T$, in absence of arbitrage opportunities, the price of the ILDB at time $t$ such that $t_0 \leq t \leq T$ is equal to the value of the replicating strategy, and thus given by

$$P(t,T;t_0) = \frac{CPI(t)}{CPI(t_0)}\,B_{R,R}(t,T) = \frac{1}{CPI(t_0)}\,B_{N,R}(t,T) \;\$. \qquad (7)$$

Do note the following implication: at issuance, the dollar value of the ILDB is exactly equal to the

real discount bond price, as if a CPI unit were worth $1. This justifies (3), and this simple pricing model advocates for the use of real rates as risk factors.

4.3 Expectations of future CPI values

4.3.1 The forward CPI value

Let us derive the forward CPI value which will be the building block of break-evens. Since we ultimately have to price instruments in the nominal world—this is our “home” world—we consider the forward price $F_{N,R}(t,T_1,T_2)$ of a real discount bond delivered at $T_1$ and maturing at $T_2$, expressed in nominal terms. Using the exchange rate analogy, we define the forward CPI value $F_{CPI}(t,T)$ at $t$ for delivery date $T$ as the forward price when $T_1 = T_2 = T$: $F_{CPI}(t,T) = F_{N,R}(t,T,T)$. In absence of arbitrage opportunities (AOA), we obtain

$$F_{N,R}(t,T_1,T_2) = \frac{CPI(t)\,B_{R,R}(t,T_2)}{B_{N,N}(t,T_1)}. \qquad (8)$$

Notice that given a set of nominal discount bond prices and a set of real discount bond prices in their

respective units, the values of forward CPI are known.

4.3.2 Link with CPI expectations

In our complete market, the AOA condition implies that at time t, the following condition should hold:

$$B_{N,N}(t,T) = E^{P^*_R}_t\left[\frac{CPI(t)}{CPI(T)}\exp\left(-\int_t^T r_R(s)\,ds\right)\right], \qquad (9)$$

since investing in the nominal world should be equivalent to investing in the real world with an initial nominal amount and converting the output back to the nominal world (see Figure 6). The standard relationship between risk neutral $P^*$ and forward measures $P^T$ applied to the real world leads to

$$B_{N,N}(t,T) = CPI(t)\,B_{R,R}(t,T)\,E^{P^T_R}_t\left[\frac{1}{CPI(T)}\right]. \qquad (10)$$


Figure 6

Non-arbitrage condition between real and nominal worlds

Combining (8) and (10) implies that

$$\frac{1}{F_{CPI}(t,T)} = E^{P^T_R}_t\left[\frac{1}{CPI(T)}\right]. \qquad (11)$$

Considering the AOA condition from the real world leads to a similar equation,

$$B_{R,R}(t,T) = E^{P^*_N}_t\left[\frac{CPI(T)}{CPI(t)}\exp\left(-\int_t^T r_N(s)\,ds\right)\right] \;\Rightarrow\; F_{CPI}(t,T) = E^{P^T_N}_t\left[CPI(T)\right]. \qquad (12)$$

This shows that the expected CPI value $E_t[CPI(T)]$ cannot be directly observed. Of course, assuming the dynamics for the CPI itself, its interactions with nominal rates and the shape of the inflation risk premium, we could derive the exact relationship between the forward CPI and the expected CPI. In particular, if the CPI dynamics and nominal rate dynamics were independent, $F_{CPI}(t,T)$ would be

equal to the expected CPI under the nominal risk neutral measure.

4.4 The break-even inflation

The above equations actually define the quantity that we can extract—free of modeling bias—which

we refer to as the zero-coupon break-even inflation (BEI). In discrete annual compounding, it can be characterized by

$$BEI(t,T) = \sqrt[T-t]{\frac{F_{CPI}(t,T)}{CPI(t)}} - 1, \qquad (13)$$

or in continuous compounding,

$$BEI_c(t,T) = \frac{1}{T-t}\log\left(\frac{F_{CPI}(t,T)}{CPI(t)}\right). \qquad (14)$$


Notice that because of (8) and (10), the definition (13) is equivalent to

$$BEI(t,T) = \frac{1+R_N(t,T)}{1+R_R(t,T)} - 1, \qquad (15)$$

where $R_R(t,\cdot)$ and $R_N(t,\cdot)$ are respectively the real and the nominal zero-coupon rates.
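A small sketch ties (8), (13) and (15) together: given nominal and real zero-coupon rates for the same horizon, the forward CPI and the break-even follow directly. The numerical inputs are illustrative assumptions only.

```python
# Forward CPI from (8) with T1 = T2 = T, and break-even from the Fisher-type relation (15).
def forward_cpi(cpi_spot, r_nominal, r_real, years):
    b_nn = (1.0 + r_nominal) ** -years     # nominal discount bond B_NN(t,T)
    b_rr = (1.0 + r_real) ** -years        # real discount bond B_RR(t,T)
    return cpi_spot * b_rr / b_nn

def break_even(r_nominal, r_real):
    return (1.0 + r_nominal) / (1.0 + r_real) - 1.0

# Example: 4.0% nominal and 1.8% real 10-year rates give a BEI of about 2.16%.
print(break_even(0.040, 0.018))
```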

Equation (15) is the well known Fisher equation, with break-evens substituted for expected inflation.

However, with stochastic inflation (equivalently, future CPI values) the relationship is verified by

break-evens only. Equation (10) provides more insight on the components included in the break-even

inflation: the complete Fisher equation involving the expected inflation $I(t,T)$ can be written as12

$$(1+R_N(t,T)) = (1+R_R(t,T))(1+BEI(t,T)) = (1+R_R(t,T))(1+I(t,T))(1+\pi(t,T))(1+v(t,T)) \qquad (16)$$

in which interest rates and CPI dynamics provide expressions for $\pi(t,T)$—which contains the inflation risk premium—and $v(t,T)$—the correlation correction between interest rates and the CPI dynamics, with a convexity adjustment, depending on the model.

4.5 Pricing in the nominal world only

Using a zero-coupon break-even curve and a nominal zero-coupon interest rate curve, we can now

rely on a new pricing model for linkers. This new framework enables the explicit modeling of

inflation risk and standard interest rate risk. An inflation-linked bond can indeed be modeled as a

stochastic coupon bond. The price $P(t,T;t_0,C_R)$ at time $t$ of a linker maturing at $T$, issued at $t_0$ with a real coupon of $C_R$ can indeed be computed under the risk neutral measure $P^*_N$ through

$$P(t,T;t_0,C_R) = E^{P^*_N}_t\left[\sum_{i;\,t<t_i}\frac{CPI(t_i)}{CPI(t_0)}\,C_R\,\exp\left(-\int_t^{t_i} r_N(s)\,ds\right) + \frac{CPI(T)}{CPI(t_0)}\,100\,\exp\left(-\int_t^{T} r_N(s)\,ds\right)\right] \qquad (17)$$

$$= \frac{CPI(t)}{CPI(t_0)}\left(\sum_{i;\,t<t_i} E^{P^{t_i}_N}_t\left[\frac{CPI(t_i)}{CPI(t)}\right] C_R\,B_{N,N}(t,t_i) + E^{P^{T}_N}_t\left[\frac{CPI(T)}{CPI(t)}\right] 100\,B_{N,N}(t,T)\right) \qquad (18)$$

$$= \frac{CPI(t)}{CPI(t_0)}\left(\sum_{i;\,t<t_i} C_R\left(\frac{1+BEI(t,t_i)}{1+R_N(t,t_i)}\right)^{t_i-t} + 100\left(\frac{1+BEI(t,T)}{1+R_N(t,T)}\right)^{T-t}\right) \qquad (19)$$

where $P^s_N$ is the forward measure for time $s$ in the nominal world. Similarly, one can show that the

fixed rate of an inflation swap is equal to the break-even inflation for the same horizon.
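Equation (19) lends itself to a short sketch: a linker is priced off a nominal zero-coupon curve and a break-even curve only. The curve dictionaries and the index ratio are illustrative assumptions.

```python
# Sketch of equation (19). `nominal` and `bei` map each coupon time (in years from t)
# to R_N(t,t_i) and BEI(t,t_i); index_ratio stands for CPI(t)/CPI(t0).
def linker_price_nominal(real_coupon, times, nominal, bei, index_ratio, redemption=100.0):
    price = 0.0
    for i, ti in enumerate(times):
        growth = ((1.0 + bei[ti]) / (1.0 + nominal[ti])) ** ti   # discounted CPI growth factor
        cashflow = real_coupon + (redemption if i == len(times) - 1 else 0.0)
        price += cashflow * growth
    return index_ratio * price
```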

12 This can be derived from Ito's lemma applied to the dynamics of the CPI, and nominal or real rates.


5 Adjusted break-evens as risk factors

Importantly, the methodology presented above does not depend on perfect indexation. The indexation

lag and the publication lag can be taken into account through slight adjustments in the definition of

the break-even inflation and in pricing formulas. We now detail those adjustments, beginning with

seasonality. We then discuss two types of break-evens that can be used in a risk context. The second

type of BEI is motivated by requirements of homogeneity and portability, two essential characteristics

for a consistent evaluation of inflation risk across both asset classes and countries.

5.1 Including seasonality

We showed in Section 2.3 that the predictable seasonal pattern should be stripped out from observed

values of the CPI. This seasonality component should be included into our modeling of break-evens

by defining seasonally adjusted break-evens. Assuming that the seasonal pattern is deterministic, we can combine (2), (11) and (13), defining a seasonally adjusted break-even $\overline{BEI}(t,T)$ as

$$(1+BEI(t,T))^{T-t} = \frac{S(t,T)}{S(t,t)}\left(1+\overline{BEI}(t,T)\right)^{T-t} \qquad (20)$$

where $S(t,T)$ is the seasonality estimated at $t$ projected for $T$. The projection is done by repeating the last whole year's seasonal pattern $S(t)$, as extracted in Section 2.3. Between estimated seasonal monthly values $S(t)$, a linear interpolation is performed.
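Inverting (20) for the adjusted break-even is a one-liner; the sketch below assumes the projected seasonal factors S(t,t) and S(t,T) have already been obtained as just described.

```python
# Sketch of equation (20): strip the projected seasonal factor out of a quoted
# (unadjusted) zero-coupon break-even. s_at_t and s_at_T are S(t,t) and S(t,T).
def adjust_bei(bei_unadjusted, years, s_at_t, s_at_T):
    gross = (1.0 + bei_unadjusted) ** years * s_at_t / s_at_T
    return gross ** (1.0 / years) - 1.0      # seasonally adjusted BEI
```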

Since inflation swap quotes constitute (a special type of) break-evens, we explore the impact of

seasonality on the strike rate of an inflation swap. Figure 7 presents results on the EU HICPx and US

CPI inflation. We will refer to these BEI as adjusted and unadjusted standard break-evens. In the

figure, we observe that US inflation was expected to increase, while the European curve was flat. US

inflation swaps traded about 25bp to 50bp above European inflation swaps. Further, the impact of

seasonality vanishes quickly with increasing swap maturity. Since the seasonal pattern changes

slowly,13 any implication on mid- and long-term unadjusted BEI would be insignificant. Short-term

unadjusted BEI are well predicted by the seasonal pattern observed on the previous year. Figure 7

underlines the benefits of using adjusted BEI as risk factors. Because of the swings created by the

seasonality, we would overestimate short-term inflation risk by considering unadjusted BEI. This will

be shown later. Unless specified, we consider seasonally adjusted BEI in the remainder of this paper.

13Recall Figure 1.


Figure 7

Impact of seasonality on inflation swap quotes

Inflation swap quotes are interpolated using smoothed splines. Data as of 03 December 2007

[Two panels: US and EU break-even inflation swap rates (%) versus maturity (years), each showing the adjusted and unadjusted curves together with market quotes.]

5.2 Homogeneity and portability

The validation of a pricing model for risk management purposes imposes some constraints on the

input data, or equivalently on the risk factors entering into equations. When introducing imperfect

indexation, definition (13) has to be adapted for our break-even to satisfy those constraints. First, our

ability to estimate and model the distribution of a risk factor depends on our capacity to observe a homogeneous sample of this factor through time. Homogeneity comes in two flavors: the observable

should refer to the same theoretical quantity—in particular, to the same horizon—and it should be

forward looking. Second, risk analyses at the portfolio level call for a consistent modeling of the same

risk across markets and assets. As far as inflation is concerned, we would like inflation swaps and

inflation-linked bonds to rely on the same risk factor. Trading on the two asset classes—for instance

through inflation-linked swaps—might even require interchangeability in break-evens derived from

inflation swaps and from linkers.14 This involves disentangling specific market conventions from the

risk factors themselves.

14In general, an inflation-linked swap consists in swapping a linker against a floating nominal leg.


5.2.1 The standard BEI

We characterize a first type of break-even by coming back to ILDB. Relaxing the perfect indexation

assumption implies that the payoff of an inflation linked discount bond refers to lagged inflation. Its

price is given by

$$P(t,T;t_0,L) = E^{P^*_N}_t\left[\frac{CPI(T-L)}{CPI(t_0-L)}\exp\left(-\int_t^T r_N(s)\,ds\right)\right] \qquad (21)$$

$$= \frac{CPI(t-L)}{CPI(t_0-L)}\,E^{P^*_N}_t\left[\frac{CPI(T-L)}{CPI(t-L)}\exp\left(-\int_t^T r_N(s)\,ds\right)\right] \qquad (22)$$

$$= \frac{CPI(t-L)}{CPI(t_0-L)}\,E^{P^T_N}_t\left[\frac{CPI(T-L)}{CPI(t-L)}\right]B_{N,N}(t,T). \qquad (23)$$

Equation (23) shows that we can embed the indexation lag within the definition of the break-even inflation by setting

$$(1+BEI(t,t-L,T-L))^{T-t} = \frac{F_{CPI}(t,T-L)}{CPI(t-L)} = \frac{S(t,T-L)}{S(t,t-L)}\left(1+\overline{BEI}(t,t-L,T-L)\right)^{T-t}. \qquad (24)$$

This definition conveniently allows us to stick to (19) without additional effort beyond modifying the

upfront index ratio. This is a standard market practice. We thus refer to (24) as the standard BEI.

From a risk perspective, seasonally adjusted standard break-evens do satisfy some of the

aforementioned criteria on risk factors. We can indeed derive a standard zero-coupon BEI curve on a

daily basis from inflation swap quotes as well as from treasuries and inflation-indexed bonds.

However, this simple approach comes at the cost of deriving a quantity which is not purely at risk. Figure 8 shows how the protected period—and so the BEI—decomposes. We distinguish three parts. From the base indexation date $t-L$ up to the last observed reference date $t_l$, the inflation rate is actually known. Between $t_l$ and the analysis date $t$, the inflation rate is strongly predictable, if not fully known. The third, forward-looking part is the true period at risk.

Portability of standard BEI is also a concern. Conventions for computing the base index value $CPI(t-L)$ vary from one market to another, and leaving this component within the break-even significantly restricts the way it can be used. For instance, French linkers indexed to the HICPx (the OAT€i) obtain their base index value by interpolation between two prior observed CPI values, while HICPx inflation swaps choose the CPI value for a single reference date. Because of these different conventions, the discrepancy between standard BEI derived from inflation swaps and from linkers

varies in a predictable way through a month, which is unsatisfactory. The HICPx interpolation rules

additionally create spurious jumps when the protected period changes.


Figure 8

Standard and fully adjusted break-evens

Zero coupon break-even inflation defined off a position held from t to T. L stands for the indexation

lag, and $t_l$ for the reference date at which the last CPI value was published.

5.2.2 Fully adjusted BEI

Acknowledging these defects in the standard BEI, another common practice is to define “forward”

break-evens $BEI_f(t,T,L)$ as the risk factors by stripping the first two parts of the protected period:

$$(1+BEI_f(t,T,L))^{T-t} = \frac{(1+BEI(t,t-L,T))^{T-t+L}}{(1+BEI(t,t-L,t))^{L}} = \frac{S(t,T)}{S(t,t)}\left(1+\overline{BEI}_f(t,T,L)\right)^{T-t}. \qquad (25)$$

We underline that the “forward” label is a misuse of language since $BEI_f(t,T,L)$ is nothing more than

a spot break-even measure. The forward standard BEI is thus the quantity that we have to use when

applying the Fisher equation (15). More importantly, it is obvious that none of the standard BEI

drawbacks are corrected.15

We can nevertheless define another type of break-even. Coming back to Figure 8, we could partially

adjust for the indexation lag and consider a break-even over the non-deterministic protected period from $t_l$ to $T-L$. This would break the homogeneity condition since on a daily basis for a quoted constant time-to-maturity instrument (such as an inflation swap), the interval $[t_l, T-L]$ increases slowly before contracting again when a new CPI value is published for $t_{l+1}$. Homogeneity of the risk factor can be satisfied by defining a break-even on the third period only, as shown on Figure 8,

$$(1+BEI(t,T,L))^{T-t-L} = \frac{F_{CPI}(t,T-L)}{CPI(t)} = \frac{S(t,T-L)}{S(t,t)}\left(1+\overline{BEI}(t,T,L)\right)^{T-t-L}. \qquad (26)$$

15The defects could be corrected if we could observe the price of an overnight inflation swap or a one-day linker.


This break-even cannot be observed since the current CPI is unknown at $t$. Because of the high predictability of $CPI(t)$, we can define the fully adjusted break-even through a forecast $CPI^*(t)$ as

$$(1+BEI(t,T,L))^{T-t-L} = \frac{F_{CPI}(t,T-L)}{CPI^*(t)}. \qquad (27)$$

Various strategies can be applied so as to obtain $CPI^*(t)$. Notice that the forecasting model can be designed under the physical or the risk neutral probability measure since the risk premium should be nearly null for such a small horizon.16 Regression-based strategies on log CPI could be used.17 A

conservative approach can also be designed, for instance, by assuming that the past year’s inflation is

the best forecast of the increase in the CPI from the last published value. We apply this strategy in the

following figures and data.
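The fully adjusted break-even (27) combined with the conservative forecast just described can be sketched in a few lines; the inputs (last published CPI, past year's inflation, time since the reference date) are assumed to be available.

```python
# Sketch of the fully adjusted break-even (27) with the conservative CPI*(t) forecast:
# grow the last published CPI at the past year's realized inflation up to the analysis date.
def cpi_forecast(last_cpi, past_year_inflation, years_since_reference):
    return last_cpi * (1.0 + past_year_inflation) ** years_since_reference

def fully_adjusted_bei(forward_cpi_TL, cpi_star, years_at_risk):
    """BEI(t,T,L) over the forward-looking period of length T - t - L."""
    return (forward_cpi_TL / cpi_star) ** (1.0 / years_at_risk) - 1.0
```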

5.2.3 The adjustments in practice

We highlight the differences in the standard and the fully adjusted break-evens by looking at the

European (HICPx) and the US inflation swap markets. Figure 9 presents the term structure of adjusted

break-evens extracted from market quotes. As expected, differences are significant on the short-term.

On the one hand, the standard BEI smoothes out expectations of future inflation with realized inflation over the indexation lag period. When realized inflation has been higher than expected, it tends to lower the true expectations of future inflation, and vice versa. For instance, given that the indexation lag on US inflation swaps is three months, the US standard curve is influenced by the realized inflation over August. The August inflation was high, 3.36%, and is likely to bias the standard BEI curve, which displays a short-term value of 2.38%. On the other hand, fully adjusted BEI can suffer from bad predictions of the current index value $CPI^*(t)$. Our conservative forecasting strategy slightly underestimated the August to November inflation with an estimated rate of 2.5%, while ex post we observed a 3.03% inflation rate. We nevertheless observe here that the implied short-term break-even—about 2.92%—is more in line with the last inflation realizations. Put simply, fully adjusted break-evens react more quickly to realized inflation.

Figure 10 shows implications of the various adjustments on volatility estimates. Most inflation swap

markets offer a decreasing volatility with the break-even horizon, though the US market is an

16 With the development of inflation markets, liquid futures on inflation could for instance be used.
17 Prices of futures on energy and commodities, money market rates, etc. are potential forecasting variables.
18 Let us point out that even though we do not provide details here, several subtle issues were taken into account. First, the market conventions for the indexation method differ between the European and US markets. Both markets use a three-month lag but the US market uses a linear interpolation while the European swaps are indexed on index reference dates directly. Second, data and break-evens have to be interpolated. This has been done through smoothed splines applied

on seasonally adjusted break-evens and forward break-evens.


Figure 9

Adjusted break-even term structures

Forward standard BEI and fully adjusted BEI term structures from the HICPx and the US CPI inflation

swaps. All curves are seasonally adjusted. Data as of 05 November 2007

[Two panels: EU and US BEI swap rates (%) versus maturity (years), comparing the Standard SA and Fully Adjusted SA curves.]

exception. Seasonality effects are presented on the standard adjusted curves only. This exhibits the benefits of adjusting break-evens for seasonality by removing meaningless waves. From previous comments about differences between the two adjusted BEI, we could expect that the standard BEI methodology tends to underestimate short-term volatility. This is confirmed in Figure 10.

We further check the adequacy of the break-evens with classic assumptions of risk models. It is for

instance standard to assume that risk factors follow a normal—log-normal for a positive

variable—distribution or a t-distribution. Looking at the evolution of standard break-evens especially casts doubt on such an assumption (see Figure 11). The data displays different regimes with several jumps. We performed a Jarque-Bera test, a test of adequacy to the normal distribution. For standard BEI, the null hypothesis of adequacy to the normal distribution is rejected at a 5%

confidence level for all maturities up to ten years. On the contrary, for the fully adjusted break-even

the same test cannot reject the null hypothesis for all maturities above two years. Given that there is

only one market point below—the one year point—this can be interpreted as a good signal and

advocates for the use of fully adjusted BEI.
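A minimal sketch of this normality check follows, using SciPy's Jarque-Bera test. We assume the test is applied to daily changes of each break-even series; the input array is a placeholder.

```python
# Jarque-Bera normality check on daily changes of a break-even series (assumed input).
import numpy as np
from scipy import stats

def normality_pvalue(bei_series):
    changes = np.diff(np.asarray(bei_series))
    statistic, pvalue = stats.jarque_bera(changes)
    return pvalue            # reject normality at the 5% level if pvalue < 0.05
```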


Figure 10

Volatility (annualized) across the term structure of BEI from HICPx inflation swaps

Volatility computed using a decay factor of 0.94. “U” indicates a non-seasonally adjusted curve; “SA”

stands for seasonally adjusted. Data as of 05 November 2007

[EU BEI swap volatility (%) versus maturity (1 to 10 years) for the Standard SA, Standard U and Fully Adjusted SA curves.]

Figure 11

Short- and mid-term BEI from HICPx inflation swaps

[Two panels: one-year and five-year EU BEI swap rates (%) from February 2006 to October 2007, comparing the Standard SA and Fully Adjusted SA series.]


Figure 12

Inflation swaps versus ILB standard BEI term structures

Forward standard BEI term structures from the HICPx and the US CPI inflation swaps. All data

seasonally adjusted. European curve derived using French OAT€i only. Data as of 05 November 2007

[Two panels: EU and US standard BEI (%) versus maturity (years), comparing the term structures derived from ILB and from inflation swaps, both seasonally adjusted.]

5.3 ILB, inflation swaps and nominal rates

Up to now we presented outputs from inflation swaps only. Deriving break-even data from linkers is

more challenging. Interpolation issues have to be handled with care since linkers are mostly

coupon-paying bonds. Interactions with the nominal treasury curve and the way it is constructed

magnify the problems that a classic bootstrap procedure generates.19 Figure 12 compares the US and

European term structures implied by inflation swaps and inflation-linked bonds. The inflation-linked

bond markets typically offer long maturity instruments—up to 40 years for the European market—as issuers target potential buyers of long-term liability hedges. Inflation swap markets are more active on the short term and mid term so as to open inflation trading to other market participants. In addition,

19The BEI curves derived from linkers and treasury bonds presented here have been obtained through an optimization

algorithm. While methodological details are not the purpose of this paper, we underline that the whole methodology is

available in the RiskMetrics Group Technical Notes.


the structures that financial intermediaries need to create for issuing inflation swaps involve

short-term nominal interest rates. As discussed earlier, we can observe that inflation swaps trade

above inflation implied by linkers. On the liquid European market, this premium tightened and is now

contained within a 30bp range. The premium seems to be higher on the short term, but this has to be balanced with the fact that the linkers do not contain information there: the smallest available maturity on the French OAT€i market is about five years. The premium in the US is higher at about 50bp, except on the short term, which presents an anomaly with a negative premium. Preliminary analysis suggests that the premium is strongly volatile, and that a basis risk could be monitored.

6 Conclusion

For many years, fixed income traders used to think in terms of break-evens by applying the famous Fisher relationship $BEI = \frac{1+Y_N}{1+Y_R} - 1 \approx Y_N - Y_R$ to the yields-to-maturity of adjacent government

nominal bonds and linkers. Doing so, they relied on the real rate pricing framework which allows us

to treat linkers in the same fashion as classic bonds. Setting theoretical foundations on the concept of

break-even inflation, we can move away from the real rate framework and define pricing models

depending on nominal and break-even rates only. Our investigations suggest that fully adjusted

break-evens are particularly adapted in this context. We can then decompose any portfolio containing

inflation-linked assets into nominal interest rate risk and inflation risk.

Of course, the Fisher relationship still applies over the whole term structure of interest rates. It tells us

that over the last years, nominal rates have mainly been driven by fluctuations in real rates while

break-evens remained stable: see Figure 13. However, while economists fear rising inflation, we are

now in a much better situation to identify changes in break-evens and hedge our inflation exposures.

References

Buraschi, A. and A. Jiltsov (2005). Inflation risk premia and the expectations hypothesis. Journal of Financial Economics 75, 429–490.

Hordahl, P. and O. Tristani (2007). Inflation risk premia in the term structure of interest rates. BIS Working Papers (228).

Shishkin, J., A. H. Young, and J. C. Musgrave (1967). The X-11 variant of the census method II seasonal adjustment program. Technical Paper No. 15, U.S. Department of Commerce, Bureau of Economic Analysis.


Figure 13

Decomposition of Euro nominal swaps

Decomposition of the ten-year nominal swap rate through the Fisher equation into a ten-year real rate

and zero-coupon break-even inflation.

[EU 10-year rates (%) from December 2006 to October 2007: nominal swap rate, standard seasonally adjusted BEI and seasonally adjusted real rate.]


Extensions of the Merger Arbitrage Risk Model

Stephane Daul
RiskMetrics Group

[email protected]

A traditional VaR approach is not suitable to assess the risk of merger arbitrage hedge funds. We recently proposed a simple two- or three-state model that captures the risk characteristics of the deals in which merger arbitrage funds invest. Here, we refine the model, and demonstrate that it captures merger and acquisition risk characteristics using over 4000 historical deals. We then measure the risk of a realistic sample portfolio. The risk measures that we obtain are consistent with those of actual hedge funds. Finally, we present a statistical model for the probability of success and show that we beat the market in an out-of-sample study, suggesting that there is a potential “alpha” for merger arbitrage hedge funds.

1 Introduction

The merger arbitrage strategy consists of capturing the spread between the market and bid prices that

occurs when a merger or acquisition is announced. There are two main types of mergers: cash

mergers and stock mergers. In a cash merger, the acquirer offers to exchange cash for the target

company’s equity. In a stock merger, the acquirer offers its common stock to the target in lieu of cash.

Let us consider a cash merger in more detail. Company A decides to acquire Company B, for example

for a vertical synergy (B is a supplier of A). Company A announces that they offer a given price for

each share of B. The price of stock B will immediately jump to (almost) that level. However, the

transaction typically will not be effective for a number of months, as it is subject to regulatory clearance, shareholder approval, and other matters. During the interim, the stock price of B actually trades at a discount with respect to the offer price, since there is a risk that the deal fails. Usually, the

discount decreases as the effective date approaches and vanishes at the effective date.

In a stock merger, company A offers to exchange a fixed number of its shares for each share of B. The

stock price of B trades at a discount with respect to the share price of A (rescaled by the exchange

ratio) as long as the deal is not closed.

With a cash merger, the arbitrageur simply buys the target company’s stock. As mentioned above, the

target’s stock sells at a discount to the payment promised, and profits can be made by buying the


Figure 1

Cash deals. Share price of target (thick line) and bid offer (dotted line)

[Two panels: share price versus date for LabOne Inc. (June–October 2005) and infoUSA Inc. (June–September 2005), each with the bid offer shown as a dotted line.]

target’s stock and holding it until merger consummation. At that time, the arbitrageur sells the target’s

common stock to the acquiring firm for the offer price.

For example, on 8 August 2005, Quest Diagnostic announced that it was offering $43.90 in cash for

each publicly held share of LabOne Inc. The left panel of Figure 1 shows the LabOne share price. It

can be seen that the shares closed at $42.82 on 23 August 2005. This represents a 2.5% discount with

respect to the bid price. The deal closed successfully on 1 November 2005 (just over two months after

the announcement), generating an annualized return of 10.9% for the arbitrageur.

In a stock merger, the arbitrageur sells short the acquiring firm’s stock in addition to buying the

target’s stock. The primary source of profit is the difference between the price obtained from the short

sale of the acquirer’s stock and the price paid for the target’s stock.

For example, on 20 December 2005, Seagate Technology announced that it would acquire Maxtor

Corp. The terms of the acquisition included a fixed share exchange ratio of 0.37 share of Seagate

Technology for every Maxtor share. Figure 2 shows the movement of both the acquirer share price

and the target share price. On December 21, Maxtor shares closed at $6.90 and Seagate at $20.21

yielding a $0.58 merger spread. The deal was completed successfully on 22 May 2006.

More complicated deal structures involving preferred stock, warrants, or collars are common. From

the arbitrageur’s perspective, the important feature of all these deal structures is that returns depend on


Figure 2

Equity deal. Share prices of Maxtor (thick line) and Seagate Technology (dotted line)

Seagate Technology share prices are rescaled by the exchange ratio.

[Share price versus date, October 2005 to May 2006.]

mergers being successfully completed. Thus the primary risk borne by the arbitrageur is that of deal

failure. For example, on 13 June 2005, Vin Gupta & Co LLC announced that it was offering $11.75 in

cash for each share of infoUSA Inc. In the right panel of Figure 1, we see that after the

announcement, the share price of infoUSA jumped to that level. The offer was withdrawn, however,

on 24 August 2005, and the share price fell to a similar pre-announcement level.

A recent survey of 21 merger arbitrageurs (Moore, Lai, and Oppenheimer 2006) found that they invest

mainly in announced transactions with a minimum size of $100 million and use leverage to some

extent. They gain relevant information using outside consultants and get involved in deals within a

couple of days after the transaction is announced. They unwind their positions slowly in cases where

the deal is canceled, minimizing liquidity issues. Their portfolios consist, on average, of 36 positions.

Finally, from Figure 3, we clearly see that the volatility of the share price before and after the

announcement is very different. Measuring the risk with a traditional VaR approach in terms of

historical volatility is surely wrong. Thus arbitrageurs typically control their risk by setting position

limits and by diversifying industry and country exposures.

We recently have developed a risk model suitable for a VaR approach that captures the characteristics

of these merger arbitrage deals (Daul 2007). In this article, we will refine this model to better describe

equity deals and also study in more detail the probability of deal success. The model will then be

tested on 4000+ worldwide deals and also compared to real hedge funds.


Figure 3

Stock price of LabOne Inc. The large dot refers to the announcement date.

[Share price versus date, May to October 2005.]

2 Risk Model

We consider only pure cash or equity deals, and introduce the following notation (see Figure 4):

$S_t$ is the stock price of the target firm at time $t$.

$t_0$ is the announcement date.

$\Lambda$ is the deal length.

$K_t$ is the bid offer per share at time $t$.

For cash deals, the bid offer is typically fixed and known at the announcement date, $K_t = K_{t_0}$. For equity deals, the bid offer is the acquirer stock price $A_t$ times the deal conversion ratio $\rho$, $K_t = \rho A_t$. This

difference will not affect our model as the main hypothesis applies when the deal is withdrawn.

Notice further that for cash deals, the bid offer can also change over time, for example if the offer is

sweetened or if a second bidder enters the game (Daul 2007).

The announcement date $t_0$ is evidently fixed. The deal length $\Lambda$ can fluctuate and is modeled as a random variable following a distribution

$$F(t) = P(\Lambda \leq t). \qquad (1)$$


Figure 4

Definition of parameters

[Share price of the target versus date, annotated with the announcement date t0, the pre-announcement price St0, the bid offer K and the deal length Λ.]

We will consider a model conditioned on $\Lambda$, where at time $t_0+\Lambda$ (the effective date), we know if the deal is completed (success) or withdrawn (failure).

To model this event, we introduce the binomial indicator $C$. With probability $\pi$, we have $C = 1$, indicating deal success, and with probability $1-\pi$, we have $C = 0$, indicating deal failure. In case of success, the stock price of the target reaches its bid offer, while when the deal breaks we have to make further assumptions. This will consist of our main hypothesis: we model the level to which the stock price jumps as a virtual stock price $\tilde{S}_t$. Hence the stock price at the effective date is

$$S_{t_0+\Lambda} = \begin{cases} K_{t_0+\Lambda} & \text{if } C = 1, \\ \tilde{S}_{t_0+\Lambda} & \text{if } C = 0. \end{cases} \qquad (2)$$

Since the withdrawal might be considered as negative information, the virtual stock price is subject to a random shock $J$ at time $t_0+\Lambda$. An illustration of this virtual stock price is shown in Figure 5. The

black line is the real stock price for a withdrawn deal, and the dotted blue line is a virtual path that the

stock price could have taken if no deal had been put in place.

The virtual stock price follows a simple jump-diffusion process

$$d\tilde{S}_t = \mu\,\tilde{S}_t\,dt + \sigma\,\tilde{S}_t\,dW_t - J\,\tilde{S}_t\,dN_t, \qquad (3)$$

where $\mu$ is the drift (set to zero afterwards), $\sigma$ is the volatility of the price before announcement, $J$ is a positive random shock following an exponential distribution with parameter $\lambda_{cash}$ for cash deals and


Figure 5

Virtual stock price

[Share price versus date, September 2005 to November 2006, with the downward jump −J S̃t at the withdrawal date marked.]

$\lambda_{equity}$ for equity deals, and $N_t$ is a point process taking values

$$N(t) = \begin{cases} 1 & \text{if } t \geq t_0+\Lambda, \\ 0 & \text{if } t < t_0+\Lambda. \end{cases} \qquad (4)$$

Finally, the initial condition is

$$\tilde{S}_{t_0} = S_{t_0}. \qquad (5)$$

We can easily integrate this process and get, for $t < t_0+\Lambda$,

$$\tilde{S}_t = S_{t_0} e^{\Delta Z} \qquad (6)$$

where $\Delta Z$ follows a normal distribution with mean $(\mu - \frac{1}{2}\sigma^2)(t-t_0)$ and standard deviation $\sigma\sqrt{t-t_0}$. For $t = t_0+\Lambda$ we get

$$\tilde{S}_{t_0+\Lambda} = S_{t_0} e^{\Delta Z}(1-J). \qquad (7)$$
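The closed forms (6)–(7) make the virtual price easy to simulate; the sketch below uses daily volatility and the shock parameters estimated in the next section, with all numerical inputs being illustrative assumptions.

```python
# Sketch of a draw of the virtual stock price at the effective date, equations (6)-(7).
# lam is the exponential shock parameter (0 for cash deals, 0.2 for equity deals).
import numpy as np

def simulate_virtual_price(s0, sigma, deal_length_days, lam, rng, mu=0.0):
    dt = deal_length_days                                   # daily sigma assumed
    dz = rng.normal((mu - 0.5 * sigma**2) * dt, sigma * np.sqrt(dt))
    jump = rng.exponential(lam) if lam > 0 else 0.0         # J, mean lam
    return s0 * np.exp(dz) * (1.0 - jump)

rng = np.random.default_rng(0)
print(simulate_virtual_price(s0=40.0, sigma=0.02, deal_length_days=135, lam=0.2, rng=rng))
```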


3 Parameters estimation and model validation

3.1 Virtual stock price

The parameters of the model are estimated using historical information on deals. The transaction

details (such as announcement date, effective date, type of deal, and so forth) are obtained from Thomson One Banker. We consider pure cash or equity deals between public companies from 1996 to 2006 worldwide where the offered value of the target company is over $100 million. We consider those

deals for which we can also obtain stock prices from DataMetrics.

The daily drift $\mu$ is set to zero, and the ex-ante deal daily volatility is estimated using one year of daily

returns, equally weighted.

The intensity parameters $\lambda_{cash}$ and $\lambda_{equity}$ of the shock are obtained by moment matching. Conditional on deal failure, the expected value of the stock price is

$$E[S_{t_0+\Lambda}\,|\,C=0] = S_{t_0}\,e^{(\mu-\sigma^2/2)\Lambda}(1-\lambda_\cdot). \qquad (8)$$

Assuming $\mu - \sigma^2/2 \approx 0$, we get

$$E\left[\left.\frac{S_{t_0+\Lambda}}{S_{t_0}}\,\right|\,C=0\right] = (1-\lambda_\cdot). \qquad (9)$$

Using the 131 withdrawn cash deals in our database, we get $\lambda_{cash} = -0.07 \pm 0.06$; using the 33 withdrawn equity deals, we get $\lambda_{equity} = 0.2 \pm 0.1$. Hence we set

$$\lambda_{cash} = 0 \quad \text{and} \quad \lambda_{equity} = 0.2. \qquad (10)$$

3.2 Deal length

We model the deal length $\Lambda$ with a Weibull distribution having parameters $a$ and $b$,

$$F(t) = 1 - e^{-\left(\frac{t}{a}\right)^b}. \qquad (11)$$

This distribution is assumed to be universal. Using 1075 realized deal lengths (measured in days), we obtain the following boundaries at the 95% level of confidence:

$$143 < a < 154 \qquad (12)$$
$$1.43 < b < 1.56 \qquad (13)$$

This corresponds to an average deal length of

$$L = 135 \text{ days}. \qquad (14)$$
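The average implied by (12)–(13) can be checked with SciPy's Weibull distribution; the midpoints of the confidence intervals are used here as illustrative parameter values.

```python
# Check of the average deal length implied by the Weibull fit (11)-(14), and a sample
# draw of the kind used in the Monte Carlo application of Section 4.
from scipy.stats import weibull_min

a, b = 148.5, 1.50                             # scale and shape, midpoints of (12)-(13)
print(round(weibull_min.mean(b, scale=a)))     # about 134 days, consistent with (14)
sample = weibull_min.rvs(b, scale=a, random_state=0)
```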


3.3 Test of the main hypothesis

As stated above, the main hypothesis is the “existence” of a virtual stock price that is reached only in

case of withdrawal. For cash deals, $\lambda_{cash} = 0$, meaning the stock prices after withdrawal should follow a lognormal distribution, with volatility $\sigma_i$ different for each deal $i$. Hence, the normalized residuals

$$u_i = \frac{\log\left(\frac{S^i_{t_0+\Lambda_i}}{S^i_{t_0}}\right)}{\sigma_i\sqrt{\Lambda_i}} \qquad (15)$$

should follow a standard normal distribution. The p-value of a Kolmogorov-Smirnov test using the 131 withdrawn deals is 93%, implying that we cannot reject the main hypothesis.

For equity deals,λequity = 0.2, and the residuals defined as above do not follow a normal distribution.

Instead we study the residuals

vi =Si

t0+Λ

Sit0

. (16)

This should be distributed as

e∆Zi(1−J), (17)

where∆Zi follows a normal distribution with parameterσi different for each deal. We set the

volatility equal to the average of theσi , and use Monte Carlo to obtain a sample distributed according

to (17). We then compare this sample to our 33 withdrawn dealsusing a two-sample

Kolmogorov-Smirnov test. The result is ap-value of 53%. Again we cannot reject at all the

hypothesis, confirming the validity of our model.
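A sketch of how the two validation tests might be run with scipy, assuming hypothetical input arrays (prices at withdrawal and at announcement, per-deal volatilities and deal lengths). Using a single average deal length in the equity-deal simulation is a simplification for illustration; the text only specifies that a single average volatility is used.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def ks_cash(prices_end, prices_start, sigmas, lengths):
    """One-sample KS test of the normalized residuals (15) against N(0, 1)."""
    u = np.log(np.asarray(prices_end) / np.asarray(prices_start)) / (
        np.asarray(sigmas) * np.sqrt(np.asarray(lengths)))
    return stats.kstest(u, "norm")

def ks_equity(ratios, sigma_bar, mean_length, lam_equity=0.2, n_mc=100_000):
    """Two-sample KS test of the observed ratios (16) against a Monte Carlo
    sample from (17), with average volatility sigma_bar and mu = 0."""
    dz = rng.normal(-0.5 * sigma_bar**2 * mean_length,
                    sigma_bar * np.sqrt(mean_length), size=n_mc)
    j = rng.exponential(lam_equity, size=n_mc)
    simulated = np.exp(dz) * (1.0 - j)
    return stats.ks_2samp(np.asarray(ratios), simulated)
```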

4 Risk measurement application

We want to measure the risk of a sample portfolio consisting of 30 pure cash deals pending at the end of 2006. All deals are described by

• the target company,

• the bid offer K,

• the date of announcement t_0,

• the probability of success π,

and are assumed independent. We set the probability of success π to the historical value of 86% (see Section 5).

Table 1
VaR using the merger arbitrage risk model and the traditional risk model

VaR level   Merger Arb Model   Traditional Equity Model
95%         1.37%              7.25%
99%         2.21%              10.24%

Table 2
Dispersion of historical VaRs for merger arbitrage hedge funds

VaR level   1st quartile   median   3rd quartile
95%         0.81%          1.29%    1.68%
99%         2.17%          2.92%    4.90%

We forecast the P&L distribution of the portfolio at a risk horizon of one month using Monte Carlo simulations. For each deal, one iteration is as follows (a minimal sketch of one iteration follows the list):

1. Draw an effective date using the Weibull distribution.

2. If the risk horizon is subsequent to the effective date, draw a completion indicator. If the risk horizon is before the effective date, the deal stays in place.

3. If the completion indicator indicates failure, draw a virtual stock price, and calculate the loss. If the completion indicator indicates success, calculate the profit.
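The following is a minimal sketch of one Monte Carlo iteration for a single pending cash deal, under stated simplifying assumptions: s_now is the current market price at the risk-analysis date, a deal that is still pending at the horizon is assigned zero P&L rather than being revalued, and the Weibull parameters and 21-trading-day month are illustrative. None of this bookkeeping is specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def deal_pnl_one_iteration(s_now, s0, bid, sigma, pi_success,
                           t_elapsed, a=148.5, b=1.5, horizon_days=21):
    """One Monte Carlo iteration for a single pending cash deal (steps 1-3)."""
    # 1. Draw the total deal length from Weibull(a, b), conditional on the deal
    #    still being pending after t_elapsed days since announcement.
    length = a * rng.weibull(b)
    while length <= t_elapsed:
        length = a * rng.weibull(b)

    # 2. The deal resolves only if the effective date falls inside the horizon.
    if length - t_elapsed > horizon_days:
        return 0.0                          # deal stays in place (simplification)

    # 3a. Success: the target is bought at the bid price K.
    if rng.random() < pi_success:
        return bid - s_now
    # 3b. Failure: the price falls back to the virtual price (lambda_cash = 0),
    #     simulated from the announcement price over the full deal length.
    dz = rng.normal(-0.5 * sigma**2 * length, sigma * np.sqrt(length))
    return s0 * np.exp(dz) - s_now
```

Summing one such draw per deal across the 30 positions and repeating many times yields a simulated P&L distribution from which VaR figures comparable to Table 1 could be read off.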

Table 1 reports the VaRs at two different confidence levels obtained from the model, as well as VaRs obtained from modeling the positions as simple equities following a log-normal distribution. We notice that our model produces lower risk measures, consistent with our expectation.

For more evidence, we compare these monthly VaRs with the historical monthly VaRs of 41 merger arbitrage hedge funds obtained from the HFR database. Table 2 shows that the dispersion of the hedge fund VaRs contains our model's results. We conclude that our new model consistently captures the risk of a merger arbitrage hedge fund, and that the traditional model likely overstates risk.


5 Probability of success

In the risk measurement application above, the probability of success was unconditional on the deal, and set to the historical estimate using all deals worldwide from 1996 to 2006,

\pi_{\text{historical}} = \frac{N_{\text{success}}}{N_{\text{total}}} = \frac{4176}{4879} = 86\%.   (18)

A deal-specific probability of success can be inferred from the observed spread in the market as in (Daul 2007),

\pi_{\text{implied}} = \pi\left(\Delta, S_{t_0}, K, r_{\text{free}}\right).   (19)

Alternatively, we may fit an empirical model. We will use a logistic regression and assume that the probability of success is a function of observable factors X_i,

\pi_{\text{empirical}} = \frac{1}{1 + e^{-\sum_i b_i X_i}}.   (20)

If the factor sensitivity b_i is positive, then larger X_i lead to a higher probability of success, other factors being constant.

We consider the following factors:

• Target attitude:

X_i = \begin{cases} 1 & \text{Friendly} \\ 0 & \text{Neutral} \\ -1 & \text{Hostile} \end{cases}

• Premium: the relative extra amount the bidder offers. Its magnitude should be an indicator of the acquirer's interest.

X_i = \frac{K - S_{t_0}}{S_{t_0}}

• Multiple: the ratio of enterprise value (EV), calculated by adding the target's debt to the deal value, to EBITDA, an accounting measure of cash flows.

X_i = \frac{EV}{EBITDA}

• Industrial sector: by acquiring a target in the same industrial sector, the acquirer increases its market share. This could influence deal success.

X_i = \begin{cases} 1 & \text{same sectors} \\ 0 & \text{different sectors} \end{cases}


Table 3
Logistic regression on 1322 deals

Factor                      b_i     p-value
Constant                   -1.09    0.04
Target attitude             1.79    0.00
Premium                     0.76    0.17
Multiple                    0.44    0.00
Industrial sectors          0.33    0.15
Relative size               0.44    0.00
Deal type                   0.34    0.16
Trailing number of deals   -0.29    0.19

• Relative size of acquirer to the target:

X_i = \log\left(\frac{\text{Acquirer assets}}{EV}\right)

• Deal type:

X_i = \begin{cases} 1 & \text{cash} \\ 0 & \text{equity} \end{cases}

• Trailing number of deals: the number of deals is cyclical; the position in that cycle should influence deal completion.

X_i = \frac{N_{\text{deals in last 12 months}}}{\text{yearly average of } N_{\text{deals}}}

We have 1322 realized deals (completed or withdrawn) with all factors available. Table 3 shows the results obtained from the logistic regression. We see that attitude, multiple and relative size are very relevant factors (very small p-values). The premium, having the target and the acquirer in the same industrial sector, and the deal type are relevant to some extent. The sensitivity for the trailing number of deals is counterintuitive: it appears that a large number of deals announced might catalyze less convincing deals.
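A minimal sketch of how such a logistic regression could be fitted with statsmodels. The file name, column names and the 0/1 outcome variable are hypothetical placeholders for a deal-level data set with the factors listed above; the call produces coefficient estimates and p-values of the kind reported in Table 3.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical deal-level data: one row per realized deal, with the factor
# columns described above and a 0/1 completion indicator.
deals = pd.read_csv("deals.csv")
factors = ["attitude", "premium", "multiple", "same_sector",
           "relative_size", "cash_deal", "trailing_deals"]

X = sm.add_constant(deals[factors])            # adds the intercept term
results = sm.Logit(deals["completed"], X).fit()
print(results.summary())                        # coefficients b_i and p-values

# Empirical probability of success (20) for each deal in the sample.
pi_empirical = results.predict(sm.add_constant(deals[factors]))
```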

To assess the predictive power of our model we perform an out-of-sample test, and compute the so-called cumulative accuracy profile (CAP) curve. The model parameters are fit using the 873 (66%) oldest deals. We then infer the probability of success for the remaining 449 (34%) deals. After sorting the deals by their probability of success obtained with the statistical model (from least probable to most probable), the CAP curve is calculated as the cumulative ratio of failures as a function of the cumulative ratio of all deals.
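The CAP curve as described can be computed in a few lines. Here prob_success and failed (a 0/1 indicator of withdrawal) are hypothetical arrays for the 449 out-of-sample deals.

```python
import numpy as np

def cap_curve(prob_success, failed):
    """Cumulative accuracy profile: sort deals from least to most probable
    success and accumulate the share of failures captured."""
    order = np.argsort(prob_success)                 # least probable success first
    failed = np.asarray(failed, dtype=float)[order]
    frac_deals = np.arange(1, len(failed) + 1) / len(failed)
    frac_failures = np.cumsum(failed) / failed.sum()
    return frac_deals, frac_failures
```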


Figure 6
CAP curve for the out-of-sample test (OOS) and the implied probability of success (cumulative ratio of failures versus cumulative ratio of all deals).

The 449 out-of-sample deals have an overall failure ratio of 10.2%. If the model were perfect, then the first 10.2% of deals as sorted by our model would have contained all of the failed deals, and we would have CAP(x) = 1 for x ≥ 10.2%. If the model were useless, the ordering would be random, and we would have CAP(x) = x. In Figure 6 we show the result for the out-of-sample test, the market-implied probability of success and the two limiting cases. We clearly see that our model beats the market, suggesting that there is a potential "alpha" for merger arbitrage hedge funds. Furthermore, looking closer at the lower left corner, we notice that the CAP curve for the statistical model follows the perfect limiting case up to about 5%. This means that our statistical model ranks the first half of the withdrawn deals perfectly as the worst ones.

6 Conclusion

The specifics of merger arbitrage deals can be captured by introducing a binomial completion indicator and a virtual stock price modeled as a simple jump-diffusion process. This model has been validated using a large set of deals. A merger arbitrage hedge fund would benefit from using this model to measure the risk of its portfolio in a VaR framework and/or to perform stress tests using the probability of deal success, for example.

Finally, we have developed a statistical model for the probability of success and showed in an out-of-sample analysis that its forecasting power is superior to that of the market.

References

Daul, S. (2007). Merger arbitrage risk model. RiskMetrics Journal 7(1), 129–141.

Moore, K. M., G. C. Lai, and H. R. Oppenheimer (2006). The behavior of risk arbitrageurs in mergers and acquisitions. The Journal of Alternative Investments, Summer.


Measuring the Quality of Hedge Fund Data

Daniel Straumann
RiskMetrics Group

[email protected]

This paper discusses and investigates the quality of hedge fund databases. The accuracy of hedge fund return data is taken for granted in most empirical studies. We show, however, that hedge fund return time series often exhibit peculiar and most likely "man-made" patterns, which are worth recognizing. We develop a statistical testing methodology which can detect these patterns. Based on these tests, we devise a data quality score for rating hedge funds and, more generally, hedge fund databases. In an empirical study we show how this data quality score can be used when exploring a hedge fund database. Thereby we can confirm, by different means, many of the insights of (Liang 2003) concerning the quality of hedge fund return data. In a last step we try to estimate the impact of imperfect data on performance measurement by defining a "data quality bias". The main goals of this paper are to increase the awareness of the practical limitations of hedge fund data and to suggest a tool for the quantification of financial data quality.

1 Introduction

The past years have seen a rapid growth of the hedge fund industry and an enormous increase of the

assets that this investor segment controls. Originally only accessible to institutional investors or very wealthy individuals, hedge funds are nowadays much better established among the broad public. In many countries, even retail investors can place money into hedge funds.

There are several reasons why hedge funds have been so successful in attracting new money. Hedge fund risk-return profiles are perceived as superior to those of classical long-only mutual funds. Hedge funds are flexible and basically unregulated. In contrast to mutual funds, any kind of financial investment is permitted. Hedge funds may for instance go short, invest their capital into futures, derivative securities, commodities and other asset classes that are not accessible to mutual funds. Furthermore, they can borrow money in order to create leverage on their portfolio. Many hedge funds seek to achieve

absolute returns. Therefore, they have no traditional benchmark such as a stock or bond index or a

blend of indices. Mutual funds, however, are required to more or less track a benchmark and are

therefore much more exposed to bearish market conditions. And indeed, a majority of hedge funds

did convincingly well in the aftermath of the burst of the technology bubble and the September 11

terrorist attacks.


Parallel to the success and the maturing of the hedge fund industry, academics were beginning to take

an interest in how hedge fund managers achieve their profits and whether successful track records are

due to skill or just luck. The questions of sources of hedge fund returns and performance persistence

have been addressed in many empirical studies, and the literature on these topics is still growing. It is

needless to say that this research heavily relies on hedge fund returns data and statistical methods to

analyze them. Concerning the data, several providers offer hedge fund databases. These databases

differ substantially in coverage of funds and in the information beyond the return time series. The

hedge fund database business is rather fragmented. It does not seem that there is a “golden standard”

for hedge fund returns. As a matter of fact, no database exists that would provide full coverage. The

diversity of hedge fund databases used in articles can explain dissimilar quantitative results.

While increasingly complicated stochastic models are being used for the description of hedge fund

returns, there are certain limitations on the data side. These limitations are only marginally discussed,

if not neglected. The purpose of this paper is not to provide refinements of models or to present yet another large empirical study. Instead, we are concerned about what forms the backbone of hedge fund research: hedge fund data. Our focus lies on data quality aspects, a topic that is somewhat disregarded in the literature. From our experience, hedge fund return data is not always beyond all doubt. To assess the plausibility of hedge fund return data, we propose an objective and

mathematically sound method. We then use this method to analyze a hedge fund database. We also

quantify the impact which imperfect data may have.

1.1 Issues with hedge fund data

We have recently examined hedge fund databases of several providers, and have come to doubt the

quality of the return data. In this paper, we work exclusively with the Barclay1 database. In all databases, similar issues were identified.

To illustrate the aforementioned issues, we consider the following example of an active onshore long-short hedge fund. Its return and assets under management time series are displayed in Table 1.

The following peculiarities are striking:

• The returns of the year 2000 are repeated in 2001. This is obviously a serious data error.

Interestingly, the assets under management do not show any recurring patterns.

1Barclay Hedge is a company specialized in the field of hedge fund and managed futures performance measurement

and portfolio management; see www.barclayhedge.com. We benefited from the excellent support by Sol Waksman and his

client service team of Barclay Hedge. This is gratefully acknowledged.


Table 1

Monthly returns of an active long-short hedge fund, January 1999 through December 2002
"AUM" stands for assets under management (in $M). The full time series covers January 1991 through September 2007. The symbol "+" signifies that the return value appears at least twice in the entire time series. The boxes frame the blocks of recurring returns.

Date AUM Return (%) Date AUM Return (%)

Jan–1999 10.7 0.20 + Jan–2001 22.5 4.40 +

Feb–1999 10.7 6.50 Feb–2001 22.5 0.20 +

Mar–1999 15.5 3.70 + Mar–2001 34.0 0.00 +

Apr–1999 15.5 -6.30 Apr–2001 34.0 5.40 +

May–1999 15.5 -0.90 + May–2001 34.0 6.40 +

Jun–1999 23.8 2.90 + Jun–2001 43.0 0.40 +

Jul–1999 23.8 0.10 + Jul–2001 43.0 1.10 +

Aug–1999 23.8 4.10 + Aug–2001 43.0 -2.60 +

Sep–1999 28.1 -0.80 + Sep–2001 48.0 -8.60 +

Oct–1999 28.1 -2.10 + Oct–2001 48.0 4.00 +

Nov–1999 28.1 -3.00 Nov–2001 48.0 3.50 +

Dec–1999 26.8 5.70 Dec–2001 48.0 3.10 +

Jan–2000 26.8 4.40 + Jan–2002 48.0 0.90 +

Feb–2000 26.8 0.20 + Feb–2002 50.0 -0.90 +

Mar–2000 36.1 0.00 + Mar–2002 50.0 3.40 +

Apr–2000 36.1 5.40 + Apr–2002 50.0 1.80 +

May–2000 36.1 6.40 + May–2002 51.5 1.60 +

Jun–2000 28.0 0.40 + Jun–2002 51.0 -0.90 +

Jul–2000 28.0 1.10 + Jul–2002 50.0 -5.70

Aug–2000 28.0 -2.60 + Aug–2002 51.0 -0.80 +

Sep–2000 25.0 -8.60 + Sep-2002 51.0 -2.30

Oct–2000 25.0 4.00 + Oct–2002 53.0 -0.60 +

Nov–2000 25.0 3.50 + Nov–2002 60.0 3.30 +

Dec–2000 22.5 3.10 + Dec–2002 62.0 1.10 +


• The returns appear to be rounded.

• Many return values appear more than once in the time series (depicted by the symbol “+”).

Note that this is partially caused by the rounding.

• Appearance of zero returns (for instance in March 2000). It is rather unlikely that a fund has returns exactly equal to zero.

The recurrence of one year of return data is clearly a serious error. We admit that in the Barclay database such extreme problems appear for a handful of funds only. Much more frequent is the recurrence of blocks of length two or three. For instance, the January to March returns of a certain year would recur in one of the subsequent years. We picked the long-short fund because it provides an exemplary illustration of all types of problems that we have encountered. Once again, it must be stressed that such irregularities are by no means restricted to Barclay, but were evident in all databases we examined.

An important question is why for this particular fund the data quality is so poor. One argument could

be that the fund is exposed to illiquid markets or instruments, which would make an accurate

valuation difficult. In the Barclay database, we find the following description of the fund’s investment

objectives:

Long/Short stocks and other securities and instruments traded in public markets.

Emphasis is to manage the portfolio with near zero beta. Focus on companies with market

capitalization between 200 mm and 2.5 bb. Uses quantitative screening and fundamental analysis to identify undervalued equities with strong cash flow to purchase. The short

strategy uses proprietary quantitative screens and fundamental analysis to identify short

opportunities with a non-price based catalyst, potential for negative earnings surprise, and

overvaluation. Overvaluation is a necessary but not sufficient condition to be short.

This description indicates that the fund's positions are probably not particularly illiquid, and that it should be feasible to supply an exact valuation once per month. It appears that either the fund does have valuation difficulties, or at a minimum that it does not report the exact valuations to Barclay.

Since the reported monthly performance numbers appear unreliable, one might postulate that the long-short fund in question does in general not properly value its assets every month and that therefore only crude return estimates are reported. Two arguments speak against this. First, the long-short fund is audited every December, information provided by the Barclay database and absolutely plausible in view of the fund size. Therefore, one would expect that in December some sort


of equalization is applied, meaning that the December return is determined in such a way that the actual one-year return is matched. This return figure would most likely be a number with two digits after the decimal. However, for the fund in question we do not see any numbers with more than one digit after the decimal. Second, the long-short fund is open for new investments and subscription is possible every month. This would imply that the fund is able to quantify the net asset value of its holdings on a monthly basis.

Finally, the information reported to data providers is not audited by a third party and cannot be

thoroughly reviewed by the data provider due to the mass of funds that report. One also has to be aware that hedge funds report voluntarily and therefore the willingness of fund managers to revise numbers in order to ensure data accuracy might be rather limited.

To conclude our reasoning, the most likely reason for the questionable return history is a certain negligence exercised by the fund when reporting to Barclay.

1.2 Goals and organization of the paper

The lesson from the previous example is that hedge fund return data can be problematic. While the conclusions from empirical hedge fund research might be unaffected in qualitative terms, it is clear that inaccurate data could have an impact when industry-wide numbers such as performance, Sharpe ratio or alpha are calculated. In the past, people have cared a lot about data biases such as the survivorship bias, which occurs when failed hedge funds are excluded from performance studies. The survivorship bias generally leads to an overstatement of performance. The accuracy of return data itself is, however, mostly taken for granted, and the impact of the data quality on the analysis is rarely questioned. This is in surprising contrast to the attention paid to the "classical" data biases. This paper tries to fill a gap and to increase the awareness of the practical limitations of hedge fund data.

We regard inaccurate data as another cause of performance misrepresentation. In this context we

would like to introduce a new term: the data quality bias. It is by no means our aim to excoriate

hedge fund data providers, which are reliant on hedge funds reporting accurately. The providers’

hands are tied when it comes to verification of the performance numbers. Maybe this paper

contributes to preparing the ground for improvement of hedge fund data quality in the future.

The only way to assess the accuracy of hedge fund returns in a systematic and objective manner is

via a quantification. We first devise tests that detect the kind of problems discussed above. The results

of these tests are then combined into a single number, which serves as a measure or score for how

plausible a hedge fund return time series is. A score of a group of funds is just the average score. We


will also call it data quality score. Applying the data quality score to the Barclay hedge fund data, we

provide a small study showing results which often have an intuitive explanation. It is important to be

aware that our score rates the quality of returns only. It disregards other aspects of data quality.2

A score for data quality is a powerful tool and can serve multiple purposes. It allows, for instance, comparison of different groups of funds with respect to data accuracy. We mention that, if properly adapted, the ideas and principles presented in this paper can be applied to any kind of financial data. Besides identifying problematic samples, a score of data quality helps one to monitor the improvement of data quality over time, to confirm the effectiveness of data improvement measures, and to differentiate between competing databases.

The paper is organized as follows. In a section on preliminaries, we provide a brief survey on the

hedge fund literature. In the subsequent section we introduce and discuss our data quality score. This

is then followed by an empirical study using the Barclay database. Finally we conclude.

2 Preliminaries

First we give a classification of the hedge fund literature. Such an overview helps to make the

connection between this paper and the literature. Secondly, we provide more details on hedge fund

databases and their biases. Lastly, we cite and summarize the literature on hedge fund data quality

that we are aware of.

2.1 Classification of hedge fund literature

As mentioned earlier, virtually any hedge fund research relies on return data. There are basically three

interrelated main streams of academic research, addressing the following matters:

Hedge fund performance persistence

Here the main question is whether the performance achieved by a fund relative to its peers3 is

consistent over time, or in other words whether the outperformers of a certain time period are likely to

remain outperformers for the next time period, and vice versa. Miscellaneous methods have been

2 For a readable survey on data quality, we refer to (Karr, Sanil, and Banks 2006).
3 That is, other hedge funds which pursue a similar investment strategy.


applied and the hedge fund databases of various providers were used in order to investigate whether

hedge fund performance persists. The answers are mixed, and it seems that the community has not yet

reached a consensus. We refer to (Eling 2007) for a comprehensive overview of the literature on

hedge fund performance persistence.

Sources of hedge fund returns

We have used the terms “performance” and “track record” without being explicit. Comparisons and

rankings of hedge funds (or any investments) based on raw returns would not be sensible because one

would neglect the risks that have been taken in order to achieve these returns. Therefore risk-adjusted

performance measures should be used when ranking funds. The most classical measure is the Sharpe ratio. Since the statistical properties of hedge fund returns are however often not in accordance with the Gaussian law (skewness and fat tails), many people resort to a generalized Jensen's alpha. Alpha is the regression intercept of a multi-factor linear model, and (together with the mean-zero idiosyncratic return) represents the skill of the manager, that is, what is unique about the manager's investment strategy. Building a factor model that contains all common driving factors (or sources of hedge fund returns) is not trivial. Capturing hedge funds' trading strategies through a linear model requires the use of non-linear factors. There have been many proposals for hedge-fund-specific style factors; see for instance (Hasanhodzic and Lo 2007).

Hedge fund return anomalies

This topic is probably closest to the main theme of this paper, and for this reason we elaborate a bit

more on it. While the question about economic sources of returns searches for the factors that

determine the performance of hedge funds, the anomalies research stream rather deals with what one

could call the fine structure of hedge fund return processes, or in other words, the peculiarities that

they exhibit.

Hedge fund returns are mostly not available at frequencies higher than monthly. There are several

reasons for this. Unlike mutual funds, hedge funds are privately organized investment vehicles and

often not subject to regulation; therefore there are no binding reporting standards. Moreover, there is

still secrecy around hedge funds. Managers are reluctant to disclose information on a daily basis, even

something as basic as realized returns. This is particularly the case for those that trade in illiquid

markets, because it is feared that disclosed information could be abused by competitors. From an

operational point of view, managers do not want to have the burden of daily subscriptions or


redemptions necessitating the calculation of daily returns because they want to have the freedom to

fully concentrate on investment operations.

A great deal of the anomalies literature is concerned with the serial correlation of hedge fund returns.

The occurrence of pronounced autocorrelations is remarkable because it seems to contradict the

efficient markets hypothesis. However, due to lock-out and redemption periods, it would hardly be

possible to take advantage of these autocorrelations. Getmansky, Lo and Makarov (2004) show that

serial correlations are most likely induced by return smoothing. These authors argue that the exposure

of the fund to illiquid assets or markets leads to return smoothing when a portfolio is valued. This also explains why funds that invest in very liquid assets (such as CTAs4 and long-only hedge funds) rarely

show significant autocorrelations. Getmansky, Lo and Makarov (2004) propose the use of

autocorrelations as a proxy for a hedge fund’s illiquidity exposure. They moreover stress that the

naive estimator overstates the Sharpe ratio because it ignores the autocorrelation structure of hedge

fund returns.

Bollen and Pool (2006) go a step further and insinuate that some managers might smooth returns

artificially by underreporting gains and diminishing losses, a practice they call “conditional

smoothing". The authors also devise a screen to detect funds that apply conditional smoothing. Such

screens could be used by investors or regulators as an early warning system. Conditional smoothing

does not necessarily imply purely fraudulent behavior of the managers. It can be partially explained

by the pressure that they face in order to accord with the widespread myth of hedge funds as

generators of absolute returns. However, history shows that many hedge fund fraud cases came along

with misrepresentations of returns and so it might be worthwhile to have a closer look at those funds

which appear to misrepresent returns.

In a subsequent paper (Bollen and Pool 2007), the same authors look at the distribution of hedge fund

returns and give evidence that it has a discontinuity at zero. Again, the explanation is the tendency to

avoid reporting negative returns. It is tempting for a hedge fund manager to report something like 0.02% for an actual return of, say, -0.09%. If such practices are followed by a non-negligible number of managers, a discontinuity at zero will occur. The authors test for other possible causes, but return

misrepresentation turns out to be most likely.

Similar in nature is a study by (Agarwal, Daniel, and Naik 2006), which shows that average hedge

fund returns are significantly higher during December than during the rest of the year. The analysis

indicates that the “December spike” is most likely related to the compensation schemes of hedge

4 CTA stands for commodity trading advisor and actually comes from legal terminology. The CTA strategy is also referred to as managed futures. A CTA manager follows trends and invests in commodities, currencies, interest rates, futures and other liquid assets.


funds. These tempt managers to inflate December returns in order to achieve a better year-end

performance, which in turn leads to higher compensation. An equalization is then made in the subsequent year. Another piece of research giving evidence that hedge fund managers are driven by

the incentive structure is (Brown, Goetzmann, and Park 2001). There, it is shown that hedge funds that

perform well in the first six months of the year tend to reduce the volatility in the second half of the

year. It seems very hard to avoid such behavior other than through modifying the incentive systems.

2.2 Hedge fund databases and biases

Hedge fund data is marketed by several providers. Some are small vendors focusing on hedge fund

data alone, whereas others operate within a large data provider company that covers a variety of other

financial segments. Currently there are about twenty providers, many of which offer additional

services such as hedge fund index calculation.

The various hedge fund databases differ in coverage and in the information supplied besides pure

returns or assets under management. The differences with respect to coverage are considerable. The

estimated coverage of the largest databases is no more than 60% of all hedge funds.5 A reason for this

is the fact that managers typically report to one or two, but hardly to all existing databases. Some

funds prefer not to report at all, particularly if they are not interested in attracting new investors. Some

providers are specialized in the collection of data of certain hedge fund segments and would even

exclude others.6 For all these reasons, there is no database yet which fully represents all hedge funds.

Indices constructed based on one database share again the problem of inadequate representation of the

hedge fund universe. This issue led EDHEC, a French risk and asset management research institution

supported by universities, to construct a family of indices of hedge fund indices. These indices are meant to combine the information content of the various data provider indices in an optimal fashion;

see (Edhec-Risk Asset Management Research 2006).

Apart from inadequate representation, which leads to biased estimates of performance, there are other data biases which play an important role. Numerous articles discuss these biases and provide estimates of their magnitude.

Survivorship bias. This bias, mostly upward, is created when funds that have been liquidated, or

have stopped reporting, are removed from the database. Survivorship bias has also been

discussed in the mutual fund performance literature, but it is particularly pronounced for hedge

5 See (Lhabitant 2004).
6 As an example, HFR excludes CTAs from their database.


funds because their attrition rate is significantly higher than that of mutual funds. Nowadays,

providers are aware of this issue and make sure that collected data of defunct funds does not get

erased. Most hedge fund providers offer so-called graveyard databases, that is, databases containing the "dead" funds.

Backfill bias. This bias occurs when hedge funds that join a database are allowed to report some of

their past return history. This again leads to overstatements of the historical performance of all

funds because most likely hedge funds will start reporting during a period of success. A simple

remedy to limit this bias is to record the date when the fund joined the database.

Self-reporting bias. Recall that hedge funds report voluntarily. There might be differences between

reporting and non-reporting funds, and this difference is difficult to quantify. An indirect way to gauge it is to look at

funds-of-hedge funds. The performance of funds-of-hedge funds can serve as a proxy for the

performance of the “market portfolio” of hedge funds.

Another big difficulty leading to potentially distorted numbers is the style classification of hedge

funds. First, the style classification used by a provider is hardly ever perfect. Second, the style is

self-proclaimed by the manager. Third, the investment style pursued by a fund may change over time,

but the databases we know do not treat style as a time series item. A lot more could be said about

hedge fund databases and their biases; see the excellent survey given in (Lhabitant 2004).

2.3 Hedge fund data quality

Note that the biases presented above are uniquely connected to the way the providers collect and

manage data, and to the willingness of managers to report returns and other information. Most of

these studies take the accuracy of hedge fund returns for granted. We are aware of two papers raising

questions regarding this assumption. In (Liang 2000), differences between the HFR and TASS

databases are explored. The returns and NAVs of funds that appear in both databases are compared.

The returns coincide in about 47% of the cases only and the NAVs in about 83% of the cases. The

second article (Liang 2003) finds that the average annual return discrepancy between identical hedge

funds in TASS and the US Offshore Fund Directory is as large as 0.97%. Liang also compares onshore versus offshore equivalents of TASS and different snapshots of TASS. Furthermore, he identifies factors which influence return discrepancies by means of regressions. He finds that audited funds have a much lower return discrepancy than non-audited funds. Moreover, funds listed on exchanges, funds of funds, funds open to the public, funds invested in a single industrial sector and funds with a low leverage generally have fewer return discrepancies than other funds. Similarly to us,


Liang questions the accuracy of hedge fund return data. He measures data quality in terms of return

discrepancies across databases. Liang does not, however, ascertain which of the two data sources is more credible and therefore of higher quality.

Our paper has a similar scope, but we highlight two differences in our approach:

• We rate the quality of a single database and the funds therein in absolute terms. We do not

depend on comparisons across databases. In contrast, Liang quantifies data quality in relative

terms by looking at return discrepancies.

• We can assess the data quality of all funds since we do not rely on matching funds between

different databases. In contrast, Liang’s approach can only determine the data quality for the

intersection of funds in two databases.

3 A data quality score

In this section, we devise a quality score for fund return time series. Inspired by the patterns found in

the long-short hedge fund of the introduction, we first define five statistical tests for time series of

returns. For a fund, the quality score is the number of rejected tests. For a group of funds, the quality

score is the average of the fund scores. For illustrative purposes we finally compare the quality of

stock returns and fund of hedge fund returns.

3.1 Testing for patterns

As announced, we design five tests to detect patterns in return data. A test is rejected if the return data exhibits the corresponding pattern. We suppose that the monthly returns are expressed as percentages with two decimal places, as in Table 1. We begin by describing the tests in a rather loose way (a code sketch of the corresponding test statistics follows the list):

1. For test T1, the number z1 of returns exactly equal to zero is evaluated. If z1 is "too large", T1 is rejected.

2. Test T2 is based on the inverse z2 of the proportion of unique values in the time series. If z2 is "too large" (or, equivalently, the proportion of unique values is too small), T2 is rejected.

3. Test T3 looks at runs of the time series. A run is a sequence of consecutive observations that are identical. To give an example, (2.31, 2.31, 2.31) would be a run of length three. If the length z3 of the longest run is "too large", T3 is rejected.

4. In test T4 the number z4 of different recurring blocks of length two is evaluated. A block is considered as recurring if it reappears in the sequence without an overlap. For example, consider the sequence

(1.25, 4.57, −2.08, 8.21, 8.21, 8.21, −0.55, 1.25, 4.57, −2.08, 6.42, 1.25, 4.57, −2.08).

The sequence contains two different recurring blocks of length two: (1.25, 4.57) and (4.57, −2.08). Note that (8.21, 8.21) is not a recurring block because of the no-overlap rule. The test T4 is rejected if z4 is "too large".

5. Test T5 is based on the sample distribution of the second digit after the decimal. If this distribution is "unlikely", T5 is rejected.
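A minimal sketch of how the raw statistics z1–z4 might be computed for one return series (returns assumed to be rounded percentages, as in Table 1). The block-counting convention for z4 follows the no-overlap rule illustrated above, but the author's exact implementation is not specified, so this is only an illustration.

```python
import numpy as np

def test_statistics(returns):
    """Raw pattern statistics z1-z4 for a series of rounded monthly returns (%)."""
    r = np.round(np.asarray(returns, dtype=float), 2)
    n = len(r)

    z1 = int(np.sum(r == 0.0))                 # T1: number of exact zeros
    z2 = n / len(np.unique(r))                 # T2: inverse proportion of unique values

    # T3: length of the longest run of identical consecutive values.
    run = longest = 1
    for a, b in zip(r[:-1], r[1:]):
        run = run + 1 if a == b else 1
        longest = max(longest, run)
    z3 = longest

    # T4: number of distinct length-two blocks that recur without overlap.
    blocks = [tuple(r[i:i + 2]) for i in range(n - 1)]
    z4 = 0
    for blk in set(blocks):
        first = blocks.index(blk)
        if any(blocks[j] == blk for j in range(first + 2, n - 1)):
            z4 += 1
    return z1, z2, z3, z4
```

On the example sequence in item 4 above, this routine returns z4 = 2, as described.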

It is evident that there are overlaps between the five tests. Tests T2 and T5 check for concentration of return values and rounding, whereas T3 and T4 are meant to uncover repetitions in the data.

So far we have been unspecific about the thresholds for rejecting the tests. The role of the thresholds

is to discriminate between patterns appearing just by chance and those that are caused by real

problems in the data. Fixed thresholds would not be useful, since the longer the time series, the more

likely certain features such as recurring blocks occur by chance. Moreover, the volatility plays an

important role; funds with a very low volatility will feature a high concentration in certain return

values because the range of the data is limited.

We thus set the thresholds on a per time series basis. To this end, we suppose that monthly fund returns are independent and identically distributed normal random variables rounded to two digits after the decimal:

r_t \overset{\text{i.i.d.}}{\sim} \overline{\mathcal{N}}(\mu, \sigma^2), \quad t = 1, \ldots, n.   (1)

Here the notation \overline{\mathcal{N}} highlights that the normal random variables are rounded, and n denotes the length of the return time series. Under the distributional assumption (1) we next compute for each test T_i the probability that the corresponding test statistic Z_i is equal to or larger than the actually observed z_i:

p_i = P_{\mu,\sigma^2;n}(Z_i \ge z_i).   (2)

If this probability is small, it implies that we have observed an unlikely event and so the pattern can be considered as significant. Note that p_i is the p-value of the test T_i under the null hypothesis (1).

Instead of working with thresholds, we can equivalently use levels of significance and reject tests if the p-values fall below these levels. We chose to take a common level of significance equal to 1%


because this makes all tests comparable. Summarizing, we have:

\text{reject } T_i \iff p_i < \alpha = 1\%.   (3)

Now there are a couple of practical issues to consider. For computing the p-values, we replace the unknown parameters µ and σ² in (2) by the sample mean \hat{\mu} and by the sample variance \hat{\sigma}^2 of the returns, respectively. The numerical values of p_i are then obtained by Monte Carlo simulation.
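One way the Monte Carlo p-values might be computed; the number of simulations and the reuse of the statistic functions from the previous sketch are illustrative choices rather than the author's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_pvalue(returns, statistic, n_sims=10_000):
    """Monte Carlo p-value (2) of one pattern statistic under the null (1):
    i.i.d. normal returns with the sample mean and variance, rounded to two
    decimals. `statistic` maps a return series to its z value."""
    r = np.asarray(returns, dtype=float)
    mu_hat, sigma_hat, n = r.mean(), r.std(ddof=1), len(r)
    z_obs = statistic(r)
    z_sim = np.array([
        statistic(np.round(rng.normal(mu_hat, sigma_hat, size=n), 2))
        for _ in range(n_sims)
    ])
    return float(np.mean(z_sim >= z_obs))

# Example: p-value of the zero-count statistic z1 for a hypothetical series.
# p1 = mc_pvalue(returns, lambda r: np.sum(np.round(r, 2) == 0.0))
```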

We have yet to define the test statistic Z_5. We first determine, via Monte Carlo simulation, the distribution of the second digit after the decimal under (1) with \mu = \hat{\mu} and \sigma^2 = \hat{\sigma}^2. The probability that this digit is equal to k is denoted by q_k. We have found that for the range of volatilities σ ≥ 0.5 the digit is close to being equidistributed on {0, 1, . . . , 9}. For a sample of n returns, the number of occurrences of k as the second digit after the decimal is denoted by n_k. We define Z_5 as the distance between the sample distribution of the second digit and its distribution under (1). This distance is measured through the classical χ² goodness-of-fit test statistic:

Z_5 = \sum_{k=0}^{9} \frac{(n_k - n q_k)^2}{n q_k}.   (4)

Note that we use Monte Carlo simulation for the calculation of the p-values of T_5; we do not resort to a χ² approximation of the distribution of Z_5.
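A sketch of the statistic (4), assuming q is the vector of null digit probabilities q_k obtained beforehand by simulating rounded normal returns. Taking the digit of the absolute value is an implementation detail for negative returns, not something specified in the text.

```python
import numpy as np

def z5_statistic(returns, q):
    """Chi-square distance (4) between the sample distribution of the second
    decimal digit and its null distribution q[0..9]."""
    r = np.round(np.asarray(returns, dtype=float), 2)
    digits = np.round(np.abs(r) * 100).astype(int) % 10   # second digit after the decimal
    n_k = np.bincount(digits, minlength=10)
    n = len(r)
    return float(np.sum((n_k - n * q) ** 2 / (n * q)))
```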

To conclude the definition of the tests, a couple of remarks are warranted. From visual inspection of

return time series we have developed a sense of imperfect data. We have concentrated on patterns in

sequences of return numbers. This resulted in the tests T1–T5. Of course these tests are not necessarily

exhaustive since there might be other patterns which we are not aware of. Although we believe that

testing for faulty outliers of extreme returns would be of high relevance, the only feasible way of

doing that would be via a comparison of the identical fund across various databases, which we did not

pursue.

It is evident that there is a speculative element in our method. We can merely conjecture that a certain

return time series is inaccurate. The only way to validate our approach would be to call up all the

funds with problematic data. It should be clear that this is beyond the scope of this paper.

An assumption which might lead to objections is the hypothesis (1) of i.i.d. normally distributed percentage returns (rounded to two digits after the decimal). This model is needed to estimate the p-values, and we are aware that it is crude. It could easily be replaced by a model

allowing for skewness and heavy tails. Note that our tests are of discrete nature and rather indifferent

about the return distribution. For this reason we conjecture that the approach is to some degree robust

with respect to the chosen model for the return distribution. At least on a qualitative level we expect

that the distributional assumptions have little impact on the results of Section 4.


Another choice we have made is the level of significance α = 1%, which is somewhat arbitrary, as with any statistical testing problem. We have taken a rather low α because we wish to be prudent about rejecting funds and want to keep the Type I error7 low. Another reason for taking a low α is the increase of the Type I error if the five tests are applied jointly. The Type I error of the five tests applied jointly is smaller than or equal to 5%. This is a consequence of the Bonferroni inequality:

P_{\mu,\sigma^2;n}(\text{at least one } T_i \text{ is rejected}) \le \sum_{i=1}^{5} P_{\mu,\sigma^2;n}(T_i \text{ rejected}) \le 5\alpha.   (5)

3.2 Definition of the quality score

The data quality score of a fund is just the number of rejected tests. Using (3), the score can be formally written as

s = \sum_{i=1}^{5} \mathbf{1}_{\{p_i < \alpha\}}.   (6)

Note that high values of s correspond to a low data quality, and vice versa. For a group F of funds, the score is the average fund score:

S = \frac{1}{|F|} \sum_{j \in F} s_j.   (7)

Note that S is the average number of rejected tests (per fund) and lies between zero and five. If hypothesis (1) holds true for each fund, we have that

E(S) = \frac{1}{|F|} \sum_{j \in F} E(s_j) \le 5\alpha,   (8)

since P_{\mu,\sigma^2;n}(p_i < \alpha) \le \alpha; note that we do not have equality because the Z_i are discrete. Our rationale for testing a group of funds is to compare its score (7) with the upper bound (8) on the expected number of rejected tests. If the score exceeds this bound by far, we infer that there are issues with some of the underlying return time series. In the next section we provide an illustration with stocks and funds of hedge funds.
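The score definitions (6) and (7) amount to two short helper functions; a minimal sketch, assuming the per-test p-values have already been computed as above.

```python
def fund_score(pvalues, alpha=0.01):
    """Data quality score (6): number of rejected tests for one fund."""
    return sum(p < alpha for p in pvalues)

def group_score(list_of_pvalues, alpha=0.01):
    """Group score (7): average fund score over a set F of funds."""
    return sum(fund_score(p, alpha) for p in list_of_pvalues) / len(list_of_pvalues)
```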

3.3 A reality check

As a first application of the data quality score, we would like to demonstrate the fundamentally different quality of equity and Barclay funds of hedge funds returns. We chose funds of hedge funds because this is one of the best categories in the Barclay database with respect to data quality.

7That is, the likelihood of rejecting a “good” return time series by chance.


Table 2

Data quality score of monthly return data

            Stocks    Funds of Hedge Funds

Score 0.04 0.23

P(s=0) 96.77% 85.60%

P(s=1) 2.84% 8.65%

P(s=2) 0.39% 3.61%

P(s=3) 0.00% 1.47%

P(s=4) 0.00% 0.63%

P(s=5) 0.00% 0.04%

The equity data was obtained from the Compustat database. We took the monthly returns of the

members of the RiskMetrics US stock universe, which is used to produce equity factors for the

RiskMetrics Equity Factor Model.8 This universe contains the ordinary shares of the largest US

companies and consists of roughly 2000 stocks. The time window was chosen such that the stock

return time series length matches the average length of the funds of hedge funds return time series.

The results in Table 2 speak for themselves and support our hypothesis that there are issues with

hedge fund data. Note that for the stocks the Bonferroni inequality (5) is “on average” respected, since

P(at least one test is rejected) = P(s > 0) = 1 − P(s = 0) = 3.23%.   (9)

Here P(s = 0) is the proportion of return time series with a quality score of zero, that is, with no

rejected tests. The quality score for the stocks also obeys the inequality (8). All this indicates that, as

expected, the equity data does not exhibit any of the patterns we are concerned with.

These positive findings are contrasted by the funds of hedge fund return data. Here many more funds

than predicted by the model (1) display patterns; inequality (8) is clearly breached. This leads us to

conclude that the data accuracy for the funds of hedge funds is not always given.

8See (Straumann and Garidi 2007).


4 Analyzing a hedge fund database

This section presents an empirical study, from which many conclusions about data quality can be

drawn. The study also demonstrates the power of the previously defined data quality score and the

mechanics for using it. We explore the Barclay database with a particular view towards its quality.

4.1 The Barclay database

In this section, we describe the Barclay database and discuss the filtering rules which we have applied.

We look at the October 2007 snapshot of the complete Barclay database, which contains CTAs, fund

of funds, and hedge funds. We also consider inactive funds. A fund is called inactive if it has stopped

reporting to Barclay; note that this does not necessarily mean that it does not exist anymore. All in all,

the database contains 11,701 funds. In order that all funds contain consistent information, we apply

certain filters. These six filters, discussed next, reduce the Barclay database to a universe of 8574 funds.

First, we remove the 591 funds with no strategy classification. We only admit funds that report returns

net of all fees. This leads to a further exclusion of 204 funds. The next filter deals with duplicates. For

many funds there exist multiple classes, sometimes denominated in different currencies. Also there

are funds coexisting in an onshore and offshore version. We want to restrict ourselves to one class or

version only. Similarly to (Christory, Daul, and Giraud 2006), we devised an algorithm to find fund

duplicates. This algorithm is based on a comparison of Barclay’s manager identifiers, strategy

classifications, the roots of the fund names, and the investment objectives. The latter are stored in a

text field and consist of longer written summaries of the type shown in the introduction of this paper. All in all, we remove 1184 duplicates. The next filter removes the 702 funds which have not reported

more than one year of monthly returns. Since we consider the assets under management (AUM) as a

very important piece of information, we next remove all 375 funds where the AUM time series is

missing.9 We mention that we have removed the leading and trailing zeros in all return time series,

since we interpret the latter as missing data. The occurrence of runs of zeros of length three or more

strictly inside the return time series is also interpreted as missing data. We omit the corresponding

71 funds. The remaining funds have no gaps in their return series. We also mention that all funds of

the Barclay database have a monthly return frequency.

The Barclay strategy classification consists of 79 items. We mapped these categories to the following

broad EDHEC categories: CTA, Fund of Hedge Funds, Long/Short Equity, Relative Value, Event

9 We admit, however, gaps of missing data in the AUM time series.


Table 3

Summary of Barclay hedge fund strategies

Active Inactive Total

CTA 769 1633 2402

Funds of Funds 1800 582 2382

Hedge Funds 2325 1465 3790

Long/Short Equity 1183 737 1920

Relative Value 459 404 863

Event Driven 260 167 427

Emerging Markets 321 89 410

Global Macro 102 68 170

Total 4894 3680 8574

Driven, Emerging Markets and Global Macro.10

Table 3 summarizes the number of funds in each strategy broken down by status. Barclay is known

for providing a large coverage on CTAs, and this can also be seen from the numbers. In the CTA

category there is a high proportion of inactive funds. This seems to be a database legacy artefact.

Among the funds active during the 1980s and 1990s, the CTA category clearly dominates. We

conclude that Barclay mainly focused on CTAs during these times. Moreover, the CTAs exhibit a

higher attrition rate than the other categories.11 As we will see below, the CTA class contains many

“micro-funds” with less than one million dollars in AUM. It is not surprising that small funds have a

higher likelihood of disappearing than large funds. This point has also been addressed and confirmed

by (Grecu, Malkiel, and Saha 2007). We mention that many of these tiny funds are, legally speaking, not CTAs because their managers do not hold an SEC licence; since they invest similarly to CTAs,

Barclay nevertheless categorizes them as CTAs.12

10 We refer to www.edhec-risk.com/indexes for a concise description of these strategies.
11 The average annual attrition rates in the period 1990–2006 are: 12.3% for CTAs, 3.9% for funds of funds and 6.2% for hedge funds.
12 From personal communication with the Barclay Hedge client services.


4.2 Overview of the data quality

After applying the filters, we determine the score for every fund. Tables 4 and 5 give an overview of

scores and rejection probabilities. In overall data quality, global macro and funds of funds do best and

CTAs worst.

The favorable data quality of funds of funds is not unexpected. Fund of funds managers are not

directly involved in trading activities and their role is to some extent also administrative. It is in their

interest to have precise knowledge of the NAVs of funds they are invested in. All these factors

increase the likelihood that their reporting to the database vendors is accurate. Our result is similar to

that of (Liang 2003). In his comparison of identical funds in TASS and the US Offshore Fund Directory, he found that among the fund of funds there were no return discrepancies at all.

The satisfactory quality of global macro fund data is positive. A possible explanation is that global

macro funds are active in liquid markets: currency and interest rate markets. For this reason, the

valuation of the assets is relatively straightforward for global macro funds, and this in turn should

induce a good data quality.

In contrast, we have not found a convincing explanation for the relatively bad data quality of

long-short funds. Since long-short funds trade in the equity markets, which are rather liquid, we

would have expected a better result for this category. It surprises us that the relative value and

emerging markets hedge funds, which are active on less liquid markets, outperform the long-short strategy in terms of data quality. We did not gain any insight into the high score of the long-short

funds either by using the more granular strategy categorization by Barclay. In the next section, we

have a closer look at possible factors which affect data quality. But even accounting for these factors,

we have not uncovered the reasons for the poor score of long-short equity funds.

The unfavorable data quality of CTAs is caused by the many tiny funds of this group; see

Section 4.3.2 below.

For all strategies, either test T2 on the proportion of unique values or test T5 on the distribution of the second digit after the decimal is rejected most often. We have verified that in one third of the cases in which T2 or T5 is rejected, rounding appears to be the main cause for rejection. The next most frequent problem concerns the occurrence of zeros (test T1). Less frequent are recurring blocks (test T4). The occurrence of runs (test T3) is least common in the return data.


Table 4

Data quality scores for Barclay funds

Score P(s=0) P(s=1) P(s=2) P(s=3) P(s=4) P(s=5)

CTA 0.45 75.19% 11.70% 7.79% 3.66% 1.62% 0.04%

Funds of Funds 0.23 85.60% 8.65% 3.61% 1.47% 0.63% 0.04%

Hedge Funds 0.35 78.92% 11.08% 6.62% 2.37% 0.98% 0.03%

Long/Short Equity 0.41 76.46% 11.46% 8.33% 2.55% 1.20% 0.00%

Relative Value 0.29 81.81% 11.01% 4.63% 1.62% 0.81% 0.12%

Event Driven 0.41 76.81% 11.71% 6.09% 4.22% 1.17% 0.00%

Emerging Markets 0.29 81.71% 10.73% 5.37% 1.71% 0.49% 0.00%

Global Macro 0.14 90.59% 6.47% 1.76% 1.18% 0.00% 0.00%

Table 5

Proportion of funds failing individual quality tests

P(T1 rej.) P(T2 rej.) P(T3 rej.) P(T4 rej.) P(T5 rej.)

CTA 6.45% 15.15% 2.33% 2.87% 18.15%

Funds of Funds 2.85% 8.73% 0.71% 1.76% 8.94%

Hedge Funds 3.72% 12.64% 0.82% 2.64% 15.67%

Long/Short Equity 3.85% 13.65% 0.47% 3.07% 19.53%

Relative Value 3.36% 11.47% 1.74% 2.32% 10.08%

Event Driven 5.85% 16.39% 1.17% 3.28% 14.52%

Emerging Markets 2.44% 10.00% 0.00% 1.71% 14.39%

Global Macro 1.76% 4.12% 1.18% 0.00% 6.47%

Figure 1
Data quality score as a function of time series length
(Average quality score on the vertical axis versus return time series length on the horizontal axis, shown separately for CTAs, funds of funds, and hedge funds.)

4.3 Predictors of data quality

In the previous discussion, we did not make use of any covariates, which would possibly help to explain the score values. The goal of this section is to find the explanatory factors for data quality.

4.3.1 Time series length

Figure 1 displays the data quality versus the length of the return time series.13 The plot shows nicely that the longer the time series, the higher the data quality score. This relationship is close to linear. It is straightforward to give an explanation: the longer the return time series, the higher is the probability that errors have occurred during its recording.

13 The curves have been estimated through a binning approach. In each category, bins containing approximately 200 funds with similar return time series lengths are constructed. For each bin one determines the average time series length together with the score of all funds therein, and this gives one point in the xy-plane.
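A sketch of how such a binned curve could be produced for one strategy category. The bin size of 200 and the pandas-based grouping are illustrative; the footnote describes the author's binning only loosely.

```python
import numpy as np
import pandas as pd

def binned_score_curve(lengths, scores, bin_size=200):
    """Average quality score versus average time series length, using bins of
    roughly `bin_size` funds with similar lengths (as in footnote 13)."""
    df = pd.DataFrame({"length": lengths, "score": scores}).sort_values("length")
    df["bin"] = np.arange(len(df)) // bin_size
    return df.groupby("bin").agg(length=("length", "mean"), score=("score", "mean"))
```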


Table 6

Assets under management ($M) by time series length

The 33%- and 66%-tiles of AUM in each time series length category are reported.

Length (yrs.) (1,3] (3,6] > 6

Percentile 33% 66% 33% 66% 33% 66%

CTA 0.9 5.0 1.5 9.3 5.9 35.6

Funds of Funds 13.8 52.6 26.1 85.3 33.9 127.4

Hedge Funds 11.3 47.5 17.6 72.7 31.1 105.5

Long/Short Equity 9.3 44.1 15.5 66.1 27.2 80.3

Relative Value 10.3 48.0 18.3 75.7 34.8 129.5

Event Driven 12.8 44.1 24.2 105.7 46.5 131.0

Emerging Markets 22.0 66.0 27.5 73.4 32.5 115.4

Global Macro 9.8 34.9 21.5 70.7 36.1 131.3

4.3.2 Assets under management (AUM)

In this section, we consider fund size, defined as the time average of the AUM series. In Table 6, we

consider the fund size in relation to time series length. From comparing the percentiles across the

three categories of time series lengths, we see that the longer the time series, the higher in general the

AUM. The CTA category contains the funds with the lowest AUM. As we alluded to earlier, it is striking how many tiny CTA funds exist. We utilize the AUM percentiles from Table 6 to divide the funds into

small, medium and large size categories.

We present the quality scores in Table 7. First we remark that bucketing by time series length is necessary in order to remove the strong effect of this factor on the data quality score. We would expect funds with low AUM to exhibit a poorer data quality than large funds since they most likely have fewer resources for accurate valuation and reporting.14 For CTAs, the quality improves (scores decrease) with greater size in each of the time series length categories; our conjecture holds true. For funds of funds or hedge funds, there is no such clear relationship. We have investigated whether auditing could play a role in this result by further subdividing the groups into audited and non-audited funds, but this has revealed that auditing (or its absence) is not the cause.

14 We mention that (Liang 2003) established such a relationship indirectly, by giving the argument that fund size and auditing are strongly related and by establishing a positive effect of auditing on the size of the return discrepancies.


Table 7

Data quality score by time series length and assets under management

Number of funds in each category in parentheses

Length (yrs.) (1,3] (3,6] > 6
Fund Size Small Medium Large Small Medium Large Small Medium Large

CTA 0.30 (292) 0.25 (292) 0.11 (292) 0.78 (251) 0.41 (250) 0.22 (251) 0.90 (258) 0.73 (258) 0.44 (258)

Funds of Funds 0.16 (221) 0.05 (221) 0.14 (221) 0.21 (320) 0.18 (319) 0.21 (320) 0.44 (253) 0.25 (254) 0.42 (253)

Hedge Funds 0.15 (397) 0.23 (397) 0.23 (397) 0.30 (434) 0.28 (435) 0.34 (434) 0.44 (432) 0.57 (432) 0.61 (432)

Long/Short Equity 0.16 (193) 0.24 (194) 0.30 (193) 0.32 (224) 0.38 (224) 0.47 (224) 0.41 (223) 0.57 (222) 0.74 (223)

Relative Value 0.15 (95) 0.21 (94) 0.17 (95) 0.24 (107) 0.12 (106) 0.16 (107) 0.53 (86) 0.54 (87) 0.59 (86)

Event Driven 0.24 (37) 0.31 (36) 0.19 (37) 0.27 (48) 0.28 (47) 0.29 (48) 0.83 (58) 0.62 (58) 0.43 (58)

Emerging Markets 0.04 (52) 0.23 (52) 0.21 (52) 0.39 (38) 0.24 (37) 0.34 (38) 0.36 (47) 0.36 (47) 0.45 (47)

Global Macro 0.05 (20) 0.05 (21) 0.15 (20) 0.00 (18) 0.00 (19) 0.28 (18) 0.39 (18) 0.06 (18) 0.28 (18)

4.3.3 Auditing

In this section, we look at the effect of auditing on data quality. The results are depicted in Table 8. For the strategies CTA, fund of funds, hedge fund and long-short equity, audited funds clearly outperform the non-audited funds in terms of data quality. For the CTAs, auditing seems to be a particularly effective way of decreasing the data quality score. Note, however, that only a small minority of the CTAs are audited; we guess that this is to some extent related to the size of the CTAs, which is generally small. For relative value, event driven, emerging markets and global macro funds, the effects of auditing are mixed: audited groups do not always have a lower score than the corresponding non-audited groups.

4.3.4 Fund status

In this section, we explore differences between active and inactive funds. We expect inactive funds to have lower data quality than active funds. Our argument is as follows. Inactive funds are funds that have stopped reporting but still exist, or funds that were liquidated.15 In the first case, the fund seems no longer interested in reporting; it would not be surprising if this was reflected by a high data quality score, at least towards the end of the time series.

15 We mention that the main reason for liquidation is poor performance at the end of the fund life, as shown by (Grecu, Malkiel, and Saha 2007).


Table 8

Data quality score by audited/non-audited and time series length

Length categories as in Table 6. Number of funds in each category in parentheses

Audited Non-Audited

Length (yrs.) (1,3] (3,6] > 6 (1,3] (3,6] >6

CTA 0.08 (60) 0.25 (20) 0.40 (20) 0.23 (816) 0.48 (732) 0.70 (754)

Funds of Funds 0.10 (416) 0.15 (781) 0.34 (664) 0.15 (247) 0.39 (178) 0.59 (96)

Hedge Funds 0.19 (758) 0.30 (1021) 0.53 (1076) 0.23 (433) 0.32 (282) 0.60 (220)

Long/Short Equity 0.20 (364) 0.37 (530) 0.56 (562) 0.28 (216) 0.46 (142) 0.62 (106)

Relative Value 0.18 (168) 0.18 (239) 0.57 (206) 0.16 (116) 0.15 (81) 0.49 (53)

Event Driven 0.27 (74) 0.26 (118) 0.58 (145) 0.19 (36) 0.36 (25) 0.86 (29)

Emerging Markets 0.16 (113) 0.35 (91) 0.36 (119) 0.16 (43) 0.23 (22) 0.55 (22)

Global Macro 0.03 (39) 0.12 (43) 0.20 (44) 0.18 (22) 0.00 (12) 0.40 (10)

Table 9

Data quality score by status and time series length

Length categories as in Table 6. Number of funds in each category in parentheses

Active Inactive

Length (yrs.) (1,3] (3,6] > 6 (1,3] (3,6] >6

CTA 0.11 (257) 0.19 (218) 0.46 (294) 0.26 (619) 0.58 (534) 0.83 (480)

Funds of Funds 0.12 (444) 0.20 (740) 0.37 (616) 0.11 (219) 0.19 (219) 0.38 (144)

Hedge Funds 0.19 (668) 0.32 (783) 0.59 (874) 0.22 (523) 0.29 (520) 0.44 (422)

Long/Short Equity 0.22 (325) 0.44 (410) 0.62 (448) 0.25 (255) 0.32 (262) 0.48 (220)

Relative Value 0.17 (122) 0.13 (176) 0.66 (161) 0.18 (162) 0.23 (144) 0.39 (98)

Event Driven 0.24 (59) 0.28 (76) 0.66 (125) 0.25 (51) 0.28 (67) 0.55 (49)

Emerging Markets 0.15 (129) 0.29 (90) 0.43 (102) 0.22 (27) 0.48 (23) 0.28 (39)

Global Macro 0.06 (33) 0.06 (31) 0.26 (38) 0.11 (28) 0.13 (24) 0.19 (16)


In the second case, that of a liquidated fund, the argument is similar: it is likely that a fund no longer concentrates on accurate reporting shortly before it liquidates. Table 9 convinces us that this conjecture holds for CTAs only. For funds of funds, hedge funds and all subcategories, there is no striking relationship between data quality score and fund status.

4.3.5 Concluding remarks concerning predictors of data quality

Summarizing, we have found that the time series length and whether a fund is audited or not are the most important predictors of the data quality score. For the other tested predictors, there are no conclusive results that hold across all fund categories. AUM and fund status have some predictive power for the class of CTA funds: small values of AUM have a clearly negative impact on the data quality of CTAs, and inactive CTAs have generally lower data quality than active CTAs.

We have presented an exploratory analysis of the predictive power of certain factors for data quality. Additional factors could have been tested. An alternative approach would have been to fit a generalized linear model to the data quality scores. The advantage of a model-based analysis would be the straightforward and mechanical assessment of the significance of factors, basically by looking at p-values of estimated model parameters; the disadvantage is that we would have to trust a black box. Since the data quality score is a new concept and since we wanted to gain some intuition about it, we have preferred an exploratory data analysis, which consists of looking at tables and plots.
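For readers who wish to pursue the model-based alternative mentioned above, the following sketch illustrates the idea. It assumes a hypothetical data frame `funds` with one row per fund, a `score` column and candidate covariates (`length`, `log_aum`, `audited` are illustrative names, not part of the paper); it is not the analysis performed here.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per fund with its quality score and covariates.
funds = pd.DataFrame({
    "score":   [0.0, 0.2, 0.7, 0.1, 0.9, 0.3],
    "length":  [24, 60, 140, 36, 180, 72],     # months of reported returns
    "log_aum": [2.1, 3.5, 1.2, 4.0, 0.8, 2.9], # log of average AUM
    "audited": [1, 1, 0, 1, 0, 1],
})

# Linear model of the quality score on the candidate predictors;
# the p-values of the coefficients indicate factor significance.
model = smf.ols("score ~ length + log_aum + audited", data=funds).fit()
print(model.summary())
```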

4.4 Is there an improvement of data quality over time?

This section is concerned with the evolution of data quality through time. To this end, we look at two equally long time periods: 1997-2001 and 2002-2006. For each time period, we consider those funds that reported returns during the full period, and we calculate the data quality score for the time series restricted to the corresponding period. Note that this simplifies the analysis, since all funds then have equally long time series consisting of 60 monthly returns. The division into small, medium and large funds is as discussed in the previous sections.

The results of Table 10 indicate a clear improvement of quality for CTAs: for all fund size groups, the score in the second period is lower than in the first period. For all other strategies, the relationship is mixed. For funds of funds and hedge funds, the data quality stays more or less at the same level. Most striking is the considerable decrease of quality for large long-short funds, for which we currently have no explanation.


Table 10

Evolution of data quality score through time

Fund Size Small Medium Large

Time Period 1997-2001 2002-2006 1997-2001 2002-2006 1997-2001 2002-2006

CTA 0.43 (294) 0.33 (375) 0.24 (295) 0.17 (376) 0.28 (294) 0.11 (375)

Funds of Funds 0.21 (191) 0.20 (716) 0.09 (191) 0.14 (717) 0.14 (191) 0.20 (716)

Hedge Funds 0.23 (462) 0.22 (1080) 0.26 (462) 0.27 (1080) 0.30 (462) 0.34 (1080)

Long/Short Equity 0.25 (240) 0.22 (539) 0.26 (239) 0.32 (539) 0.35 (240) 0.44 (539)

Relative Value 0.16 (95) 0.22 (255) 0.27 (94) 0.17 (255) 0.28 (95) 0.19 (255)

Event Driven 0.43 (63) 0.25 (122) 0.37 (62) 0.26 (121) 0.13 (63) 0.34 (122)

Emerging Markets 0.17 (48) 0.27 (116) 0.19 (47) 0.23 (115) 0.15 (48) 0.37 (116)

Global Macro 0.12 (17) 0.10 (49) 0.44 (18) 0.21 (48) 0.00 (17) 0.14 (49)

Summarizing the results, it is fair to say that the data quality in the Barclay database has improved. Possible reasons for this improvement could be the general increase of transparency in the hedge fund world during the past decade and the advances of information technology, which have generally facilitated the collection and management of large amounts of data.

4.5 The data quality bias

We finally estimate the data quality bias, as announced earlier in this paper. The data quality bias

measures the impact of imperfect data on performance. We adapt the common definition of data

biases to the case of data quality. The data quality bias is defined as the average annual performance

difference of the entire universe of funds and the group of funds with data quality score equal to zero.

Following the practice of the hedge fund literature, performances of funds are averaged using equal

weights. A positive data quality bias indicates that the inclusion of funds with imperfect return data

leads to an overstatement of the performance, and vice versa. The data quality bias for a certain

strategy is obtained by restricting the universe to this strategy. Also recall that the survivorship bias is

the average annual performance difference of the living funds and the entire universe of funds.
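To make the two definitions concrete, the sketch below computes both biases from hypothetical inputs: a vector of annualized fund performances `perf`, a boolean mask `zero_score` marking the funds with a data quality score of zero, and a boolean mask `alive` marking the living funds (all names are ours, not the paper's).

```python
import numpy as np

def data_biases(perf, zero_score, alive):
    """Equally weighted data quality bias and survivorship bias."""
    perf = np.asarray(perf, dtype=float)
    universe = perf.mean()                                        # entire universe of funds
    data_quality_bias = universe - perf[np.asarray(zero_score)].mean()
    survivorship_bias = perf[np.asarray(alive)].mean() - universe
    return data_quality_bias, survivorship_bias
```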

The results are presented in Table 11. We mention that we have taken into account the non-USD currencies when calculating the performances. The numbers indicate that data quality can be a non-negligible source of performance misrepresentation.


Table 11

Data quality bias and survivorship bias 1997-2006 (annualized)

Data Quality Bias (%) Survivorship Bias (%)

CTA 0.16 1.75

Funds of Funds -0.14 0.07

Hedge Funds 0.48 0.67

Long/Short Equity 0.64 0.60

Relative Value -0.14 0.73

Event Driven 1.50 -0.17

Emerging Markets -0.01 0.84

Global Macro 0.24 1.93

Both the data quality and the survivorship bias are almost negligible for the funds of funds. For CTAs,

the data quality bias is small compared to the huge survivorship bias; note that the large survivorship

bias for CTAs is to a large extent due to their high attrition rate. Recall that the data quality of CTAs is

generally low; nevertheless the data quality bias is not outrageous. For hedge funds the data quality

bias is in the order of magnitude of the survivorship bias. Concluding this section, we would like to

stress the point that data biases are of course not additive.

4.6 Regularizing hedge fund return data

As a last piece of the analysis of the Barclay database, we would like to study the use of the quality score for "cleaning" hedge fund data. We appeal to the previously given overview of the hedge fund return anomalies literature, where we cited the work of (Bollen and Pool 2007). These authors found a significant discontinuity at zero in the pooled distribution of hedge fund returns reported to the CISDM database. First we would like to verify whether a similar observation can be made using the Barclay return data. We apply a kernel density estimator to percentage returns, using a Gaussian kernel with a bandwidth of 0.025. The results in Figure 2 are quite illustrative. For the left-hand plot, the estimator is applied to all 8574 funds in the Barclay database, whereas for the right-hand plot the 1738 funds with data quality issues are removed.
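A minimal sketch of the kernel density step, assuming `returns_all` and `returns_clean` are arrays of pooled percentage returns (hypothetical names); it uses a Gaussian kernel with the fixed bandwidth 0.025 and a common evaluation grid, mirroring the setup described above.

```python
import numpy as np

def gaussian_kde_fixed_bw(returns, grid, bandwidth=0.025):
    """Gaussian kernel density estimate with a fixed bandwidth."""
    returns = np.asarray(returns, dtype=float)
    u = (grid[:, None] - returns[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return kernel.mean(axis=1) / bandwidth

grid = np.linspace(-2, 2, 401)                 # identical grid points for both plots
# density_all   = gaussian_kde_fixed_bw(returns_all, grid)
# density_clean = gaussian_kde_fixed_bw(returns_clean, grid)
```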

Note that the kernel density estimate for the raw Barclay return data in the left-hand plot is very wiggly. This wiggling is induced by the funds with heavily rounded returns.


Figure 2
Kernel density estimates for the pooled distribution of Barclay fund returns
All funds (left, "Barclay raw") and funds with quality score of zero (right, "Barclay cleaned")
[Density versus return (%), plotted on identical axes for both panels.]

The next observation is the pronounced jump of the density at zero, which is in line with (Bollen and Pool 2007). Before we move to the right-hand plot, we stress that for both plots the kernel density estimates are based on the same bandwidth and evaluated at identical grid points. In the right-hand plot, the wiggling almost disappears. We are not surprised by this, because the tests T1-T5 reject time series with heavily rounded returns. Also note that the density curve is still very steep at zero; however, the jump appears to be slightly less pronounced. This example shows that removing funds with a nonzero data quality score can to some extent regularize hedge fund return data.

5 Conclusions

In this paper, we have provided a comprehensive discussion of quality issues in hedge fund return data. Hedge fund data quality is a topic which is often avoided, maybe because it is perceived as not particularly fruitful or just boring. The main goal of this paper was to increase awareness of irregularities and patterns in hedge fund return data, to suggest methods for finding them and to quantify their severity.


Using a simple, natural and mathematically sound rationale, we introduced tests and devised a novel

scoring method for quantifying the quality of return time series data. By means of an empirical study

using the Barclay database, we then demonstrated how such a score can be used for exploring

databases. Our findings conformed to a large extent with results from other articles. This can be seen

as a partial validation of the score approach.

While the score approach is appealing and can be applied in an almost mechanical fashion, it seems to us that uncritically computing data quality scores could prove harmful. Most hedge fund databases have grown organically, and every analysis must respect that there are legacy issues. It would be very wrong to ascribe excessive importance to numerical score values without looking at the underlying causes.

References

Agarwal, V., N. Daniel, and N. Naik (2006). Why is Santa Claus so kind to hedge funds? The December return puzzle! Working paper, Georgia State University.

Bollen, N. and V. Pool (2006). Conditional return smoothing in the hedge funds industry. Forthcoming, Journal of Financial and Quantitative Analysis.

Bollen, N. and V. Pool (2007). Do hedge fund managers misreport returns? Working paper, Vanderbilt University.

Brown, S., W. Goetzmann, and J. Park (2001). Careers and survival: competition and risk in the hedge fund and CTA industry. Journal of Finance 56(5), 1869-1886.

Christory, C., S. Daul, and J.-R. Giraud (2006). Quantification of hedge fund default risk. Journal of Alternative Investments 9(2), 71-86.

Edhec-Risk Asset Management Research (2006). EDHEC investable hedge fund indices. Available at http://www.edhec-risk.com.

Eling, M. (2007). Does hedge fund performance persist? Overview and new empirical evidence. Working paper, University of St. Gallen.

Getmansky, M., A. Lo, and I. Makarov (2004). An econometric model of serial correlation and illiquidity in hedge fund returns. Journal of Financial Economics 74, 529-609.

Grecu, A., B. Malkiel, and A. Saha (2007). Why do hedge funds stop reporting their performance? Journal of Portfolio Management 34(1), 119-126.

Hasanhodzic, J. and A. Lo (2007). Can hedge-fund returns be replicated? The linear case. Journal of Investment Management 5(2), 5-45.

Karr, A., A. Sanil, and D. Banks (2006). Data quality: a statistical perspective. Statistical Methodology 3, 137-173.

Lhabitant, F.-S. (2004). Hedge Funds: Quantitative Insights. Chichester: John Wiley & Sons.

Liang, B. (2000). Hedge funds: the living and the dead. Journal of Financial and Quantitative Analysis 35, 309-326.

Liang, B. (2003). Hedge fund returns: auditing and accuracy. Journal of Portfolio Management 29(Spring), 111-122.

Straumann, D. and T. Garidi (2007). Developing an equity factor model for risk. RiskMetrics Journal 7(1), 89-128.


Capturing Risks of Non-transparent Hedge Funds

Stéphane Daul∗

RiskMetrics Group
[email protected]

We present a model that captures the risks of hedge funds using only their historical performance as input. This statistical model is a multivariate distribution where the marginals derive from an AR(1)/AGARCH(1,1) process with t5 innovations, and the dependency is a grouped-t copula. The process captures all relevant static and dynamic characteristics of hedge fund returns, while the copula enables us to go beyond linear correlation and capture strategy-specific tail dependency. We show how to estimate parameters and then successfully backtest our model and some peer models using 600+ hedge funds.

1 Introduction

Investors taking stakes in hedge funds usually do not get full transparency of the funds’ exposures.

Hence in order to perform their monitoring function, investors would benefit from models based only

on hedge fund past performances.

The first type of models consists of linear factor decompositions. These are potentially very powerful,

but no clear results have emerged and intensive research is ongoing. We present here a less ambitious

but successful second approach based on statistical processes. We are able to accurately forecast the

risk taken by one or more hedge funds only using their past track record.

In this article we first describe the static and dynamic characteristics of hedge fund returns that we

wish to capture. Then we introduce an extension of the usual GARCH process and detail its

parametrization. This model encompasses other standard processes, enabling us to backtest all of the

models consistently. Finally we study the dependence of hedge funds, going beyond linear correlation

to introduce tail dependency.

∗The author would like to thank G. Zumbach for helpful discussion.


2 Characterizing hedge fund returns

We start by presenting descriptive statistics of hedge fund returns. To that end, we use the information from the HFR database. This database consists of (primarily monthly) historical returns for hedge funds. We assume that what is reported for each hedge fund is the monthly return at time $t$ (measured in months), defined as

$$r_t = \frac{\mathrm{NAV}_t - \mathrm{NAV}_{t-1}}{\mathrm{NAV}_{t-1}}, \qquad (1)$$

where $\mathrm{NAV}_t$ is the net asset value of the hedge fund at time $t$. This return is considered net of all hedge fund fees. We consider only the 680 hedge funds with more than 10 years of data (i.e. at least 120 observations). This will enable us to perform extensive out-of-sample backtesting afterwards.

We first analyze the shape of the distribution of the monthly returns. The classical tests for normality are the Jarque-Bera and Lilliefors tests. At a 95% confidence level, both tests reject the normality hypothesis for a vast majority of hedge funds: out of the 680 hedge funds, the Jarque-Bera test rejects 598 and the Lilliefors test rejects 498.
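Both tests are available in standard Python libraries; a sketch of the per-fund rejection count, assuming `fund_returns` is a hypothetical list of monthly return arrays, could read as follows.

```python
import numpy as np
from scipy.stats import jarque_bera
from statsmodels.stats.diagnostic import lilliefors

def count_normality_rejections(fund_returns, alpha=0.05):
    """Count funds for which each normality test rejects at level alpha."""
    jb_rej = lf_rej = 0
    for r in fund_returns:
        r = np.asarray(r, dtype=float)
        if jarque_bera(r).pvalue < alpha:
            jb_rej += 1
        if lilliefors(r, dist="norm")[1] < alpha:   # returns (statistic, p-value)
            lf_rej += 1
    return jb_rej, lf_rej
```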

A common assertion about hedge fund returns is that they are skewed. Looking at the sample skewness, this is certainly what we would conclude. However, this quantity is sensitive to outliers and is not a robust statistic. We opt instead for testing symmetry using the Wilcoxon signed rank sum,

$$W = \sum_{i=1}^{N} \phi_i R_i, \qquad (2)$$

where $R_i$ is the rank of the absolute values, $\phi_i$ is the sign of sample $i$ and $N$ is the number of samples.

This test rejects only 26 of the 680 hedge funds at a significance level of 95%. We conclude that the bulk of the hedge funds do not display asymmetric returns, but that tail events and small sample sizes produce high sample skewness.
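A sketch of this symmetry check, under the assumption that symmetry is tested around the fund's mean return (the centering is not stated explicitly above); scipy's signed-rank test implements the statistic of Equation (2).

```python
import numpy as np
from scipy.stats import wilcoxon

def rejects_symmetry(returns, alpha=0.05):
    """Wilcoxon signed-rank test of symmetry of the returns around their mean."""
    r = np.asarray(returns, dtype=float)
    stat, pvalue = wilcoxon(r - r.mean())
    return pvalue < alpha
```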

After describing the static behavior of hedge fund returns, we analyze their dynamics by calculating various one-month lagged correlation coefficients. We consider the following correlation coefficients:

• Return-lagged return $\rho(r_t, r_{t-1})$,

• Volatility-lagged volatility $\rho(\sigma_t, \sigma_{t-1})$ and

• Volatility-lagged return $\rho(\sigma_t, r_{t-1})$.

If the time series have no dynamics (such as white noise), then the autocorrelation coefficients should follow a normal distribution with variance $1/N$. Hence a coefficient is significant at 95% if it falls outside $[-2/\sqrt{N}, 2/\sqrt{N}]$.


Figure 1

Distributions of one-month lagged correlation coefficients across hedge funds

Only significantly non-zero values reported.

[Histograms of the significant coefficients, one panel each for Return - Lagged Return, Volat - Lagged Volat and Volat - Lagged Return.]

Using the 680 hedge funds, we found 336 significant coefficients for the return-lagged return correlation, 348 for the volatility-lagged volatility correlation and 192 for the volatility-lagged return correlation.
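For illustration, a sketch of the three lagged coefficients and the significance band for a single fund; the absolute demeaned return is used here as a simple monthly volatility proxy, which is our assumption, since the measurement of $\sigma_t$ is not specified above.

```python
import numpy as np

def lagged_correlations(r):
    """Return-lagged return, vol-lagged vol and vol-lagged return correlations."""
    r = np.asarray(r, dtype=float)
    vol = np.abs(r - r.mean())          # simple monthly volatility proxy (assumption)
    corr = lambda x, y: np.corrcoef(x, y)[0, 1]
    rho_rr = corr(r[1:], r[:-1])
    rho_vv = corr(vol[1:], vol[:-1])
    rho_vr = corr(vol[1:], r[:-1])
    band = 2.0 / np.sqrt(len(r))        # 95% significance threshold under no dynamics
    return rho_rr, rho_vv, rho_vr, band
```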

We summarize the significant coefficients in histograms in Figure 1. The first coefficient is the correlation between the return and the lagged return. These coefficients are essentially positive, meaning that some of one month's return is transferred to the next month. This could have different origins: valuation issues, trading strategies or even smoothing. The second coefficient is the correlation between the volatility and the lagged volatility. These are essentially positive and imply heteroscedasticity, or non-constant volatility. Finally, the third coefficient is the correlation between the volatility and the lagged return. These coefficients are both positive and negative, suggesting that hedge fund managers adapt their strategies (increasing or decreasing the risk taken) to upwards or downwards markets.

We also examined the dependency between the three coefficients, but did not observe any structure.

We conclude that the three dynamic characteristics are distinct, and must be captured separately by

our model. To summarize, we have found that hedge fund returns are non-normal but not necessarily

skewed, and that they have at least three distinct dynamic properties.


3 The univariate model

3.1 The process

To describe the univariate process of hedge fund returns, we start with a GARCH(1,1) model to capture heteroscedasticity. We then extend the model by introducing autocorrelation in the returns and an asymmetric volatility response. The innovations are also generalized by using an asymmetric t-distribution.

The process is

$$r_{t+1} = \bar{r} + \alpha\,(r_t - \bar{r}) + \sigma_t\,\varepsilon_t, \qquad (3)$$

$$\sigma_t^2 = (\omega_\infty - \alpha^2)\,\sigma_\infty^2 + (1 - \omega_\infty)\,\tilde{\sigma}_t^2, \qquad (4)$$

$$\tilde{\sigma}_t^2 = \mu\,\tilde{\sigma}_{t-1}^2 + (1 - \mu)\left[1 - \lambda\,\mathrm{sign}(r_t)\right](r_t - \bar{r})^2. \qquad (5)$$

The parameters of the model are thus

$$\bar{r},\ \alpha,\ \omega_\infty,\ \sigma_\infty,\ \mu,\ \lambda, \qquad (6)$$

and the distribution shape of the innovations $\varepsilon_t$.

When $\bar{r}, \alpha, \lambda = 0$, the model reduces to the standard GARCH(1,1) model written in a different way (Zumbach 2004). In this form, the GARCH(1,1) process appears with the elementary forecast $\sigma_t$ given by a convex combination of the long-term volatility $\sigma_\infty$ and the historical volatility $\tilde{\sigma}_t$. The historical volatility $\tilde{\sigma}_t$ is measured by an exponential moving average (EMA) at the time horizon $\tau = -1/\log(\mu)$. The parameter $\omega_\infty$ ranges from 0 to 1 and can be interpreted as the volatility of the volatility.

The parameter $\alpha \in [-1,1]$ induces autocorrelation in the returns. The parameter $\lambda \in [-1,1]$ yields positive (negative) correlation between the lagged return and the volatility if $\lambda$ is negative (positive).
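To make the dynamics of Equations (3)-(5) concrete, here is a small simulation sketch. The symmetric, unit-variance t5 innovations used below are a simplification (the paper's innovations follow Hansen's asymmetric t, introduced next), and all parameter values are up to the caller.

```python
import numpy as np

def simulate_ar1_agarch(n, r_bar, alpha, omega_inf, sigma_inf, mu, lam, seed=None):
    """Simulate returns from the AR(1)/AGARCH process of Equations (3)-(5)."""
    rng = np.random.default_rng(seed)
    nu = 5
    eps = rng.standard_t(nu, size=n) * np.sqrt((nu - 2) / nu)   # unit-variance innovations
    r = np.empty(n)
    r[0] = r_bar
    sigma_hist2 = sigma_inf**2                                  # EMA variance of Eq. (5)
    for t in range(n - 1):
        # Eq. (5): update the historical (EMA) variance with the latest return
        sigma_hist2 = mu * sigma_hist2 + (1 - mu) * (1 - lam * np.sign(r[t])) * (r[t] - r_bar)**2
        # Eq. (4): effective variance combining long-term and historical volatility
        sigma2 = (omega_inf - alpha**2) * sigma_inf**2 + (1 - omega_inf) * sigma_hist2
        # Eq. (3): next month's return
        r[t + 1] = r_bar + alpha * (r[t] - r_bar) + np.sqrt(max(sigma2, 0.0)) * eps[t]
    return r

# Example: 120 months with the universal parameters of Section 3.2 (other values illustrative)
# returns = simulate_ar1_agarch(120, r_bar=0.008, alpha=0.15, omega_inf=0.55,
#                               sigma_inf=0.03, mu=0.85, lam=-0.1, seed=1)
```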

The innovations $\varepsilon_t$ are i.i.d. random variables with $E[\varepsilon_t] = 0$ and $E[\varepsilon_t^2] = 1$. We choose an asymmetric generalization of the $t$-distribution introduced by Hansen (Hansen 1994). The parameters are $\lambda'$ and $\nu$, and the density is given by

$$g_{\lambda',\nu}(\varepsilon) =
\begin{cases}
bc\left(1 + \dfrac{1}{\nu-2}\left(\dfrac{b\varepsilon + a}{1-\lambda'}\right)^{2}\right)^{-(\nu+1)/2} & \varepsilon \le -a/b, \\[2ex]
bc\left(1 + \dfrac{1}{\nu-2}\left(\dfrac{b\varepsilon + a}{1+\lambda'}\right)^{2}\right)^{-(\nu+1)/2} & \varepsilon > -a/b,
\end{cases} \qquad (7)$$


with

$$a = 4\lambda' c\left(\frac{\nu-2}{\nu-1}\right), \qquad (8)$$

$$b^2 = 1 + 3\lambda'^2 - a^2, \qquad (9)$$

$$c = \frac{\Gamma\left((\nu+1)/2\right)}{\sqrt{\pi(\nu-2)}\,\Gamma(\nu/2)}. \qquad (10)$$

For $\lambda' = 0$ this distribution reduces to the usual $t$-distribution with $\nu$ degrees of freedom; for $\lambda' > 0$ it is right skewed, and for $\lambda' < 0$ it is left skewed.
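For reference, a direct implementation of the density (7)-(10) as a sketch following the formulas above, with $\lambda'$ written as `lam`:

```python
import numpy as np
from scipy.special import gammaln

def hansen_skewt_pdf(eps, lam, nu):
    """Density of Hansen's (1994) skewed t-distribution (zero mean, unit variance)."""
    c = np.exp(gammaln((nu + 1) / 2) - gammaln(nu / 2)) / np.sqrt(np.pi * (nu - 2))  # Eq. (10)
    a = 4 * lam * c * (nu - 2) / (nu - 1)                                            # Eq. (8)
    b = np.sqrt(1 + 3 * lam**2 - a**2)                                               # Eq. (9)
    eps = np.asarray(eps, dtype=float)
    denom = np.where(eps < -a / b, 1 - lam, 1 + lam)                                 # two branches of Eq. (7)
    z = (b * eps + a) / denom
    return b * c * (1 + z**2 / (nu - 2)) ** (-(nu + 1) / 2)
```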

3.2 Parametrization

The choice of our parametrization enables us to separate the different parameter estimations. First, we set

$$\alpha = \rho(r_t, r_{t-1}). \qquad (11)$$

For $\lambda = 0$, this implies

$$E[(r_t - \bar{r})^2] = \sigma_\infty^2, \qquad (12)$$

justifying the estimation of $\sigma_\infty$ by the sample standard deviation. The expected return $\bar{r}$ is set to the historical average return.

In order to reduce overfitting, we set some parameters to fixed values. We make the hypothesis that the volatility dynamics and the tails of the innovations are universal, implying fixed values for $\omega_\infty$, $\mu$ and $\nu$.

We obtain $\omega_\infty$ and $\mu$ by analyzing the correlation function for the pure GARCH case (that is, $\lambda = 0$ and $\alpha = 0$). We rewrite the process, introducing

$$\beta_0 = \sigma_\infty^2 (1-\mu)\,\omega_\infty, \qquad (13)$$

$$\beta_1 = (1-\omega_\infty)(1-\mu), \qquad (14)$$

$$\beta_2 = \mu. \qquad (15)$$

Assuming an average return $\bar{r} = 0$, the process becomes

$$r_t = \sigma_t\,\varepsilon_t, \qquad (16)$$

$$\sigma_t^2 = \beta_0 + \beta_1 r_t^2 + \beta_2 \sigma_{t-1}^2. \qquad (17)$$


Figure 2
Tail distribution of the innovations
[Logarithm of the cumulative distribution versus $\log(-\varepsilon)$, comparing the realized residuals with the $t_5$ and normal distributions.]

The autocorrelation function for $r_t^2$ decays geometrically1

$$\rho_k = \rho(r_t^2, r_{t-k}^2) = \rho_1\,(\beta_1 + \beta_2)^{k-1}, \qquad (18)$$

with

$$\rho_1 = \beta_1 + \frac{\beta_1^2 \beta_2}{1 - 2\beta_1\beta_2 - \beta_2^2}. \qquad (19)$$

We evaluate the sample autocorrelation function

$$\hat{\rho}_k = \hat{\rho}(r_t^2, r_{t-k}^2) \qquad (20)$$

across the 680 hedge funds. We then fit the cross-sectional average of $\hat{\rho}_k$ to the decay law (18). Finally, we transform the estimated parameters back to our parametrization, yielding $\omega_\infty = 0.55$ and $\mu = 0.85$.

To estimate the tail parameter $\nu$ of the innovations, we compute the realized innovations, setting $\lambda = 0$, $\omega_\infty = 0.55$ and $\mu = 0.85$. Since we hypothesize that the innovation distribution is universal, we aggregate all realized innovations and plot the tail distribution. We see in Figure 2 that $\nu = 5$ is the optimal choice.

Finally, the remaining parameters $\lambda$ and $\lambda'$ are obtained for each hedge fund using maximum likelihood estimation. Table 1 recapitulates all parameters and their estimation.

1See (Ding and Granger 1996).


Table 1

Parameter estimation

Parameter Effect captured Scope Value

r̄ expected return individual mean(r)

α autocorrelation individual ρ(r_t, r_{t−1})

ω∞ volatility of volatility universal 0.55

σ∞ long-term volatility individual std(r)

µ EMA decay factor universal 0.85

λ dynamic asymmetry individual MLE

ν innovation tails universal 5

λ′ innovation asymmetry individual MLE

4 Backtest

We follow the framework set in (Zumbach 2007). The process introduced in Section 3.1 yields a forecast at time $t$ for the next month's return,

$$\hat{r}_{t+1} = \bar{r} + \alpha\,(r_t - \bar{r}), \qquad (21)$$

and a forecast of the volatility $\sigma_t$. At time $t+1$, we know the realized return $r_{t+1}$ and can evaluate the realized residual

$$\hat{\varepsilon}_t = \frac{r_{t+1} - \hat{r}_{t+1}}{\sigma_t}. \qquad (22)$$

Next, we calculate the probtile

$$z_t = t_5(\hat{\varepsilon}_t), \qquad (23)$$

where $t_5(x)$ is the cumulative distribution function of the innovations. These probtiles should be uniformly distributed through time and across hedge funds. To quantify the quality of our model, we calculate the relative exceedance

$$\delta(z) = \mathrm{cdf}_{\mathrm{emp}}(z) - z, \qquad (24)$$

where $\mathrm{cdf}_{\mathrm{emp}}$ is the empirical distribution function of the probtiles, and introduce the distance

$$d = \int_0^1 dz\,|\delta(z)|. \qquad (25)$$
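A sketch of the probtile and distance calculation of Equations (22)-(25), assuming aligned arrays of realized returns, return forecasts and volatility forecasts (hypothetical inputs); the rescaling inside the t CDF reflects that the innovations are normalized to unit variance.

```python
import numpy as np
from scipy import stats

def backtest_distance(realized, mean_forecast, vol_forecast, nu=5, grid_size=201):
    """Probtiles of Eq. (23) and distance d of Eq. (25) for one set of forecasts."""
    eps = (np.asarray(realized) - np.asarray(mean_forecast)) / np.asarray(vol_forecast)  # Eq. (22)
    # Eq. (23): CDF of the unit-variance t_nu innovation distribution
    z = stats.t.cdf(eps * np.sqrt(nu / (nu - 2.0)), df=nu)
    grid = np.linspace(0.0, 1.0, grid_size)
    emp_cdf = np.searchsorted(np.sort(z), grid, side="right") / z.size
    delta = emp_cdf - grid                                                               # Eq. (24)
    return np.abs(delta).mean()                                                          # approximates Eq. (25)
```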

We have calculated this distance across all times and all hedge funds, and report the results in Figure 3. The first result (labeled "AR(0) normal") is the usual normal distribution with no dynamics.


Figure 3
Average distance d across all hedge funds
[Bar chart of d for the models AR(0) normal, AR(1) normal, AR(1) GARCH normal, AR(1) GARCH t, AR(1) AGARCH t and AR(1) AGARCH asym. t.]

Then, from top to bottom, we add successively autocorrelation, heteroscedasticity with normal innovations, heteroscedasticity with t5 innovations, an asymmetric response in the dynamic volatility and finally asymmetry in the innovations. We see that, compared to the usual static normal distribution, the best model reduces the distance between the realized residuals and the modeled residuals by more than a factor of three. The two major improvements come from introducing heteroscedasticity and fat tails in the innovations. The last step (adding innovation asymmetry) does not improve the results, as we might have suspected from the earlier Wilcoxon tests, and furthermore induces over-fitting.

5 The multivariate extension

Let us now consider the case of N hedge funds simultaneously, as in a fund of hedge funds. We have seen in Section 4 that the appropriate univariate model is the AR(1) plus AGARCH process with t5-distributed innovations. We now consider multivariate innovations where the marginals are t5-distributions and the dependency is modeled by a copula.

This structure enables us to capture tail dependency, which cannot be captured by linear correlation alone but is present among hedge funds. Figure 4 presents an illustrative extreme example of two hedge fund return time series. We see that most of the time the two hedge funds behave differently, while in one month they both experience tail events. These joint tail events are a display of tail dependency.


Figure 4
Tail dependency in hedge fund returns
[Monthly returns of two hedge funds over time; both experience an extreme negative return in the same month.]

The multivariate distribution of the $N$ innovations is given by

$$F(\varepsilon_1, \ldots, \varepsilon_N) = U\left(t_5(\varepsilon_1), \ldots, t_5(\varepsilon_N)\right), \qquad (26)$$

where $U(u_1, \ldots, u_N)$, a multivariate uniform distribution, is the copula to estimate.

5.1 Copula and tail dependency

Consider two random variables $X$ and $Y$ with marginal distributions $F_X$ and $F_Y$. The upper tail dependency is

$$\lambda_u = \lim_{q \to 1} P\left[X > F_X^{-1}(q) \mid Y > F_Y^{-1}(q)\right], \qquad (27)$$

and analogously the lower tail dependency is

$$\lambda_\ell = \lim_{q \to 0} P\left[X \le F_X^{-1}(q) \mid Y \le F_Y^{-1}(q)\right]. \qquad (28)$$

These coefficients do not depend on the marginal distributions of $X$ and $Y$, but only on their copula. See (Nelsen 1999) for more details.

The probtiles (defined in Section 4) of the $N$ hedge funds observed at time $t$,

$$(z_t^1, \ldots, z_t^N) = U_t^N, \qquad (29)$$

constitute a realization of the copula.


Figure 5
Upper and lower tail dependency as a function of q, fixed income arbitrage hedge funds
[Empirical coefficients plotted for q between 0 and 0.05.]

Since our univariate model has extracted all of the process dynamics, we are free to reorder and aggregate our observations across time. We make the further assumption that within a strategy, all pairs of hedge funds have the same dependence structure. Thus, we may interpret each observation of innovations for a pair of hedge funds in a strategy at a given time as a realization from a universal (for the strategy) bivariate copula. So from $M$ historical periods on $N$ hedge funds, we extract $MN(N-1)/2$ realizations.

From this bivariate sample we can infer the upper and lower tail dependency nonparametrically using Equations (27) and (28). We calculate the coefficients for fixed values of $q$, as shown in Figure 5, and extrapolate the value for the limiting case.
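A sketch of the nonparametric step for one pair, assuming `u` and `v` are arrays of probtiles for the two funds (hypothetical inputs); the empirical conditional probabilities are computed at a finite level `q` and the limits of Equations (27)-(28) are then extrapolated from several small values of `q`.

```python
import numpy as np

def empirical_tail_dependence(u, v, q):
    """Empirical lower and upper tail dependency of two probtile series at level q."""
    u, v = np.asarray(u), np.asarray(v)
    lower = np.mean(u[v <= q] <= q) if np.any(v <= q) else np.nan        # P[U<=q | V<=q]
    upper = np.mean(u[v > 1 - q] > 1 - q) if np.any(v > 1 - q) else np.nan  # P[U>1-q | V>1-q]
    return lower, upper
```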

We can also obtain a parametric estimate of the tail dependency by fitting the realized copula between two hedge funds to a $t$-copula. The parameters of such a copula are the correlation matrix $\rho$ (which we estimate using Kendall's $\tau$) and the degrees of freedom $\nu_{\mathrm{cop}}$ (which we estimate using maximum likelihood). The tail dependency is symmetric and is obtained by2

$$\lambda_\ell = \lambda_u = 2 - 2\,t_{\nu_{\mathrm{cop}}+1}\!\left(\sqrt{\nu_{\mathrm{cop}}+1}\;\frac{\sqrt{1-\rho_{12}}}{\sqrt{1+\rho_{12}}}\right), \qquad (30)$$

where $\rho_{12}$ is the correlation coefficient between the two hedge funds.

2(Embrechts, McNeil, and Straumann 2002)
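Equation (30) is straightforward to evaluate; a short sketch:

```python
import numpy as np
from scipy import stats

def t_copula_tail_dependence(rho12, nu_cop):
    """Tail dependency coefficient of a bivariate t-copula, Equation (30)."""
    arg = np.sqrt(nu_cop + 1) * np.sqrt(1 - rho12) / np.sqrt(1 + rho12)
    return 2 - 2 * stats.t.cdf(arg, df=nu_cop + 1)

# For example, rho12 = 0.5 and nu_cop = 5 give a coefficient of about 0.2.
```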


Table 2

Estimated tail dependency coefficients

Strategy N Empirical lower Empirical upper λ ± σλ νcop

Convertible Arbitrage 16 0.2 0.1 0.18± 0.09 6

Distressed Securities 18 0.06 0.05 0.05± 0.09 10

Emerging Markets 29 0 0 0.07± 0.06 8

Equity Hedge 103 0.05 0 0.04± 0.05 10

Equity Market Neutral 16 0.04 0.04 0.02± 0.03 9

Equity Non-Hedge 32 0.1 0 0.17± 0.06 5

Event-Driven 38 0.17 0 0.11± 0.08 7

Fixed Income 28 0.09 0 0.03± 0.07 9

Foreign Exchange 14 0 0.1 0.03± 0.09 10

Macro 27 0 0.05 0.03± 0.08 10

Managed Futures 58 0 0.07 0.05± 0.07 9

Merger Arbitrage 10 0 0.15 0.20± 0.17 5

Relative Value Arbitrage 20 0.1 0 0.04± 0.09 10

Short Selling 7 0 0 0.50± 0.22 3

Table 2 shows the results for all strategies. We report the nonparametric lower and upper coefficients, as well as the results of the parametric estimation. Since in the parametric case the coefficient depends on the correlation between each pair of hedge funds, we report the average and its standard deviation across fund pairs. We also show the estimated degrees of freedom of the copula, $\nu_{\mathrm{cop}}$. We see that the two estimates of tail dependence are consistent.

5.2 The multivariate model

To capture the different tail dependencies within each strategy, we use a generalization of the $t$-copula, namely the grouped-$t$ copula (Daul, DeGiorgi, Lindskog, and McNeil 2003). We first partition the $N$ hedge funds into $m$ groups (strategies) labeled $k$, with dimension $s_k$ and parameter $\nu_k$.

Then let $Z$ be a random vector following a multivariate normal distribution of dimension $N$ with linear correlation matrix $\rho$, and let $G_\nu$ be the distribution function of

$$\sqrt{\frac{\nu}{\chi^2_\nu}}, \qquad (31)$$

where $\chi^2_\nu$ follows a chi-square distribution with $\nu$ degrees of freedom. Introducing $U$, a uniformly distributed random variable independent of $Z$, we define

$$R_k = G_{\nu_k}^{-1}(U), \qquad (32)$$

and

$$Y = \begin{pmatrix} R_1 Z_1 \\ \vdots \\ R_1 Z_{s_1} \\ R_2 Z_{s_1+1} \\ \vdots \\ R_2 Z_{s_2} \\ \vdots \end{pmatrix}. \qquad (33)$$

As a result, for instance, the group of random variables $(Y_1, \ldots, Y_{s_1})$ has an $s_1$-dimensional multivariate $t$-distribution with $\nu_1$ degrees of freedom.

Finally, using the univariate distribution function of the innovations, we get a random vector of innovations,

$$\left[t_5^{-1}\left(t_{\nu_1}(Y_1)\right), \ldots, t_5^{-1}\left(t_{\nu_m}(Y_N)\right)\right], \qquad (34)$$

following a meta grouped-$t$ distribution with linear correlation matrix $\rho$ and with different tail dependency in each group (strategy). The tail dependencies are captured by the $\nu_k$'s and are in general different from the degrees of freedom $\nu = 5$ of the innovations.
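The construction of Equations (31)-(34) translates directly into a sampling routine. The following sketch (parameter names are ours; the paper estimates rather than simulates the copula) draws innovation vectors with the grouped-t dependence structure and standard t5 marginals.

```python
import numpy as np
from scipy import stats

def sample_grouped_t_innovations(corr, group_sizes, group_nus, n_samples, nu_marg=5, seed=None):
    """Sample innovation vectors from the meta grouped-t construction, Eqs. (31)-(34).

    corr: N x N linear correlation matrix; group_sizes: dimension s_k of each group
    (strategy); group_nus: degrees of freedom nu_k per group. Marginals are standard
    t_5; rescale by sqrt((nu_marg - 2) / nu_marg) if unit-variance innovations are needed.
    """
    rng = np.random.default_rng(seed)
    N = corr.shape[0]
    L = np.linalg.cholesky(corr)
    out = np.empty((n_samples, N))
    for i in range(n_samples):
        z = L @ rng.standard_normal(N)        # correlated multivariate normal Z
        u = rng.uniform()                     # single uniform U shared by all groups
        start = 0
        for s_k, nu_k in zip(group_sizes, group_nus):
            # R_k = G_{nu_k}^{-1}(U), with G_nu the df of sqrt(nu / chi2_nu)
            r_k = np.sqrt(nu_k / stats.chi2.ppf(1.0 - u, df=nu_k))
            y_k = r_k * z[start:start + s_k]  # block of Eq. (33)
            # Eq. (34): t_{nu_k} probability transform, then back through t_5^{-1}
            out[i, start:start + s_k] = stats.t.ppf(stats.t.cdf(y_k, df=nu_k), df=nu_marg)
            start += s_k
    return out
```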

6 Conclusion

We have presented a model that captures all the static, dynamic and dependency characteristics of

hedge fund returns. Individual hedge fund returns are non-normally distributed, and show

autocorrelation and heteroscedasticity. Their volatility adapts when the hedge fund manager under- or

outperforms. Concerning multiple hedge funds, we have looked at joint events and noticed that tail

dependency is present.

Our model consists of a univariate process and a copula structure on the innovations of that process. The univariate process is an asymmetric generalization of a GARCH(1,1) process, while the dependency is captured by a grouped-t copula with a different tail dependency for each strategy. This model shows compelling out-of-sample backtesting results.

This approach can be applied to any hedge fund and in particular to non-transparent ones. Using only

hedge fund historical performance we may forecast the risk of a portfolio of hedge funds. A

straightforward model extension permits analysis of portfolios of hedge funds mixed with other asset

classes.

References

Daul, S., E. DeGiorgi, F. Lindskog, and A. McNeil (2003). The grouped t-copula with an application to credit risk. Risk 16, 73-76.

Ding, Z. and C. W. J. Granger (1996). Modeling volatility persistence of speculative returns: A new approach. Journal of Econometrics 73, 185-215.

Embrechts, P., A. McNeil, and D. Straumann (2002). Correlation and dependence in risk management: Properties and pitfalls. In M. Dempster (Ed.), Risk Management: Value at Risk and Beyond, pp. 176-223. Cambridge University Press.

Hansen, B. E. (1994). Autoregressive conditional density estimation. International Economic Review 35(3), 705-730.

Nelsen, R. (1999). An Introduction to Copulas. Springer, New York.

Zumbach, G. (2004). Volatility processes and volatility forecast with long memory. Quantitative Finance 4, 70-86.

Zumbach, G. (2007). Backtesting risk methodologies from one day to one year. RiskMetrics Journal 7(1), 17-60.
