CONDITIONAL CORRELATION Nick Wade Northfield Information Services Asian Research Seminar November 2009


OVERVIEW

Correlation holds a pivotal place in our analysis of data, and the construction of forecasting models for return and risk

Review the literature on correlation stability with a particular focus on turbulent markets

Backtrack:

Review assumptions underlying correlation

Explore role in regression, factor analysis, and cluster analysis

Discuss alternate distance measures and adjustments

Show recent advances in regression, factor analysis, and cluster analysis that avoid non-stationarity issues

Return to the thread:

Show how a simple regime-switching model allowing conditional correlation can be incorporated into the linear multifactor risk model

CORRELATION BREAKDOWN

The concept of a “correlation breakdown” describes the tendency of correlations to move towards 1 as markets melt down, removing the benefit of diversification just when it’s needed the most.

It would also have stark implications for the mutual fund industry, whose business model is at least in part based on providing access to diversification

TO OPEN…

Increasing attention is being paid to the issue of correlations varying over time:

(stocks) De Santis, G. and B. Gerard (1997), International asset pricing and portfolio diversification with time-varying risk, Journal of Finance, 52, 1881-1912.

(stocks) Longin, F. and B. Solnik (2001), Extreme correlation of international equity markets, Journal of Finance, LVI(2), 646-676.

(bonds) Hunter, D.M. and D.P. Simon (2005), A conditional assessment of the relationships between the major world bond markets, European Financial Management, 11(4), 463-482.

(bonds) Solnik, B., C. Boucrelle and Y.L. Fur (1996), International market correlation and volatility, Financial Analysts Journal, 52(5), 17-34.

Markov switching model: Chesnay, F and Jondeau, E “Does Correlation Between Stock Returns really increase during turbulent periods?” Bank of France research paper.

To date little explored – however, Implied Correlation also seems useful, and more powerful than historical correlation in forecasting (we saw the same result with volatility):

Campa, J.M. and P.H.K. Chang (1998), The forecasting ability of correlations implied in foreign exchange options, Journal of International Money and Finance, 17, 855-880.

CORRELATION STABILITY

One of the first… Kaplanis (1988): STABLE

Tang (1995), Ratner (1992), Sheedy (1997): STABLE – although crash of 1987 regarded as an “anomaly”

Bertero and Mayer (1989), King and Wadhwani (1990) and Lee and Kim (1993): correlation has increased, but STABLE

Not quite so stable?

Erb et al (1994) – increases in bear markets

Longin and Solnik (1995) – increases in periods of high volatility

Longin and Solnik (2001) – increases in bear markets

TURBULENCE IN THE MARKET

Kritzman (2009):

Correlation of US and foreign stocks when both markets’ returns are one standard deviation above their mean: -17%

Correlation of US and foreign stocks when both markets’ returns are one standard deviation below their mean: +76%

“Conditional correlations are essential for constructing properly diversified portfolios”
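The conditioning used in these figures can be sketched in a few lines. This is a minimal illustration on simulated data, not Kritzman's data or code: the function names, the one-standard-deviation threshold parameter `k`, and the simulated bivariate normal returns are all assumptions made for the example.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def conditional_correlation(x, y, k=1.0, tail="down"):
    """Correlation using only periods where BOTH series are more than
    k standard deviations below (tail='down') or above (tail='up') their means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    if tail == "down":
        keep = [(a, b) for a, b in zip(x, y) if a < mx - k * sx and b < my - k * sy]
    else:
        keep = [(a, b) for a, b in zip(x, y) if a > mx + k * sx and b > my + k * sy]
    xs, ys = zip(*keep)
    return pearson(xs, ys)

# Simulated returns sharing one common driver (unconditional correlation near 0.5).
rng = random.Random(0)
x, y = [], []
for _ in range(20000):
    common = rng.gauss(0.0, 1.0)
    x.append(common + rng.gauss(0.0, 1.0))
    y.append(common + rng.gauss(0.0, 1.0))

down = conditional_correlation(x, y, k=1.0, tail="down")
up = conditional_correlation(x, y, k=1.0, tail="up")
print(f"full sample: {pearson(x, y):.2f}  down tail: {down:.2f}  up tail: {up:.2f}")
```

For jointly normal data like this, the two tail correlations should come out similar to each other, so an up/down gap as wide as the one quoted above is hard to reconcile with a single stable correlation.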

CORRELATION REGIMES

If it’s not stable, how about Markov switching models?

Ramchand and Susmel (1998), Chesnay and Jondeau (2001) – correlation, conditioned on market regime, increases in periods with high volatility

Ang and Bekaert (1999) – evidence for two regimes; a high vol/high corr, and a low vol/low corr.

BACKTRACK

Before we get too far down the track, let’s look at the assumptions underlying correlation and see what the implications are

CORRELATION – A DEFINITION

Correlation measures the strength and direction of a linear relationship between two variables

Difference from the mean in the numerator

Standard deviation in the denominator

Not new… dates from the 1880s…
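The slide's formula did not survive extraction. For reference, the standard sample (Pearson) correlation coefficient, which matches the description above, is written out here; the symbols x_i, y_i and n are notation chosen for this note rather than taken from the slide.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```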

CORRELATION – LIMITATIONS & ASSUMPTIONS

Linear – fits only the linear part of the relationship

Stationary – susceptible to trend in mean in either series

Stationary – assumes the volatility of each series is unchanging over time

No concept of higher moments

Sensitive to the underlying distribution

Sensitive to the presence of estimation error in the observations

An incomplete measure of the relationship between two series

CORRELATION IN REGRESSION

Stating the obvious, but… Correlation is at the heart of regression analysis:

Remember to use something like Durbin-Watson to test for serial correlation

Typical situation: huge R^2, low DW (positive serial correlation)

If DW<1, positive serial correlation is problematic

If you think the residuals are non-Normal, DW blows up. Use Breusch-Godfrey test instead.
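As a reminder of what the statistic measures, here is a minimal sketch of the Durbin-Watson calculation on regression residuals. The function name and the simulated residual series are assumptions for illustration; the statistic itself is the usual ratio of summed squared first differences to summed squared residuals, roughly 2(1 - rho) for first-order autocorrelation rho.

```python
import random

def durbin_watson(residuals):
    """Sum of squared first differences divided by sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

rng = random.Random(1)

# Residuals with no serial correlation: DW should sit near 2.
white_noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Residuals following an AR(1) with coefficient 0.9: DW should fall well below 1.
trending = [0.0]
for _ in range(1999):
    trending.append(0.9 * trending[-1] + rng.gauss(0.0, 1.0))

print(f"white noise DW: {durbin_watson(white_noise):.2f}")
print(f"AR(1) DW:       {durbin_watson(trending):.2f}")
```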

CORRELATION IN FACTOR ANALYSIS

(Principal Components Analysis is a sub-class of the family of Factor Analysis techniques)

Correlation is a key part of factor analysis

PCA uses the eigenvectors of the covariance matrix, and hence is affected by anything that impacts the volatility or correlation of the series
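A two-asset toy case shows the sensitivity. The sketch below uses the closed-form largest eigenvalue of a 2x2 covariance matrix; the function name and the volatility and correlation inputs are hypothetical, chosen only to show that a volatility change alone moves the PCA result.

```python
import math

def pc1_share(vol_x, vol_y, corr):
    """Share of total variance explained by the first principal component
    of a 2x2 covariance matrix (largest eigenvalue divided by the trace)."""
    a, c = vol_x ** 2, vol_y ** 2
    b = corr * vol_x * vol_y
    largest = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return largest / (a + c)

# Same correlation in both cases; only the volatility of the first series changes.
calm = pc1_share(1.0, 1.0, 0.5)
vol_spike = pc1_share(2.0, 1.0, 0.5)
print(f"PC1 share, equal vols: {calm:.3f}")
print(f"PC1 share, vol spike:  {vol_spike:.3f}")
```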

CORRELATION IN CLUSTER ANALYSIS

Cluster analysis is closely related to factor analysis

Cluster analysis assigns members to a group depending on rules and a distance measure

For example “complete linkage” cluster analysis adds a new member to the group whose least-related member is most highly related to the new member.

Thus, depending on the choice of distance measure [correlation is the most common choice], cluster analysis may also be affected by any problems with correlation…
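The complete-linkage rule can be sketched as a small agglomerative loop. This is an illustrative implementation, not a library routine: the function name, the use of 1 minus correlation as the distance, and the four-asset correlation matrix are all assumptions for the example.

```python
def complete_linkage(dist, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    FARTHEST pair of members is closest (complete linkage)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Distance between clusters = distance of their least-related pair.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical correlation matrix for four assets.
corr = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.0, 0.3],
    [0.1, 0.0, 1.0, 0.8],
    [0.2, 0.3, 0.8, 1.0],
]
# Correlation-based distance: highly correlated assets are "close".
dist = [[1.0 - c for c in row] for row in corr]

groups = complete_linkage(dist, n_clusters=2)
print(groups)
```

Because the distances come straight from the correlation matrix, anything that distorts the correlations (outliers, trend, regime changes) feeds directly into the groupings.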

ONE CLASSIFICATION SYSTEM

Jorge Luis Borges, “Other Inquisitions” 1937-1952

Animals are divided into: a) those that belong to the Emperor, b) embalmed ones, c) those that are trained, d) suckling pigs, e) mermaids, f) fabulous ones, g) stray dogs, h) those that are included in this classification, i) those that tremble as if they were mad, j) innumerable ones, k) those drawn with a very fine camel's hair brush, l) others, m) those that have just broken a flower vase, n) those that resemble flies from a distance.

THOUGHTS

Non-stationary volatility (ARCH, GARCH, etc):

We spend an heroic amount of time trying to forecast non-stationary volatility

But we often just ignore it when we calculate correlation, or perform regression analysis, or run factor analysis (or PCA)

Non-stationary mean (Trend):

We often build models to capture the alpha in momentum, reversals, and other manifestations of a non-stationary mean

But we often ignore those when we calculate correlation, or perform regression analysis, or run factor analysis

Read the fine print…

PARALYSIS

Yes, I know, if you read all the fine print and believed that non-stationarity in the data rendered all the techniques useless you’d never do any analysis and would eventually get fired for surfing the internet all day for months and months looking for The Right Approach.

Fries with that?

MORE THOUGHTS…

What happens ex-post when we analyze data?

IR, IC

Is my risk model broken?

Measures such as IR, IC, are also affected by non-stationarity

IC varies over time

NONSTATIONARITY AGAIN…HUBER 2001

Manager has 2.33% forecast tracking error and -6.3% realized return.

3-sigma event or a broken risk model?

Risk model on target [ex-ante and ex-post SD both 2.xx]

Then why? Unfortunately -50bps per month alpha trend

Typical measures of risk are centered on the trend and thus ignore the risk of being consistently bad (Or good! He could have had +6% return…)

Extending this idea, Qian and Hua (2004) define “strategy risk” as the standard deviation of the manager’s IC over time, and thus “forecast true active risk”:

$$\text{Forecast Active Risk} = \mathrm{std}(IC) \times \text{Breadth}^{1/2} \times \text{Forecast Tracking Error}$$

Huber, Gerard. “Tracking Error and Active Management”, Northfield Conference Proceedings, 2001, http://www.northinfo.com/documents/164.pdf
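The trend effect in the Huber example can be reproduced with a little arithmetic. The sketch below is an illustration with made-up monthly numbers, not Huber's data: it assumes a steady -50bp monthly trend plus noise that simply alternates in sign, sized so that the annualized tracking error comes out at 2.33%.

```python
import math
import statistics

# Hypothetical monthly active returns: a steady -50bp trend plus noise that
# alternates +s / -s, with s chosen so annualized tracking error is 2.33%.
trend = -0.005
s = 0.0233 / math.sqrt(12)
active = [trend + (s if m % 2 == 0 else -s) for m in range(12)]

# Standard deviation is measured around the sample mean, so the trend drops out.
tracking_error = statistics.pstdev(active) * math.sqrt(12)

# The cumulative (arithmetic) active return keeps the trend.
total_active = sum(active)

# Root-mean-square about zero counts the trend as risk instead of ignoring it.
rms_about_zero = math.sqrt(sum(r * r for r in active) / len(active)) * math.sqrt(12)

print(f"tracking error:      {tracking_error:.4f}")
print(f"total active return: {total_active:.4f}")
print(f"RMS about zero:      {rms_about_zero:.4f}")
```

The realized tracking error matches the forecast even though the year ends roughly 6% behind, which is the sense in which the risk model is "on target" and still misses the risk of being consistently bad.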

AN ASIDE… SHARPE RATIO

Trend Effect: in the case of declining markets, the fund with the higher total risk exhibits a higher (less negative) Sharpe ratio…

Another great way to stuff up your Sharpe ratio:

Imagine you have a model that only works sometimes [no, no, I know your models ALWAYS work…]

Be really really good sometimes

…and in cash the rest of the time [being responsible]

=> your mean return is low, and your best months are the ones contributing all your volatility.

Another manager doing the same thing with less skill can have a better Sharpe ratio
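Both effects are easy to check numerically. The figures below are hypothetical, picked only to make the arithmetic visible; excess return while in cash is assumed to be zero, and the Sharpe ratio is computed on monthly numbers without annualizing.

```python
import statistics

def sharpe(excess_returns):
    """Mean excess return divided by its (population) standard deviation."""
    return statistics.mean(excess_returns) / statistics.pstdev(excess_returns)

# Trend effect: same -10% excess return, different total risk.
low_risk_sharpe = -0.10 / 0.10
high_risk_sharpe = -0.10 / 0.20

# Manager A: brilliant one month in twelve, in cash otherwise.
manager_a = [0.12] + [0.0] * 11
# Manager B: invested in four months, with smaller gains each time.
manager_b = [0.02] * 4 + [0.0] * 8

print("higher-risk fund has the better Sharpe:", high_risk_sharpe > low_risk_sharpe)
print("A earns more in total:", sum(manager_a) > sum(manager_b))
print("B has the better Sharpe:", sharpe(manager_b) > sharpe(manager_a))
```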

LINEARITY AFFECTS CORRELATION

OUTLIER DEPENDENCE ON CORRELATION

The presence of outliers is problematic for correlation (think about regression)

Use Mahalanobis distance as a test for outliers

CORRELATION EXAMPLES

ALTERNATIVES

We are looking for a measure of similarity, or shared behavior, or difference, or distance between two series

Ideally we want one with as few restrictions as possible i.e. non-linear, robust to errors, not dependent on a particular distribution, and so on.

RELATED MEASURES, ADJUSTMENTS AND ALTERNATIVES

Rank correlation

Disattenuation

Total correlation / mutual information

Cohesion/Coherence

Mahalanobis distance

Euclidean Distance (special case of MD)

Masochists can read 74 pages on similarity measures in: Sneath PHA & Sokal RR (1973) Numerical Taxonomy. Freeman, San Francisco.

RANK CORRELATION

No linearity requirement

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

where:

d_i = x_i − y_i = the difference between the ranks of corresponding values X_i and Y_i, and

n = the number of values in each data set (same for both sets).
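A quick comparison shows what "no linearity requirement" buys. The sketch below assumes data without ties (which the rank-difference formula requires) and uses an exponential relationship chosen purely for illustration; the helper names are not from the slides.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank each value from 1..n; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Rank correlation from squared rank differences (no-ties formula)."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# A perfectly monotonic but strongly non-linear relationship.
x = [0.5 * i for i in range(1, 11)]
y = [math.exp(v) for v in x]

print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

Rank correlation reports a perfect relationship here, while the linear measure understates it.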

DISATTENUATION (WHAT??)

Somewhat complicated, but essentially just an adjustment to correlation for the presence of estimation error in underlying series

Tends to adjust correlations upward

Table 1: Example Disattenuation of Correlation Coefficients

| Reliability estimate | Observed correlation 0.10 (.01) | 0.20 (.04) | 0.30 (.09) | 0.40 (.16) | 0.50 (.25) | 0.60 (.36) |
|---|---|---|---|---|---|---|
| 0.95 | 0.11 (.01) | 0.21 (.04) | 0.32 (.10) | 0.42 (.18) | 0.53 (.28) | 0.63 (.40) |
| 0.90 | 0.11 (.01) | 0.22 (.05) | 0.33 (.11) | 0.44 (.19) | 0.56 (.31) | 0.67 (.45) |
| 0.85 | 0.12 (.01) | 0.24 (.06) | 0.35 (.12) | 0.47 (.22) | 0.59 (.35) | 0.71 (.50) |
| 0.80 | 0.13 (.02) | 0.25 (.06) | 0.38 (.14) | 0.50 (.25) | 0.63 (.39) | 0.75 (.56) |
| 0.75 | 0.13 (.02) | 0.27 (.07) | 0.40 (.16) | 0.53 (.28) | 0.67 (.45) | 0.80 (.64) |
| 0.70 | 0.14 (.02) | 0.29 (.08) | 0.43 (.18) | 0.57 (.32) | 0.71 (.50) | 0.86 (.74) |
| 0.65 | 0.15 (.02) | 0.31 (.10) | 0.46 (.21) | 0.62 (.38) | 0.77 (.59) | 0.92 (.85) |
| 0.60 | 0.17 (.03) | 0.33 (.11) | 0.50 (.25) | 0.67 (.45) | 0.83 (.69) | --- |

Note: Reliability estimates for this example assume the same reliability for both variables. Percent variance accounted for (shared variance) is in parentheses. Source: Osborne, Jason W. (2003).
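The table entries are consistent with the classical correction for attenuation, in which the observed correlation is divided by the square root of the product of the two reliabilities. A minimal sketch under that assumption (the function name is made up for this note):

```python
import math

def disattenuate(r_observed, reliability_x, reliability_y):
    """Observed correlation divided by the geometric mean of the reliabilities."""
    return r_observed / math.sqrt(reliability_x * reliability_y)

# Observed correlation 0.50 with reliability 0.80 for both variables.
corrected = disattenuate(0.50, 0.80, 0.80)
shared_variance = corrected ** 2
print(f"corrected correlation: {corrected:.3f}, shared variance: {shared_variance:.3f}")
```

Compare the result with the 0.63 (.39) entry in the 0.80 reliability row of the table.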

MUTUAL INFORMATION

Mutual information / Total correlation

Total Correlation [Watanabe (1960)] expresses the amount of redundancy or dependency existing among a set of variables.
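For reference, the usual definition in terms of Shannon entropies is given below. The notation (H for entropy, C for total correlation, I for mutual information) is chosen for this note and does not come from the slide. With two variables, total correlation reduces to mutual information.

```latex
C(X_1, \ldots, X_n) = \sum_{i=1}^{n} H(X_i) - H(X_1, \ldots, X_n),
\qquad
I(X; Y) = H(X) + H(Y) - H(X, Y)
```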

COHESION/COHERENCE

The spectral coherence is a statistic that can be used to examine the relation between two signals or data sets. It is commonly used to estimate the power transfer between input and output of a linear system.

The squared coherence between two signals x(t) and y(t) is a real-valued function that is defined as:

$$C_{xy} = \frac{|G_{xy}|^2}{G_{xx}\, G_{yy}}$$

where G_{xy} is the cross-spectral density between x and y, and G_{xx} and G_{yy} the autospectral density of x and y respectively. The magnitude or power of the spectral density is denoted as |G|.

MAHALANOBIS DISTANCE

MD is one example of a “Bregman divergence” , a group of distance measures.

Clustering: classifies the test point as belonging to that class for which the Mahalanobis distance is minimal. This is equivalent to selecting the class with the maximum likelihood.

Regression: Mahalanobis distance and leverage are often used to detect outliers, especially in the development of linear regression models. A point that has a greater Mahalanobis distance from the rest of the sample population of points is said to have higher leverage since it has a greater influence on the slope or coefficients of the regression equation. Mahalanobis distance is also used to determine multivariate outliers: a point can be a multivariate outlier even if it is not a univariate outlier on any variable.

Factor Analysis: recent research couples Mahalanobis distance with Factor Analysis and uses MD to determine whether a new observation is an outlier or a member of the existing factor set. [Zhang 2003]

MD depends on covariance (S^-1 is the inverse of the covariance matrix), so it is exposed to the same stationarity issues that affect correlation. However, as described above, it can help us reduce correlation’s outlier dependence.
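The point about multivariate outliers is easiest to see with two variables. The sketch below computes the distance sqrt((x - m)' S^-1 (x - m)) for a 2x2 covariance matrix; the function name, the covariance values, and the two test points are hypothetical.

```python
import math

def mahalanobis_2d(point, mean, cov):
    """sqrt((x - m)' S^-1 (x - m)) for a 2x2 covariance matrix S."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)

mean = (0.0, 0.0)
cov = ((1.0, 0.9), (0.9, 1.0))  # unit variances, correlation 0.9

# Both points are only one standard deviation from the mean on each variable.
with_trend = mahalanobis_2d((1.0, 1.0), mean, cov)
against_trend = mahalanobis_2d((1.0, -1.0), mean, cov)
print(f"moves with the correlation:    {with_trend:.2f}")
print(f"moves against the correlation: {against_trend:.2f}")
```

Neither point is a univariate outlier, yet the one that moves against the prevailing correlation sits far from the rest of the distribution.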

MAHALANOBIS IN ACTION

Borrowed from Kritzman: Skulls, financial turbulence, and the implications for risk management. July 2009

REGRESSION WITH NONSTATIONARY DATA

Techniques have been developed specifically to allow time-varying sensitivities

FLS (flexible least-squares)

FLS is primarily a descriptive tool that allows us to gauge the potential for time-evolution of exposures

$$\min_{\beta_1, \ldots, \beta_T} \; \sum_{t=1}^{T} (y_t - x_t' \beta_t)^2 + \mu \sum_{t=1}^{T-1} (\beta_{t+1} - \beta_t)' (\beta_{t+1} - \beta_t)$$

where μ ≥ 0 is the weight placed on the dynamic errors.

Minimize both sum of squared errors and sum of squared dynamic errors (coefficient estimates)
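For a single regressor the FLS problem reduces to a tridiagonal linear system, which makes a compact sketch possible. The code below is an illustration under that simplification, not the Kalaba and Tesfatsion implementation: the function name, the smoothness weight `mu`, and the test data are assumptions.

```python
def fls_scalar(x, y, mu):
    """Flexible least squares with one regressor: minimize
    sum_t (y_t - x_t*b_t)^2 + mu * sum_t (b_{t+1} - b_t)^2.
    The first-order conditions form a tridiagonal system, solved here
    with the Thomas algorithm. Requires len(x) >= 2 and mu > 0."""
    n = len(x)
    # Diagonal: x_t^2 plus mu once at the ends and twice in the interior.
    diag = [x[t] ** 2 + mu * (1 if t in (0, n - 1) else 2) for t in range(n)]
    off = -mu  # identical sub- and super-diagonal entries
    rhs = [x[t] * y[t] for t in range(n)]

    # Forward elimination.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for t in range(1, n):
        denom = diag[t] - off * c_prime[t - 1]
        c_prime[t] = off / denom
        d_prime[t] = (rhs[t] - off * d_prime[t - 1]) / denom

    # Back substitution.
    beta = [0.0] * n
    beta[-1] = d_prime[-1]
    for t in range(n - 2, -1, -1):
        beta[t] = d_prime[t] - c_prime[t] * beta[t + 1]
    return beta

# Constant true coefficient: FLS recovers it exactly.
x_const = [1.0, 2.0, 3.0, 4.0, 5.0]
beta_const = fls_scalar(x_const, [2.0 * v for v in x_const], mu=10.0)

# Coefficient that jumps from 1 to 3 halfway through: the path drifts upward.
x_shift = [1.0] * 10
y_shift = [1.0] * 5 + [3.0] * 5
beta_shift = fls_scalar(x_shift, y_shift, mu=1.0)
print([round(b, 2) for b in beta_shift])
```

A large `mu` pushes the path towards a single constant coefficient, while a small `mu` lets it follow the data period by period.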

FLS EXAMPLE

An example from Clayton and MacKinnon (2001)

The coefficient apparently exhibits structural shift in 1992

FACTOR ANALYSIS WITH NONSTATIONARY DATA

Dahlhaus, R. (1997). Fitting Time Series Models to Nonstationary Processes. Annals of Statistics, Vol. 25, 1-37.

Del Negro and Otrok (2008): Dynamic Factor Models with Time-Varying Parameters: Measuring Changes in International Business Cycles (Federal Reserve Bank New York)

Eichler, M., Motta, G., and von Sachs, R. (2008). Fitting dynamic factor models to non-stationary time series. METEOR research memoranda RM/09/002, Maastricht University.

Stock and Watson (2007): Forecasting in dynamic factor models subject to structural instability (Harvard).

There are techniques available, and they are being applied to financial series.

CLUSTER ANALYSIS WITH NONSTATIONARY DATA

Guedalia, London, Werman; “An on-line agglomerative clustering method for nonstationary data” Neural Computation, February 15, 1999, Vol. 11, No. 2, Pages 521-54

C. Aggarwal, J. Han, J. Wang, and P. S. Yu, On Demand Classification of Data Streams, Proc. 2004 Int. Conf. on Knowledge Discovery and Data Mining (KDD'04), Seattle, WA, Aug. 2004.

G. Widmer and M. Kubat, “Learning in the Presence of Concept Drift and Hidden Contexts”, Machine Learning, Vol. 23, No. 1, pp. 69-101, 1996.

Again, there are techniques available to conquer the problem

CLUSTERING – ARTIFICIAL IMMUNE SYSTEMS

Non-stationary clustering is also related to the development of artificial immune systems:

Modeling evolving data sets: you can think of data as “born” with a set of factors (immunity) and subsequently developing immunity to new effects as they appear.

You can decide whether/how much memory there should be of new and non-current factors.

Each new observation could be one of three things: a member of an existing cluster, the first member of a new cluster, or an outlier.

THE STORY SO FAR

A long time ago (in a galaxy far far away?) correlation was stable.

Recent evidence tends to suggest that correlation is very much regime-dependent, and the consensus seems to be that two regimes are sufficient.

Correlation can be upset by outliers, trend, noise, and non-linearity

Correlation is the “default” choice in regression analysis, factor analysis, and cluster analysis – the core of our toolkit

More recent techniques, alternative measures, and adjustments exist to combat these effects.

WHAT’S LEFT

The linear multi-factor risk model

Adjusting the model for non-stationarity

Making the model “conditional” by allowing separate regimes

Lunch

THE LINEAR MULTI-FACTOR RISK MODEL

Relationship between R and F is linear ∀F

The Exposures E are the correlations between R and F.

There are N common factor sources of return

The distribution of F is stationary, Normal, i.i.d. ∀F

(Implicitly also the volatility of R and F is stationary)

$$R_{st} = \sum_{i=1}^{N} E_{it} F_{it} + S_{st}$$

EFFECT ON RISK MODEL ESTIMATION

If not properly addressed during estimation:

Time-Series (macro model): effect will be observed in exposures and correlations

Cross-sectional (fundamental): effect observed in correlation (covariance) matrix

Statistical: effect observed in factor loadings (from FA), factor returns (from regression analysis) and in covariance matrix. Ouch!

LIMITATIONS OF THE MODEL

Symmetric – the response is the same whether a factor increases or decreases

Linear – the response is the same for any size move in a factor

Stationary – we are assuming that volatility and mean are stationary for all the factors

ADJUSTMENTS FOR NON-STATIONARITY

Exponential Weighting

Conditional variances

Parkinson range measure

Cross-sectional dispersion-inferred time-series variance adjustments

Implied volatility

Serial-correlation adjustments in short-horizon models

(WARNING: MARKETING PLUG: all currently used in the Northfield risk model family)
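As an illustration of the first item in the list, here is a minimal sketch of an exponentially weighted correlation. It is a generic textbook version, not the Northfield implementation: the function name, the `half_life` parameter, and the sample returns are all assumptions.

```python
import math

def ew_correlation(x, y, half_life):
    """Correlation with exponentially decaying weights; the most recent
    observation (last in the list) gets the largest weight."""
    decay = 0.5 ** (1.0 / half_life)
    n = len(x)
    weights = [decay ** (n - 1 - t) for t in range(n)]
    total = sum(weights)
    mx = sum(w * a for w, a in zip(weights, x)) / total
    my = sum(w * b for w, b in zip(weights, y)) / total
    cov = sum(w * (a - mx) * (b - my) for w, a, b in zip(weights, x, y)) / total
    var_x = sum(w * (a - mx) ** 2 for w, a in zip(weights, x)) / total
    var_y = sum(w * (b - my) ** 2 for w, b in zip(weights, y)) / total
    return cov / math.sqrt(var_x * var_y)

# Hypothetical returns: the two series move together early and diverge recently.
x = [0.010, -0.020, 0.015, 0.030, -0.010, 0.020, -0.015, 0.010]
y = [0.012, -0.018, 0.010, 0.025, 0.012, -0.022, 0.014, -0.008]

print(f"slow decay (half-life 50): {ew_correlation(x, y, 50):.2f}")
print(f"fast decay (half-life 2):  {ew_correlation(x, y, 2):.2f}")
```

A short half-life makes the estimate track the recent divergence, at the cost of resting on fewer effective observations.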

IS A SINGLE CORRELATION ADEQUATE?

Recent evidence suggests that correlations change over time, and particularly during market turmoil:

If true, this impacts our ability to diversify just when we need it most

This means our risk analysis may over-estimate the benefits of diversification when reporting our risk

Optimized portfolios may be exposed to higher risk than we thought

One simple correction would be to have two sets of risk model factors, variances, and correlations:

Estimate “normal” and “turbulent” volatilities and correlations for factors

Scale exposures to account for the probability that each regime will be visible within the risk forecast horizon

IMPLEMENTATION

Use Viterbi’s algorithm (Viterbi 1967) to detect states.

Use Jennrich tests (Jennrich 1970) to decide whether correlation differences between states are significant
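To make the state-detection step concrete, here is a minimal sketch of the Viterbi algorithm for a two-state model. Everything about the model is an assumption made for illustration: zero-mean Gaussian returns, the "normal" and "turbulent" volatilities, the transition probabilities, and the sample returns. The slides do not specify any of these.

```python
import math

def log_normal_pdf(x, sigma):
    """Log density of a zero-mean normal with standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * (x / sigma) ** 2

def viterbi(obs, log_init, log_trans, sigmas):
    """Most likely state path for a hidden Markov model with
    zero-mean Gaussian emissions (one volatility per state)."""
    n_states = len(log_init)
    score = [log_init[s] + log_normal_pdf(obs[0], sigmas[s]) for s in range(n_states)]
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best previous state to arrive in state s.
            prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][s] + log_normal_pdf(x, sigmas[s]))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical parameters: state 0 = normal, state 1 = turbulent.
sigmas = [0.01, 0.04]
log_init = [math.log(0.9), math.log(0.1)]
log_trans = [[math.log(0.95), math.log(0.05)],
             [math.log(0.05), math.log(0.95)]]

returns = [0.002, -0.005, 0.003, 0.08, -0.09, 0.07]
states = viterbi(returns, log_init, log_trans, sigmas)
print(states)
```

With the periods labeled, factor correlations can be estimated separately within each state and compared using the Jennrich test.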

IMPLEMENTATION 2

Take a factor model

First Pass: “clone” the factor set and assume all factors apply in both regimes

Estimate factor variances and correlations separately for the two regimes

Weight the security exposures to the factors based on the probability of each regime:

E.g. Regime 1 90%, Regime 2 10%

Single correlation model Market exposure for security X = 1.25

In the two-regime model the Market exposure to “Market Normal” factor is 0.90*1.25, and the exposure to “Market Turbulent” factor is 0.10*1.25

Set correlations equal to zero across the two models (i.e. Correlation (“Market Normal”, “Market Turbulent”) = 0)

Run risk analysis / optimization as usual
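The exposure-weighting step can be written out as follows. The exposure of 1.25 and the 90%/10% regime probabilities come from the slide; the function names and the two factor volatilities are hypothetical, added only so the example produces a risk number.

```python
import math

def two_regime_exposures(exposure, p_normal):
    """Split one factor exposure into regime-weighted 'clone' exposures."""
    return exposure * p_normal, exposure * (1.0 - p_normal)

def factor_variance(exposures, vols, corr=0.0):
    """Variance contributed by two factors with the given volatilities
    and the given correlation between them."""
    e1, e2 = exposures
    v1, v2 = vols
    return (e1 * v1) ** 2 + (e2 * v2) ** 2 + 2.0 * corr * e1 * v1 * e2 * v2

# Figures from the slide: exposure 1.25, regime probabilities 90% / 10%.
normal_exp, turbulent_exp = two_regime_exposures(1.25, 0.90)

# Hypothetical factor volatilities for the two regimes (not from the source).
vols = (0.15, 0.40)

# Cross-regime correlation is set to zero, as described above.
risk = math.sqrt(factor_variance((normal_exp, turbulent_exp), vols, corr=0.0))
print(f"Market Normal exposure:    {normal_exp:.3f}")
print(f"Market Turbulent exposure: {turbulent_exp:.3f}")
print(f"factor risk:               {risk:.4f}")
```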

LIMITATIONS

Still linear – no accommodation of asymmetric correlation (i.e. UP ≠ DOWN)

Same set of factors used

Opportunity to estimate completely different betas for each regime – small-sample problems?

Pessimistic? If the probability assigned to the turbulent regime is high, risk estimates and optimal portfolios will be very conservative – perhaps too conservative?

CONCLUSIONS

Correlation lies at the heart of our favorite tools for data analysis (and risk model estimation)

A clear understanding of its behavior is a requirement for good analysis

Recent developments aid the incorporation of time-varying correlation and/or non-stationary processes.

Recent research strongly indicates that correlation is regime dependent

We can incorporate multiple regimes into standard risk models simply, but linearity/symmetry assumptions remain

Overall conclusion: read more papers?

REFERENCES

Ang, A. and Bekaert, G. (1999) ‘International Asset Allocation with time-varying Correlations’, working paper, Graduate School of Business, Stanford University and NBER.

Banerjee, Arindam; Merugu, Srujana; Dhillon, Inderjit S.; Ghosh, Joydeep (2005). "Clustering with Bregman divergences". Journal of Machine Learning Research 6: 1705–1749. http://jmlr.csail.mit.edu/papers/v6/banerjee05b.html.

Bertero, E. and Mayer, C. (1989) ‘Structure and Performance: Global Interdependence of Stock Markets around the Crash of October 1987’, London, Centre for Economic Policy Research.

Chesnay, F. and Jondeau, E. (2001) ‘Does Correlation between Stock Returns really increase during turbulent Periods?’, Economic Notes by Banca Monte dei Paschi di Siena SpA, Vol. 30, No. 1, pp.53–80.

Jim Clayton and Greg MacKinnon (2001), "The Time-Varying Nature of the Link Between REIT, Real Estate and Financial Asset Returns", Journal of Real Estate Portfolio Management, January-March Issue

Erb, C.B., Harvey, C.R. and Viskanta, T.E. (1994) ‘Forecasting international Equity Correlations’, Financial Analysts Journal, pp.32–45.

Jakulin A & Bratko I (2003a). Analyzing Attribute Dependencies, in N Lavrač, D Gamberger, L Todorovski & H Blockeel, eds, Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, Springer, Cavtat-Dubrovnik, Croatia, pp. 229-240

Jennrich R. (1970) ‘An Asymptotic χ² Test for the Equality of Two Correlation Matrices’, Journal of the American Statistical Association, Vol. 65, No. 330.

R. Kalaba, L. Tesfatsion. Time-varying linear regression via flexible least squares. International Journal on Computers and Mathematics with Applications, 1989, Vol. 17, pp. 1215-1245.

Kaplanis, E. (1988) ‘Stability and Forecasting of the Comovement Measures of International Stock Market Returns’, Journal of International Money and Finance, Vol. 7, pp.63–75.

Lee, S.B. and Kim, K.J. (1993) ‘Does the October 1987 Crash strengthen the co-Movements among national Stock Markets?’, Review of Financial Economics, Vol. 3, No. 1, pp.89–102.

Longin, F. and Solnik, B. (1995) ‘Is the Correlation in International Equity Returns constant: 1960–1990?’, Journal of International Money and Finance, Vol. 14, No. 1, pp.3–26.

Longin, F. and Solnik, B. (2001) ‘Extreme Correlation of International Equity Markets’, The Journal of Finance, Vol. 56, No.2.

Mahalanobis, P C (1936). "On the generalised distance in statistics". Proceedings of the National Institute of Sciences of India 2 (1): 49–55. http://ir.isical.ac.in/dspace/handle/1/1268. Retrieved 2008-11-05

Nemenman I (2004). Information theory, multivariate dependence, and genetic network inference

Osborne, Jason W. (2003). Effect sizes and the disattenuation of correlation and regression coefficients: lessons from educational psychology. Practical Assessment, Research & Evaluation, 8(11).

Qian, Edward and Ronald Hua. “Active Risk and the Information Ratio”, Journal of Investment Management, Third Quarter 2004.

Ramchand, L. and Susmel, R. (1998) ‘Volatility and Cross Correlation across major Stock Markets’, Journal of Empirical Finance, Vol. 5, No. 4, pp.397–416.

Ratner, M. (1992) ‘Portfolio Diversification and the inter-temporal Stability of International Indices’, Global Finance Journal, Vol. 3, pp.67–78.

Sheedy, E. (1997) ‘Is Correlation constant after all? (A Study of multivariate Risk Estimation for International Equities)’, working paper.

Sneath PHA & Sokal RR (1973) Numerical Taxonomy. Freeman, San Francisco.

Tang, G. (1995) ‘Intertemporal Stability in International Stock Market Relationships: A Revisit’, The Quarterly Review of Economics and Finance, Vol. 35 (Special), pp.579–593.

Sharpe W. F., "Morningstar’s Risk-adjusted Ratings", Financial Analysts Journal, July/August 1998, p. 21-33.

Viterbi, A. (1967) ‘Error Bounds for convolutional Codes and an asymptotically Optimum Decoding Algorithm’, IEEE Transactions on Information Theory, Vol. 13, No. 2, pp.260–269.

Watanabe S (1960). Information theoretical analysis of multivariate correlation, IBM Journal of Research and Development 4, 66-82.

Yule, 1926. G.U. Yule, Why do we sometimes get nonsense-correlations between time series?. Journal of the Royal Statistical Society 89 (1926), pp. 1–69.