ROBUST STATISTICS IN PORTFOLIO CONSTRUCTION

R. Douglas Martin
Professor of Applied Mathematics & Statistics
Director of Computational Finance Program

University of Washington

[email protected]

Parts of R-Finance Conference 2012 Tutorial for R-Finance 2013, Chicago, May 17-18

1. Why robust statistics in finance?

2. R robust library

3. Robust factor models

4. Robust volatility clustering models

5. Robust covariance & correlation - next talk

Appendix: Robustness concepts & theory


1. WHY ROBUST STATS IN FINANCE?

All classical estimates are vulnerable to extreme distortion by outliers, and financial data contain outliers to varying degrees. You need robust estimates that provide:

Little influence from outliers

Good model fit to the bulk of the data

Reliable detection of returns outliers

Stable inference and prediction
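As a small illustration of the first point, the sketch below compares how far the sample mean and the sample median move when a single extreme return is appended. The return values and the helper name `shift` are made up for illustration and are not from the talk.

```python
from statistics import mean, median

# Hypothetical monthly returns; the last value of with_outlier is one extreme loss.
clean_returns = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02, -0.005, 0.0, 0.01]
with_outlier = clean_returns + [-0.60]


def shift(estimator, before, after):
    """Absolute change in an estimate caused by the added observation."""
    return abs(estimator(after) - estimator(before))


mean_shift = shift(mean, clean_returns, with_outlier)      # classical location
median_shift = shift(median, clean_returns, with_outlier)  # robust location

print(f"mean moved by   {mean_shift:.4f}")
print(f"median moved by {median_shift:.4f}")
```

The mean is dragged toward the outlier by roughly 6 percentage points in this toy sample, while the median moves by a quarter of a point.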


Important Points

Myth: Robust statistical methods throw away what may be the most important data values

Model fit is only the first step in the analysis, and robust statistics reliably detect outliers that are often completely missed by classical estimates. You can then check whether the outliers are important or just noise.

Do not use blindly in risk applications

Outlier rejection can give a misleadingly optimistic view of risk, although it can sometimes lead to pessimism instead.


John Tukey (1979): “… just which robust methods you use is not important – what is important is that you use some. It is perfectly proper to use both classical and robust methods routinely, and only worry when they differ enough to matter. But when they differ, you should think hard.”

Robust Library Motivation: Make it possible for everyone to easily compute classical and robust estimates!

Caveat: It is important which robust method you use! Some are much better than others.

2. R PACKAGE robust

Originally created by Insightful under an NIH SBIR grant, with Doug Martin as P.I. and Kjell Konis (primary) and Jeff Wang as developers. Given over to R by Insightful with Kjell Konis as maintainer. The CRAN site shows:

Author: Jiahui Wang, Ruben Zamar, Alfio Marazzi, Victor Yohai, Matias Salibian-Barrera, Ricardo Maronna, Eric Zivot, David Rocke, Doug Martin, Martin Maechler, Kjell Konis.

Maintainer: Kjell Konis [email protected]

There is also the R package robustbase, with a lot of people working on it. Kjell has a goal of putting as many of the “robust” library methods (current and future) into “robustbase” as he can manage.


3. ROBUST FACTOR MODELS

More representative decomposition of risk into factor risk and specific risk

More representative covariance matrix for MV portfolio optimization, directly or via a factor model

Better betas

Robust asset pricing models


Optimal Bias Robust M-Estimates

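Time Series Factor Models for FoF’s (factor returns known)

$$\hat{\boldsymbol{\beta}}_k = \arg\min_{\boldsymbol{\beta}} \sum_{t=1}^{T} \rho_o\!\left(\frac{r_{k,t} - \mathbf{f}_t'\boldsymbol{\beta}}{\hat{s}_k}\right), \quad k = 1, \ldots, K$$

Fundamental Factor Models (exposures/betas known)

$$\hat{\mathbf{f}}_t = \arg\min_{\mathbf{f}} \sum_{k=1}^{K} \rho_o\!\left(\frac{r_{k,t} - \boldsymbol{\beta}_k'\mathbf{f}}{\hat{s}_t}\right), \quad t = 1, \ldots, T$$

$\rho_o$ must be bounded, hence non-convex, for robustness, which requires sophisticated optimization.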

Optimal* Rho and Psi

[Figure: two panels showing the optimal rho and psi functions over the range -5.0 to 5.0, each with curves for 85%, 90% and 95% efficiency.]

*Minimizes the maximum bias due to outliers, with only somewhat higher variance than OLS when returns are normally distributed. See Yohai and Zamar (1997), Svarc, Yohai and Zamar (2002).
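To make "bounded rho" concrete, here is a minimal sketch using Tukey's bisquare family. This is a stand-in chosen for its simple closed form, not the optimal rho and psi of Yohai and Zamar shown in the slides. The tuning constant c = 4.685 is the standard bisquare value for 95% efficiency at the normal distribution; the function names are illustrative.

```python
def rho_bisquare(x, c=4.685):
    """Tukey bisquare loss: roughly quadratic near 0, constant for |x| >= c."""
    if abs(x) >= c:
        return c * c / 6.0
    u = (x / c) ** 2
    return (c * c / 6.0) * (1.0 - (1.0 - u) ** 3)


def psi_bisquare(x, c=4.685):
    """Derivative of rho_bisquare: redescends to exactly 0 for |x| >= c."""
    if abs(x) >= c:
        return 0.0
    return x * (1.0 - (x / c) ** 2) ** 2


for x in (0.5, 2.0, 4.0, 6.0, 50.0):
    print(x, round(rho_bisquare(x), 4), round(psi_bisquare(x), 4))
```

Because rho is constant beyond c, a residual of 6 and a residual of 50 cost the same, which is what keeps a single extreme return from dominating the fit. The price is a non-convex objective with possible local minima.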

Robust vs LS Betas

[Figure: scatter plot of EVST returns versus market returns with fitted lines. LS beta = 3.17, robust beta = 1.13.]

$$r_t = \alpha + \beta\, r_{m,t} + \varepsilon_t, \quad t = 1, \ldots, T$$

CLASSICAL BETA = 3.17

This is what you get today: a very misleading result, caused by one outlier, that predicts future betas poorly!

ROBUST BETA = 1.13

Fits bulk of data to accurately describe the typical risk and return, and predicts future betas.


See Martin and Simin (2003), Financial Analysts Journal
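The effect of one outlier on a market-model beta can be reproduced with a small sketch. It is not the estimator used for the figures: it assumes a Theil-Sen starting fit, a fixed residual scale from the normalized median absolute residual, and bisquare reweighting (c = 4.685). The data are synthetic, built around a true beta of 1.1 with one extreme point.

```python
from statistics import median


def wls(x, y, w=None):
    """(Weighted) least squares fit of y = a + b*x; returns (a, b)."""
    w = w or [1.0] * len(x)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def theil_sen(x, y):
    """Robust start: median of pairwise slopes, then median intercept."""
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n) if x[j] != x[i]]
    b = median(slopes)
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return a, b


def bisquare_weight(u, c=4.685):
    return (1.0 - (u / c) ** 2) ** 2 if abs(u) < c else 0.0


def robust_beta(x, y, n_iter=20):
    """Bisquare M-estimate by iteratively reweighted least squares."""
    a, b = theil_sen(x, y)
    # Scale is estimated once from the starting fit and then held fixed.
    scale = 1.4826 * median(abs(yi - a - b * xi) for xi, yi in zip(x, y))
    for _ in range(n_iter):
        w = [bisquare_weight((yi - a - b * xi) / scale) for xi, yi in zip(x, y)]
        a, b = wls(x, y, w)
    return a, b


# Synthetic data: beta = 1.1, small alternating noise, one outlier at the end.
market = [-0.05 + 0.01 * i for i in range(11)] + [-0.15]
noise = [0.002 if i % 2 == 0 else -0.002 for i in range(11)]
stock = [1.1 * m + e for m, e in zip(market, noise)] + [-0.90]

ls_alpha, ls_beta = wls(market, stock)
rob_alpha, rob_beta = robust_beta(market, stock)
print(f"LS beta = {ls_beta:.2f}, robust beta = {rob_beta:.2f}")
```

The robust fit gives the outlier zero weight, so its beta reflects the other eleven points, while the LS beta is pulled far above 1.1.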

Robust vs. LS Market Models*

[Figure: four scatter plots of stock returns (%) versus market returns (%), each with robust and OLS fits.]

(a) MER: robust $\hat\beta = 1.8 \pm 0.09$, OLS $\hat\beta = 1.5 \pm 0.1$

(b) EDS: robust $\hat\beta = 1.41 \pm 0.18$, OLS $\hat\beta = 2.03 \pm 0.26$

(c) VHI: robust $\hat\beta = 0.63 \pm 0.23$, OLS $\hat\beta = 1.16 \pm 0.31$

(d) DD: robust $\hat\beta = 1.2 \pm 0.128$, OLS $\hat\beta = 1.19 \pm 0.076$; one point is labeled 20-Oct-1987

* Bailer, Maravina and Martin (2012)


Robust Regression of Returns vs. Size

Fama and French (1992) results: equity returns are negatively related to firm size when using LS

Knez and Ready (1997) results: returns are positively related to size for the vast majority of the data when using LTS regression

The positive relationship is fairly constant for all trimming fractions between 50% and 5%, and it remains positive even at 1% trimming. Furthermore, there is little loss of efficiency in the 1% to 5% trimming range.
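A minimal sketch of least trimmed squares (LTS) for one regressor shows how trimming can reverse the sign of a slope. It assumes random two-point starts followed by concentration steps (refit on the h points with the smallest squared residuals), in the spirit of FAST-LTS; it is not the procedure Knez and Ready used, and the data and parameter defaults are invented.

```python
import random


def ols(points):
    """Least squares fit of y = a + b*x on a sequence of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return my - b * mx, b


def lts(points, trim=0.25, n_starts=50, n_csteps=20, seed=0):
    """LTS line: minimize the sum of the h smallest squared residuals."""
    rng = random.Random(seed)
    h = len(points) - int(trim * len(points))  # number of points kept
    best = None
    for _ in range(n_starts):
        subset = rng.sample(points, 2)
        if subset[0][0] == subset[1][0]:
            continue  # vertical line through the start pair, skip it
        for _ in range(n_csteps):
            a, b = ols(subset)
            ranked = sorted(points, key=lambda p: (p[1] - a - b * p[0]) ** 2)
            subset = ranked[:h]  # concentration step
        a, b = ols(subset)
        crit = sum((y - a - b * x) ** 2 for x, y in subset)
        if best is None or crit < best[0]:
            best = (crit, a, b)
    return best[1], best[2]


# 17 points on the line y = 1 + 0.5*x, plus 3 gross outliers at large x.
data = [(float(x), 1.0 + 0.5 * x) for x in range(1, 18)]
data += [(18.0, -50.0), (19.0, -50.0), (20.0, -50.0)]

ls_intercept, ls_slope = ols(data)
lts_intercept, lts_slope = lts(data)
print(f"LS slope = {ls_slope:.2f}, LTS slope = {lts_slope:.2f}")
```

Three of twenty points are enough to turn the LS slope negative here, while the LTS line with 25% trimming follows the positive relation in the remaining 85% of the data.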

Monthly Returns July 1963

[Figure: equity returns versus firm size, with robust and least squares lines.]

Monthly Returns January 1980

[Figure: equity returns versus firm size, Jan. 1980, with the least squares line and the robust least-trimmed-squares line.]

Time Series of Slope Coefficients

[Figure: two time series panels of slope coefficients from regressions of returns on size, time in months with ticks from Jul 63 to Aug 90: LS slope coefficients and robust LTS slope coefficients.]

Fama-French: t-statistic indicates negative relation

Knez-Ready: t-statistic indicates positive relation (for vast majority of firms)!

Robust Fundamental Factor Models

Now in the R package “factorAnalytics”. Originally created by Chris Green and others, with Doug Martin, and ported to R by Guy Yollin. Likely to see further development.

Example of use in next slides

Robust versus Classical Factor Returns

Three risk factors: size, E/P, B/M, monthly returns

[Figure: time series of classical and robust factor returns, quarterly ticks from 1999 to 2003, with panels BOOK2MARKET.MM, EARN2PRICE and LOG.MARKET.CAP.MM.]

Residuals Cross-Section Correlations

[Figure: densities of residual correlations, classical versus robust; horizontal axis residual correlations, vertical axis density.]


4. ROBUST VOLATILITY ESTIMATES

[Figure: ROH returns and ROH classic EWMA volatility, quarterly ticks from Q1 2000 to Q1 2002.]

Classic EWMA:

$$\hat\sigma^2_{t+1} = \lambda\,\hat\sigma^2_t + (1-\lambda)\, r^2_{t+1}, \quad t \ge t_o$$

Over-estimates volatilities after outlier returns!
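The over-estimation is easy to reproduce with a minimal sketch of the classic EWMA recursion, in which each new variance is a weighted average of the previous variance and the latest squared return. The decay value 0.94, the starting variance and the return series are assumptions for illustration, not values from the talk.

```python
def classic_ewma(returns, init_var, lam=0.94):
    """Classic EWMA variance path: var <- lam * var + (1 - lam) * r**2."""
    var = init_var
    path = []
    for r in returns:
        var = lam * var + (1.0 - lam) * r * r
        path.append(var)
    return path


# Quiet market (1% moves) with a single 10% move in the middle.
returns = [0.01] * 5 + [0.10] + [0.01] * 5
variances = classic_ewma(returns, init_var=1e-4)
for r, v in zip(returns, variances):
    print(f"return {r:.2f} -> volatility {v ** 0.5:.4f}")
```

After the single 10% move the volatility estimate jumps from 1% to about 2.6% and is still above 2% five periods later, even though every later return is back at the 1% level.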


Robust EWMA Volatility Estimates

[Figure: ROH classic EWMA volatility and ROH robust EWMA volatility, quarterly ticks from Q1 2000 to Q1 2002, both on a 0.00 to 0.05 scale.]

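Robust EWMA

$$\hat\sigma^2_{t+1} = \begin{cases} \lambda\,\hat\sigma^2_t + (1-\lambda)\, r^2_{t+1}, & \text{if } |r_{t+1}| \le a\,\hat\sigma_t \\ \hat\sigma^2_t, & \text{if } |r_{t+1}| > a\,\hat\sigma_t \end{cases}$$

Unusual Movement Test Statistic

$$UMT_t = \frac{|r_t|}{\hat\sigma_t}$$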
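A minimal sketch of the rejection idea: the variance is updated only when the new return lies within a multiple `a` of the current volatility, and is otherwise carried forward unchanged. The unusual movement statistic is taken here to be the absolute return divided by the volatility estimate after that step. The values `lam=0.94` and `a=2.5`, the starting variance and the data are assumptions for illustration.

```python
from math import sqrt


def robust_ewma(returns, init_var, lam=0.94, a=2.5):
    """Robust EWMA variance path and unusual movement statistics."""
    var = init_var
    variances, umt = [], []
    for r in returns:
        if abs(r) <= a * sqrt(var):
            var = lam * var + (1.0 - lam) * r * r  # ordinary EWMA update
        # else: outlier return, keep the previous variance
        variances.append(var)
        umt.append(abs(r) / sqrt(var))
    return variances, umt


# Quiet market (1% moves) with a single 10% move in the middle.
returns = [0.01] * 5 + [0.10] + [0.01] * 5
variances, umt = robust_ewma(returns, init_var=1e-4)
for r, v, u in zip(returns, variances, umt):
    print(f"return {r:.2f} -> volatility {sqrt(v):.4f}, UMT {u:.1f}")
```

The 10% move leaves the volatility estimate at 1% and shows up as a UMT of about 10. A hard rejection rule like this will also ignore a genuine jump in the volatility level, so it should not be applied blindly in risk applications.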


[Figure: ROH unusual movement test (UMT) and ROH robust unusual movement test (robust UMT), quarterly ticks from Q1 2000 to Q1 2002.]

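Robust EWMA: Scherer and Martin (2005), Section 6.4.2

[Equations garbled in extraction and not recoverable. The surviving symbols are a power p, the time index t, returns r, a constant C defined through an expectation E of r, and a distribution S.]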

Robust EWMA and GARCH References

• Robust GARCH

– Franses, van Dijk and Lucas (1998)

– Gregory and Reeves (2001)

– Park (2002)

– Muler and Yohai (2006)

– Boudt and Croux (2006)

• Stable Distribution EWMA

– Stoyanov (2005)

5. ROBUST COVARIANCES

Uses in Portfolio Management

Exploring asset returns correlations

Detecting multi-dimensional outliers

Robust mean-variance portfolio optimization

Types of Estimates

M-estimates (Maronna, 1976)

Min. covariance det. (MCD) (Rousseeuw, 197x)

Pairwise estimates (Maronna & Zamar, 197x)

Robust Distances

$$d_t^2 = (\mathbf{r}_t - \hat{\boldsymbol{\mu}})'\,\hat{\boldsymbol{\Sigma}}^{-1}\,(\mathbf{r}_t - \hat{\boldsymbol{\mu}}) = \mathbf{z}_t'\mathbf{z}_t$$

Euclidean distance in new “spherized” coordinate system.

Classical version uses classical sample mean and sample covariance estimates. Replace them with highly robust versions, e.g., sample median and Fast Minimum Covariance Determinant (MCD) estimate.

So-called Mahalanobis distance
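The difference between classical and robust distances can be sketched in two dimensions. The robust location and covariance below come from a brute-force raw MCD, meaning the subset of h points whose sample covariance has the smallest determinant, found by trying every subset. That is only feasible for tiny samples, applies no consistency correction, and is not the Fast MCD algorithm. The data and function names are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def mean_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def distance(point, center, cov):
    """Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    return sqrt((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)


def brute_force_mcd(points, h):
    """Raw MCD: mean and covariance of the h-subset with smallest determinant."""
    def det_of(subset):
        _, (sxx, sxy, syy) = mean_cov(subset)
        return sxx * syy - sxy ** 2
    return mean_cov(min(combinations(points, h), key=det_of))


# Eight positively correlated points plus two outliers off the main axis.
points = [(-1.0, -1.2), (-0.5, -0.4), (0.0, 0.1), (0.5, 0.4), (1.0, 1.1),
          (-0.8, -0.6), (0.3, 0.5), (0.9, 0.7), (10.0, -10.0), (9.0, -11.0)]

classical = [distance(p, *mean_cov(points)) for p in points]
robust = [distance(p, *brute_force_mcd(points, h=8)) for p in points]
for p, dc, dr in zip(points, classical, robust):
    print(p, f"classical {dc:.2f}", f"robust {dr:.2f}")
```

The two outliers inflate the classical covariance so much that their own classical distances stay below 3, while their robust distances are larger by orders of magnitude. This is the masking effect that robust distances are meant to avoid.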

Fundamental Factors Multi-D Outliers

Fundamental factor model context

4-D example: Size, B/M, E/P, Momentum

Monthly data 1995-2006

1,046 equities


Martin, R. D., Clark, A. and Green, C. G. (2010).

Robust vs Classical Outlier Detection

Month ending 7/31/2002

[Figure: classical distances after 5% Winsorization and robust distances after 5% Winsorization, plotted against index (0 to 1000).]

Number of 4-D Outliers Detected

[Figure: number of outliers detected by date, 1995 to 2006, for classical distances after 5% Winsorization and robust distances after 5% Winsorization.]

Appendix. Robustness Concepts & Theory

Efficiency Robustness

Bias Robustness

Other Concepts

Min-max robustness

Continuity

Bounded influence

Breakdown point

Standard Outlier Generating Model

$$F = (1-\varepsilon)\,F_\theta + \varepsilon H$$

$F_\theta$: Nominal parametric distribution

$H$: Unknown asymmetric, or non-elliptical distributions

$\varepsilon$: Unknown, often “smallish” (.01 to .02-.05), but we need protection for “large” values up to .5.

Robustness: Doing well near a parametric model. In most applications $F_\theta$ is a normal distribution.

Efficiency Robustness*

High efficiency when the data distribution F is normal and also when F is “nearly normal” with fat tails, where

$$\mathrm{EFF}(\hat\theta_{ROBUST}, F) = \frac{\mathrm{var}(\hat\theta_{MLE}, F)}{\mathrm{var}(\hat\theta_{ROBUST}, F)}$$

NOTE: Tukey favored an empirical “tri-efficiency” metric with one distribution normal, one mildly fat-tailed (normal mixture) and one very fat-tailed (Cauchy tails), and sometimes used the “best known” estimate in place of the MLE.

* Tukey (1960). “A Survey of Sampling from Contaminated Distributions”
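The efficiency ratio can be estimated by simulation. The sketch below does this for the sample median as the robust estimate and the sample mean as the MLE of location under a standard normal F. The sample size, replication count and seed are arbitrary choices, and the median is used only because it is simple, not because it is a recommended high-efficiency estimate.

```python
import random
from statistics import median, variance


def relative_efficiency(n=51, n_rep=2000, seed=1):
    """Monte Carlo estimate of var(mean) / var(median) at the normal model."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(n_rep):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(sum(sample) / n)   # MLE of location under normality
        medians.append(median(sample))  # robust alternative
    return variance(means) / variance(medians)


eff = relative_efficiency()
print(f"estimated efficiency of the median at the normal: {eff:.2f}")
```

The estimate should land near the asymptotic value 2/π ≈ 0.64, which is why estimates tuned to 85% to 95% normal efficiency are preferred over the median when efficiency matters.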

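Bias Robustness

Want close to smallest attainable MSE:

$$\mathrm{MSE}(T_n, \varepsilon) = \mathrm{VAR}(T_n, \varepsilon) + B^2(T_n, \varepsilon)$$

Since

$$\mathrm{VAR}(T_n, \varepsilon) \to 0 \quad \text{as } n \to \infty,$$

choose $T_n$ to minimize $B(T_n, \varepsilon)$ for $F = (1-\varepsilon)F_\theta + \varepsilon H$.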

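Maximum Bias Curves

$B(T, \varepsilon)$: Maximum bias of T over all H in $F = (1-\varepsilon)F_\theta + \varepsilon H$ (H may be asymmetric)

[Figure: maximum bias curves for estimates T1, T2 and T3, with “breakdown” points BP1 and BP2 = .5 marked, and an annotation “All classical estimates”.]

GES: influence function maximum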
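A breakdown point can be illustrated numerically by replacing a growing number of observations with an arbitrarily large value and watching when the estimate follows it. This point-mass contamination is only a crude proxy for the maximum over all H, and the data and helper name are invented for illustration.

```python
from statistics import mean, median


def worst_case_shift(estimator, clean, n_bad, bad_value=1e9):
    """Change in the estimate when the last n_bad points become bad_value."""
    contaminated = clean[:len(clean) - n_bad] + [bad_value] * n_bad
    return abs(estimator(contaminated) - estimator(clean))


# 21 well-behaved returns between -2% and +2%.
clean = [-0.02 + 0.002 * i for i in range(21)]

for n_bad in (1, 5, 10, 11):
    print(n_bad,
          f"mean shift {worst_case_shift(mean, clean, n_bad):.3g}",
          f"median shift {worst_case_shift(median, clean, n_bad):.3g}")
```

The mean is carried off by a single bad point (breakdown point 0), while the median stays inside the range of the good data until more than half of the 21 points are replaced (breakdown point .5).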


Min-Max Bias Robust Estimates

Location estimation (Huber, 1964)

− Sample median (“Leads to uneventful theory”)

Scale estimation (Martin and Zamar, 1989, 1991)

− For nominal exponential model: adjusted median

− Median absolute deviation about the median (MADM)*

− Shortest half of the data (SHORTH)*

Regression estimation: (Martin, Yohai & Zamar, 1989) and many others, including Maronna, Martin & Yohai (2006)

* Approximately for all fractions of outliers, exact as the fraction approaches .5
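The two scale estimates named above are short to write down. This sketch computes the raw MADM and the length of the shortest half, defined here as the shortest interval covering floor(n/2) + 1 ordered points. Neither is rescaled for consistency at the normal model, and the function names and data are illustrative.

```python
from statistics import median, stdev


def madm(x):
    """Median absolute deviation about the median (unnormalized)."""
    m = median(x)
    return median([abs(v - m) for v in x])


def shorth_length(x):
    """Length of the shortest interval containing floor(n/2) + 1 points."""
    s = sorted(x)
    n = len(s)
    h = n // 2 + 1
    return min(s[i + h - 1] - s[i] for i in range(n - h + 1))


data = [1, 2, 3, 4, 5, 6, 100]  # one gross outlier
print("standard deviation:", round(stdev(data), 2))
print("MADM:", madm(data))
print("shortest half length:", shorth_length(data))
```

Both robust scales are determined by the six well-behaved values, while the standard deviation is dominated by the single outlier.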

References

Maronna, R., Martin, R. D. and Yohai, V. J. (2006). Robust Statistics: Theory and Methods, Wiley.

Martin, R. D., Clark, A. and Green, C. G. (2010). “Robust Portfolio Construction”, in Handbook of Portfolio Construction: Contemporary Applications of Markowitz Techniques, J. B. Guerard, Jr., ed., Springer.

Bailer, H., Maravina, T. and Martin, R. D. (2012). “Robust Betas for Asset Managers”, in The Oxford Handbook of Quantitative Asset Management, Scherer, B. and Winston, K., editors, Oxford University Press.

Chapter 6 of Scherer, B. and Martin, R. D. (2005). Modern Portfolio Construction, Springer.
