Large Dimensional Factor Modeling Based on High-Frequency Observations

Markus Pelger

Stanford University, Department of Management Science & Engineering

Market Microstructure and High-Frequency Data, June 3, 2017


Introduction Factor model Empirical Results Asset Pricing Leverage Effect Conclusion Appendix

Motivation

Systematic pattern in high-frequency data

[Figure: Log-prices of financial firms. Horizontal axis: time in days (0 to 45); vertical axis: log-price (−3.5 to 0.5).]

Log-price for financial institutions of the S&P500 for 2 months after September 16th 2008


Motivation

Systematic pattern in jumps of high-frequency data

[Figure: Jumps in log-prices of financial firms. Horizontal axis: time in days (0 to 45); vertical axis: log-price (−0.4 to 0.4).]

Estimated jumps of log-price for financial institutions of the S&P500 for 2 months after September 16th 2008


Motivation

Systematic pattern for unobservable spot volatility

[Figure: Spot volatility. Horizontal axis: days (0 to 300); vertical axis: spot volatility (0 to 150).]

Estimated spot volatility for firms in the S&P500 for 2008


Motivation

Motivation: Research questions

Understanding systematic risk with factors

Goal of this paper: Using high-frequency data to better understand systematic factor risk

Key elements of this paper:

1. Statistical factors instead of pre-specified (and potentially misspecified) factors

2. Analyze time-variation in systematic factors without imposing restrictions on time-dependency

Research questions:

1. How many systematic factors?
2. What are the factors?
3. How stable is the factor structure?
4. Are continuous factors different from jump factors?
5. Do factors explain the cross-section of asset returns?
6. Leverage effect due to systematic or idiosyncratic risk?


Motivation

This paper: Statistical theory for answering questions

Large-dimensional high-frequency factor analysis

Estimating factors, loadings and number of factors for large number of cross-sectional and high-frequency observations

Estimate unknown factors with principal component analysis

High-frequency data: Estimate factors for different short time horizons independently

⇒ analyze time-variation in factors

Approximate factor structure imposes only weak assumptions

Empirical application to U.S. equity market

  5-minute prices for the S&P500 firms from 2003 to 2012
  Daily implied volatilities for the S&P500 firms from 2003 to 2012


Motivation

Contribution of the Estimation Theory

Contributions (Theory)

1. Approximate factor analysis: Extend distribution theory of large dimensional factor analysis to high-frequency observations (Apply PCA to quadratic covariation instead of covariance)

2. Develop estimation strategy for continuous and jump factors (Truncated quadratic covariation matrix)

3. New estimator for number of factors (Perturbed eigenvalue ratio statistic)

4. New test for comparing different sets of factors (Sum of generalized correlation statistic)

⇒ Combining high-frequency econometrics, principal component analysis, large dimensional factor modeling


Empirical results

Contribution of the Empirical Application

Contributions (empirical)

1. Continuous factors: Stable factor structure

  4 continuous factors for 2007-2012: Market, oil, finance and electricity
  3 continuous factors for 2003-2006: Market, oil and electricity

2. Jump factors

  1 stable jump factor: Market

3. Cross-sectional asset pricing:

  Intraday factors explain intraday expected returns

4. Decomposing the leverage effect:

  Negative correlation of return and volatility due to systematic risk

⇒ Rules out financial leverage story


Literature

Literature (partial list)

Factor models for unknown factors for long horizons

  Bai et al. (2002, 2003): High-dimensional factor models
  Onatski (2010): Determining the number of factors
  Fan et al. (2013): Sparse matrices in factor modeling

High-frequency econometrics

  Bollerslev and Todorov et al. (2010, 2016): Continuous and jump market risk are different
  Jacod (2008): Asymptotic properties of functionals of semimartingales
  Aït-Sahalia and Jacod (2009), Lee and Mykland (2008), Mancini (2009): Estimating jumps

Large dimensional high-frequency factor modeling

  Aït-Sahalia and Xiu (2016): Sparsity assumption for continuous risk
  Fan, Furger and Xiu (2014): Known factors


High-frequency factor analysis: Setup

High-frequency factor analysis

High-frequency factor analysis: Setup

True log-price process in continuous time

$$X_i(t) = \underbrace{\Lambda_i}_{\text{loadings } (1 \times K)} \underbrace{F(t)}_{\text{factors } (K \times 1)} + \underbrace{e_i(t)}_{\text{idiosyncratic}}, \qquad i = 1, \dots, N, \quad t \in [0, T]$$

Observed log-price process at discrete time points

$$X_i(t_j) = \underbrace{\Lambda_i F(t_j)}_{\text{systematic}} + \underbrace{e_i(t_j)}_{\text{non-systematic}}, \qquad j = 1, \dots, M, \quad \text{with } \frac{T}{M} = t_{j+1} - t_j$$

  N assets (large)
  time horizon T (fixed)
  M high-frequency observations (large)
  K systematic factors (fixed)

Λ, F(t) and e(t) are unknown


High-frequency factor analysis: Setup

Quadratic covariation as a replacement for covariance

Conventional factor analysis

analyze principal components (eigenvalues) of covariance matrix

$$\operatorname{Cov}(X) = \frac{1}{T} \sum_{t=1}^{T} \bigl(X(t) - \bar{X}\bigr)^2 \quad \text{with} \quad \bar{X} = \frac{1}{T} \sum_{t=1}^{T} X(t)$$

covariance matrix cannot be estimated for fixed T

⇒ replace covariance by quadratic covariation

Quadratic covariation process: Sum of squared increments

X(t) is a semimartingale.

Partition [0, T] into M subintervals with mesh size $\Delta_M = \frac{T}{M} \to 0$

$$\sum_{j=1}^{M} \bigl(X(t_{j+1}) - X(t_j)\bigr)^2 \xrightarrow{p} \underbrace{[X, X]_T}_{\text{quadratic variation}}$$
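The convergence of the sum of squared increments can be illustrated with a minimal sketch. It assumes a Brownian log-price with constant volatility, which is a hypothetical special case chosen only because its quadratic variation is known to be sigma^2 T; the function names and parameter values are illustrative and do not come from the slides.

```python
import math
import random


def simulate_log_price(sigma, T, M, seed=0):
    """Simulate X(t_0), ..., X(t_M) for dX = sigma dW on an equidistant grid.

    Assumption: constant volatility and no jumps, so that [X, X]_T = sigma^2 * T.
    """
    rng = random.Random(seed)
    dt = T / M  # mesh size Delta_M = T / M
    path = [0.0]
    for _ in range(M):
        path.append(path[-1] + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return path


def realized_variation(path):
    """Sum of squared increments of a discretely observed path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


if __name__ == "__main__":
    sigma, T = 0.2, 1.0
    for M in (100, 1_000, 100_000):
        rv = realized_variation(simulate_log_price(sigma, T, M))
        print(f"M = {M:>7}: realized variation = {rv:.5f}, target = {sigma ** 2 * T:.5f}")
```

With T held fixed, refining the grid moves the realized variation toward sigma^2 T, whereas a sample covariance over a fixed horizon has no comparable limit.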


High-frequency factor analysis

Estimation approach

Estimation approach

M observations of the process X in the time interval [0,T ]

M →∞ and N →∞ and T fixed

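sampling frequency $\Delta_M = \frac{T}{M} = t_{j+1} - t_j \to 0$

$$X_{j,i} = X_{t_{j+1},i} - X_{t_j,i}, \qquad F_j = F_{t_{j+1}} - F_{t_j}, \qquad e_{j,i} = e_{t_{j+1},i} - e_{t_j,i}$$

$$\underset{(M \times N)}{X} = \underset{(M \times K)}{F}\; \underset{(K \times N)}{\Lambda^\top} + \underset{(M \times N)}{e}$$

We want to estimate Λ and F:

$$\hat{V}_{NM} := \text{diagonal matrix of the } K \text{ largest eigenvalues of } \frac{X^\top X}{N}$$

$$\hat{\Lambda} := \sqrt{N} \cdot \text{eigenvectors of } \frac{X^\top X}{N} \text{ corresponding to the eigenvalues in } \hat{V}_{NM}$$

$$\hat{F} := \frac{X \hat{\Lambda}}{N}$$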
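The three definitions above translate almost line by line into NumPy. The following is a sketch under stated assumptions rather than the author's implementation: the function name `pca_factor_estimates` and the simulated Gaussian data are hypothetical, and the number of factors K is taken as known.

```python
import numpy as np


def pca_factor_estimates(X, K):
    """PCA estimates from an M x N matrix X of log-price increments.

    Returns (V_hat, Lambda_hat, F_hat): the K largest eigenvalues of X'X / N
    as a diagonal matrix, sqrt(N) times the matching eigenvectors, and X Lambda_hat / N.
    """
    N = X.shape[1]
    eigval, eigvec = np.linalg.eigh(X.T @ X / N)   # symmetric N x N matrix
    top = np.argsort(eigval)[::-1][:K]             # positions of the K largest eigenvalues
    V_hat = np.diag(eigval[top])                   # K x K diagonal matrix
    Lambda_hat = np.sqrt(N) * eigvec[:, top]       # N x K estimated loadings
    F_hat = X @ Lambda_hat / N                     # M x K estimated factor increments
    return V_hat, Lambda_hat, F_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, N, K = 500, 100, 2                          # illustrative sizes only
    F = rng.standard_normal((M, K))
    Lam = rng.standard_normal((N, K))
    X = F @ Lam.T + 0.5 * rng.standard_normal((M, N))
    V_hat, Lambda_hat, F_hat = pca_factor_estimates(X, K)
    common = F @ Lam.T
    rel_err = np.linalg.norm(F_hat @ Lambda_hat.T - common) / np.linalg.norm(common)
    print("relative error of the estimated common component:", rel_err)
```

The demo compares the common component rather than the factors or loadings themselves, because the latter are recovered only up to an invertible linear transformation.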


High-frequency factor analysis

Approximate factor model

Assumptions for approximate factor model

Factor structure

$$\underset{N \times 1}{X(t)} = \underset{N \times K}{\Lambda}\; \underset{K \times 1}{F(t)} + \underset{N \times 1}{e(t)}$$

F and $e_i$ are Itô semimartingales with weak integrability conditions, F and $e_i$ are independent

$e_i$ are local martingales

Weak dependence of diversifiable risk: $[e, e]_T$ and $\langle e, e \rangle_T$ have bounded eigenvalues.

Identification criterion for $\Sigma_F = [F, F]_T$ and $\Sigma_\Lambda = \lim_{N \to \infty} \frac{\Lambda^\top \Lambda}{N}$: full rank and eigenvalues of $\Sigma_\Lambda \Sigma_F$ are distinct.

For some stronger results we need weaker serial and cross-sectional dependence of the idiosyncratic terms


High-frequency factor analysis

Results

Main results

Assume M,N →∞

Consistent estimation of the number of factors K

Consistent estimation of Λ

Consistent estimation of F

Consistent estimation of common component $C_{j,i} = F_j \Lambda_i^\top$

Under additional weak technical assumptions: Asymptotic mixed-normality of estimators

⇒ Curse of dimensionality turns into a “blessing”


High-frequency factor analysis

Separating jumps from continuous movements

Separating continuous and jump part

Intuition: Large increments = jumps

Truncation estimator for continuous part:

$$\hat{X}^C_{j,i} = X_{j,i}\, \mathbf{1}_{\{|X_{j,i}| \le a_{j,i} \Delta^{\omega}\}}$$

instead of $X_{j,i}$, with $a_{j,i} > 0$ and $\omega \in (0, \tfrac{1}{2})$

Truncation estimator for jump part:

$$\hat{X}^D_{j,i} = X_{j,i}\, \mathbf{1}_{\{|X_{j,i}| > a_{j,i} \Delta^{\omega}\}}$$

Set $a_{j,i} \Delta^{\omega} = a \cdot \sigma^C_{j,i} \cdot \Delta^{0.49}$ with $\sigma^C_{j,i}$ a local window estimator of volatility and a = 3, 4 and 4.5

⇒ Asymptotic results for loadings and number of factors hold for continuous and jump part separately for finite activity jumps
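A minimal sketch of the truncation rule for a single asset follows. One substitution is made and should be read as an assumption: instead of the local window volatility estimator, whose details the slides do not give, the sketch uses one volatility estimate per series based on bipower variation, a standard jump-robust measure. The function names, default values and simulated data are hypothetical.

```python
import math
import random


def bipower_volatility(increments, T):
    """Jump-robust volatility proxy: sqrt of (pi / 2) * sum |x_j| |x_{j-1}| / T.

    Assumption: used here in place of the local window estimator of the slides.
    """
    bv = (math.pi / 2.0) * sum(abs(a) * abs(b) for a, b in zip(increments, increments[1:]))
    return math.sqrt(bv / T)


def split_continuous_jump(increments, T, a=3.0, omega=0.49):
    """Split increments into continuous and jump parts with threshold a * sigma * Delta^omega."""
    delta = T / len(increments)
    threshold = a * bipower_volatility(increments, T) * delta ** omega
    cont = [x if abs(x) <= threshold else 0.0 for x in increments]
    jump = [x if abs(x) > threshold else 0.0 for x in increments]
    return cont, jump, threshold


if __name__ == "__main__":
    rng = random.Random(0)
    M, T, sigma = 5000, 1.0, 0.2
    inc = [sigma * math.sqrt(T / M) * rng.gauss(0.0, 1.0) for _ in range(M)]
    inc[1000] += 0.1                                # two artificial jumps
    inc[3000] -= 0.1
    cont, jump, thr = split_continuous_jump(inc, T)
    flagged = [j for j, x in enumerate(jump) if x != 0.0]
    print("threshold:", thr, "flagged increments:", flagged)
```

By construction the continuous and jump parts add back to the original increments, so the same PCA routine can then be run on each part separately.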


High-frequency factor analysis

Estimation of jump and continuous risk

Estimate jump and continuous factors

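$$X(t) = \Lambda^C F^C(t) + \Lambda^D F^D(t) + e(t).$$

Truncation estimator

Use truncation estimator to construct estimators for

$$\underbrace{[X, X]}_{\text{total risk covariance}}, \qquad \underbrace{[X, X]^C}_{\text{continuous risk covariance}}, \qquad \underbrace{[X, X]^D}_{\text{jump risk covariance}}$$

Apply PCA to $\frac{\hat{X}^{C\top} \hat{X}^C}{N}$ ⇒ $\hat{\Lambda}^C$ and $\hat{F}^C$

Apply PCA to $\frac{\hat{X}^{D\top} \hat{X}^D}{N}$ ⇒ $\hat{\Lambda}^D$ and $\hat{F}^D$

⇒ Estimators $\hat{\Lambda}^C$, $\hat{\Lambda}^D$, $\hat{F}^C$ and $\hat{F}^D$ are defined analogously to $\hat{\Lambda}$ and $\hat{F}$, but using $\hat{X}^C$ and $\hat{X}^D$ instead of X.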


High-frequency factor analysis

Number of factors

Theorem: Number of factors

Intuition: Number of factors = number of large eigenvalues

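Denote the ordered eigenvalues of $X^\top X$ by $\lambda_1 \ge \dots \ge \lambda_N$.

Choose sequence g(N, M) s.t. $\frac{g(N, M)}{N} \to 0$ and $g(N, M) \to \infty$.

I recommend $g(N, M) = \sqrt{N} \cdot \operatorname{median}(\lambda_1, \dots, \lambda_N)$.

Define perturbed eigenvalues and eigenvalue ratio statistics:

$$\hat{\lambda}_k = \lambda_k + g(N, M), \qquad ER_k = \frac{\hat{\lambda}_k}{\hat{\lambda}_{k+1}} \qquad \text{for } k = 1, \dots, N - 1$$

The estimator for the number of factors is for any c > 0:

$$\hat{K}(c) = \max\{k \le N - 1 : ER_k > 1 + c\}$$

⇒ Under weak assumptions it holds $\hat{K}(c) \xrightarrow{p} K$ for any c > 0.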
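The perturbed eigenvalue ratio estimator can be sketched in a few lines of NumPy. The function name, the default c = 0.2, the convention of returning 0 when no ratio exceeds 1 + c, and the simulated three-factor data are all assumptions made for illustration; the perturbation g(N, M) is the recommended choice stated above.

```python
import numpy as np


def perturbed_eigenvalue_ratio(X, c=0.2):
    """Estimate the number of factors from an M x N matrix X of increments.

    Returns (k_hat, er), where er[k - 1] is the ratio ER_k for k = 1, ..., N - 1.
    """
    N = X.shape[1]
    lam = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]   # lambda_1 >= ... >= lambda_N
    g = np.sqrt(N) * np.median(lam)                    # recommended perturbation g(N, M)
    lam_hat = lam + g                                  # perturbed eigenvalues
    er = lam_hat[:-1] / lam_hat[1:]                    # eigenvalue ratio statistics
    above = np.nonzero(er > 1.0 + c)[0]
    k_hat = int(above.max()) + 1 if above.size else 0  # 0 if no ratio exceeds 1 + c (assumed convention)
    return k_hat, er


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, N, K = 500, 100, 3                              # illustrative sizes only
    F = rng.standard_normal((M, K)) * np.array([2.0, 1.5, 1.0])
    Lam = rng.standard_normal((N, K))
    X = F @ Lam.T + rng.standard_normal((M, N))
    k_hat, er = perturbed_eigenvalue_ratio(X)
    print("estimated number of factors:", k_hat)
    print("first five ratios:", np.round(er[:5], 3))
```

The perturbation pushes the ratios of the small idiosyncratic eigenvalues toward 1, so only the gap after the last systematic eigenvalue stands out against the threshold 1 + c.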


High-frequency factor analysis

Identifying factors

Problem: Factors only identified up to invertible linear transformations

Need a measure for how close vector spaces are to each other

Generalized correlations

Set of $K_G$ economic candidate factors G

Generalized correlations are the square roots of the $\min(K, K_G)$ largest eigenvalues of the matrix $[F, F]^{-1} [F, G] [G, G]^{-1} [G, F]$

Example: Generalized correlations {1, 1, 0} for $K = K_G = 3$: ⇒ linear combination of G can replicate 2 factors in F

⇒ Use $\hat{F}$ for calculating generalized correlations

⇒ Asymptotic confidence intervals for sum of squared correlations: Total generalized correlation:

$$\operatorname{trace}\bigl([F, F]^{-1} [F, G] [G, G]^{-1} [G, F]\bigr).$$
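As a sketch of how the generalized correlation matrix can be evaluated, the function below replaces each quadratic covariation by its realized counterpart, the cross-product of increments. It returns the square roots of the largest eigenvalues (the canonical correlations) together with the trace, which equals the sum of squared correlations when K is at most K_G. The function name and the simulated factors are hypothetical.

```python
import numpy as np


def generalized_correlations(F, G):
    """Generalized correlations between factor increments F (M x K) and G (M x K_G).

    Returns (rho, total): rho holds the square roots of the min(K, K_G) largest
    eigenvalues of [F,F]^{-1}[F,G][G,G]^{-1}[G,F]; total is the trace of that matrix.
    """
    FF, GG, FG = F.T @ F, G.T @ G, F.T @ G          # realized quadratic covariations
    A = np.linalg.solve(FF, FG) @ np.linalg.solve(GG, FG.T)
    eig = np.sort(np.linalg.eigvals(A).real)[::-1]  # eigenvalues are real in theory
    k = min(F.shape[1], G.shape[1])
    rho = np.sqrt(np.clip(eig[:k], 0.0, 1.0))       # clip guards against rounding error
    return rho, float(np.trace(A))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = 2000
    F = rng.standard_normal((M, 3))
    # Candidate set that spans two of the three factors plus unrelated noise.
    G = np.column_stack([F[:, 0] + F[:, 1], F[:, 0] - F[:, 1], rng.standard_normal(M)])
    rho, total = generalized_correlations(F, G)
    print("generalized correlations:", np.round(rho, 3))
    print("total generalized correlation:", round(total, 3))
```

In the demo two correlations are close to 1 and one is close to 0, which mirrors the {1, 1, 0} example: the candidate set replicates two of the three factors.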


Data

Price data

Data

Time period: 2003 to 2012

Xi (t) is the log-price from the TAQ database

N between 500 and 600 firms from the S&P 500

5-min sampling: on average 250 days with 77 increments each

Data cleaning:

  Delete all entries with a time stamp outside 9:30am-4pm
  Delete entries with a transaction price equal to zero
  Retain entries originating from a single exchange
  Delete entries with corrected trades and abnormal sale condition
  Aggregate data with identical time stamp using volume-weighted average prices


Data

Log price of S&P 500 firms in 2012.

[Figure, three panels with days (0 to 300) on the horizontal axis and log price on the vertical axis: Log stock price in 2012; Systematic component; Idiosyncratic component.]

Decomposing prices into systematic and idiosyncratic component, 2012, K = 4.


Data

Continuous part of log price of S&P 500 firms in 2012.

[Figure, three panels with days (0 to 300) on the horizontal axis and log price on the vertical axis: Log stock price in 2012; Systematic component; Idiosyncratic component.]

Decomposing continuous movements in 2012 for K = 4 and a = 3.

Continuous movements are 85% of total variation.

Systematic part is 32% of continuous variation.

99.2% of movements continuous.


Data

Discontinuous part of log price of S&P 500 firms in 2012.

[Figure, three panels with days (0 to 300) on the horizontal axis and log price on the vertical axis: Log stock price in 2012; Systematic component; Idiosyncratic component.]

Decomposing discontinuous movements in 2012 for K = 1 and a = 3

Jump movements are 15% of total variation.

Systematic part is 25% of jump variation.

0.8% of movements jumps.


Number of continuous factors

Number of continuous factors

[Figure, three panels titled "Perturbed Eigenvalue Ratio". Horizontal axis: 2 to 20; vertical axis: perturbed ER (1 to 1.4). Panels show the years 2010-2012, 2007-2009 and 2003-2006, each with a critical value line.]

Number of continuous factors


Number of continuous factors

Identification of factors

Continuous factors

Intuition: Identify pattern in the loadings

4 economic candidate factors:

  Market (equally weighted)
  Oil and gas (40 equally weighted assets)
  Banking and Insurance (60 equally weighted assets)
  Electricity (24 equally weighted assets)


Number of continuous factors

Main result: Identification of factors

4 continuous factors with industry continuous factors    1.00  0.98  0.95  0.80
4 jump factors with industry jump factors                0.99  0.75  0.29  0.05
4 continuous factors with Fama-French Carhart factors    0.95  0.74  0.60  0.00

Table: Generalized correlations of first four largest statistical factors for 2007-2012 with economic factors

The number of generalized correlations close to 1 measures how many factors two sets have in common

Economic industry factors: Market, oil, finance, electricity

⇒ Jump structure different from continuous structure

⇒ Size, value, momentum do not explain factors


Number of continuous factors

Identification of factors

2007-2012  2007  2008  2009  2010  2011  2012
     1.00  1.00  1.00  1.00  1.00  1.00  1.00
     0.98  0.98  0.97  0.99  0.97  0.98  0.93
     0.95  0.91  0.95  0.95  0.93  0.94  0.90
     0.80  0.87  0.78  0.75  0.75  0.80  0.76

Generalized correlation of market, oil, finance and energy factors with first four largest statistical factors for 2007-2012

⇒ Stable continuous factor structure


Number of continuous factors

Identification of factors

2003  2004  2005  2006  2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.97  0.99  1.00  1.00  0.99  0.97  0.98  0.96  0.98  0.95
0.57  0.75  0.77  0.89  0.85  0.92  0.95  0.92  0.93  0.83
0.10  0.23  0.16  0.35  0.82  0.74  0.72  0.68  0.78  0.78

Generalized correlation of market, oil, finance and energy factors with first four largest statistical factors for 2003-2012

⇒ Finance factor disappears in 2003-2006


Cross-sectional Asset Pricing

Cross-sectional Asset Pricing

Arbitrage Pricing Theory (APT)

APT (Ross (1976), Chamberlain (1988), Reisman (1988)): Expected excess return is explained by systematic risk:

$$E[X_i] = E[F]\, \Lambda_i$$

Dynamic APT implies that expected intraday returns should be explained by systematic intraday risk (same for overnight)

Dynamic APT allows for time-varying loadings

Data

Expected returns from 2003 to 2012 for N = 304 stocks

Intraday, overnight and daily factors based on 4 continuous factors

Intraday, overnight and daily 3 Fama-French (market, size, value)

Weekly estimation of loadings


Cross-sectional Asset Pricing

Time-variation in loadings

Time-variation in loadings

PCA            0.983  0.953  0.924  0.827
Fama-French 3  0.979  0.789  0.569

Average generalized correlations of loadings estimated over weekly and over 10 year horizon

⇒ Time-varying loadings for Fama-French factors

Difference to Fama-French factors

Intraday   0.977  0.617  0.292
Daily      0.992  0.757  0.396
Overnight  0.922  0.527  0.097

Generalized correlations between PCA and Fama-French factors

⇒ PCA factors and Fama-French factors have only market in common


Cross-sectional Asset Pricing

Risk-Premium

           PCA    Fama-French 3  Market
Intraday   1.595  0.356          0.297
Overnight  1.649  2.001          1.362
Daily      0.723  0.536          0.388

Table: Maximum Sharpe-ratio as a linear combination of factors.

Value factor earns risk-premium mainly overnight

Intraday and overnight risk premia for PCA factors have opposite signs ⇒ Lower daily risk premium

Intraday minus overnight (long-short strategy) of PCA factors has maximum Sharpe-ratio of 2.13
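The maximum Sharpe ratio attainable by a linear combination of factors has the textbook closed form sqrt(mu' Sigma^{-1} mu), where mu and Sigma are the mean and covariance of the factor excess returns. The sketch below computes it; how the slides annualize the figures is not stated, so the `periods_per_year` argument, the function name and the simulated returns are assumptions for illustration only.

```python
import numpy as np


def max_sharpe_ratio(factor_returns, periods_per_year=1.0):
    """Maximum Sharpe ratio of a linear combination of factor excess returns.

    factor_returns is a T x K array. Returns (sharpe, weights), with weights
    proportional to Sigma^{-1} mu (the tangency combination, up to scale).
    """
    mu = factor_returns.mean(axis=0)
    sigma = np.atleast_2d(np.cov(factor_returns, rowvar=False))
    weights = np.linalg.solve(sigma, mu)
    sharpe = float(np.sqrt(mu @ weights))          # sqrt(mu' Sigma^{-1} mu)
    return sharpe * float(np.sqrt(periods_per_year)), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical daily excess returns of three factors.
    R = rng.standard_normal((2500, 3)) * 0.01 + np.array([0.0004, 0.0002, -0.0001])
    sr, w = max_sharpe_ratio(R, periods_per_year=252)
    print("annualized maximum Sharpe ratio:", sr)
    print("combination weights (up to scale):", w)
```

Because the combination is chosen in-sample, the resulting ratio is never below the absolute Sharpe ratio of any single factor in the set.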


Cross-sectional Asset Pricing

Cross-sectional asset pricing: Market factor

[Figure, three panels: Intraday (Market), Overnight (Market), Daily (Market). Horizontal axis: predicted return (-0.4 to 0.4); vertical axis: expected return (-0.4 to 0.4).]

⇒ Market factor fails to explain expected returns even with time-variation


Cross-sectional Asset Pricing

Cross-sectional asset pricing: 3 Fama-French factors

[Figure, three panels: Intraday (Fama-French 3), Overnight (Fama-French 3), Daily (Fama-French 3). Horizontal axis: predicted return (-0.4 to 0.4); vertical axis: expected return (-0.4 to 0.4).]

⇒ 3 Fama-French factors explain a larger fraction than the market factor


Cross-sectional Asset Pricing

Cross-sectional asset pricing: 4 PCA factors

[Figure, three panels: Intraday (PCA), Overnight (PCA), Daily (PCA). Horizontal axis: predicted return (-0.4 to 0.4); vertical axis: expected return (-0.4 to 0.4).]

⇒ 4 PCA factors explain intraday and overnight expected returns better than Fama-French factors

⇒ Asset pricing model still rejected


Leverage Effect

Leverage effect

Leverage effect = correlation between asset return and volatility

Nonparametric estimation of systematic and idiosyncratic leverage effect

Two estimation approaches:

  Daily implied volatilities and returns
  Spot volatilities based on HF data

Main finding: systematic risk drives the leverage effect

Important consequence: Argument for risk premium explanation and against leverage story

Average leverage effect

I measure the leverage effect with the continuous quadratic covariation:

$$\rho = \frac{[\sigma^2, X]^C_T}{\sqrt{[X, X]^C_T}\, \sqrt{[\sigma^2, \sigma^2]^C_T}}$$
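A realized version of this correlation replaces each continuous quadratic covariation by a sum of products of increments. The sketch below assumes that the jump increments have already been removed and that a variance series (implied or estimated spot variance) is available on the same grid as the log-price; the function names and simulated data are hypothetical.

```python
import math
import random


def realized_covariation(dx, dy):
    """Sum of products of increments, the realized analogue of [x, y]_T."""
    return sum(a * b for a, b in zip(dx, dy))


def leverage_effect(dx, dvar):
    """Realized correlation between log-price increments dx and variance increments dvar.

    Assumption: both series contain only continuous (truncated) increments.
    """
    num = realized_covariation(dvar, dx)
    den = math.sqrt(realized_covariation(dx, dx) * realized_covariation(dvar, dvar))
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    target = -0.6                                   # hypothetical true correlation
    dx, dvar = [], []
    for _ in range(20_000):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dx.append(0.01 * z1)
        dvar.append(0.002 * (target * z1 + math.sqrt(1.0 - target ** 2) * z2))
    print("estimated leverage effect:", leverage_effect(dx, dvar))
```

The same function can be applied to any pair of systematic, idiosyncratic or total components once the return and variance series have been decomposed.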


Leverage Effect

Leverage effect: Componentwise

Componentwise leverage effect:

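Decompose $X_i$ and $\sigma_i^2$ into systematic and idiosyncratic part:

$$X_i^{syst}, \quad X_i^{idio}, \quad \sigma_i^{2\,syst} \quad \text{and} \quad \sigma_i^{2\,idio}$$

Calculate correlation for each component:

$$\rho_i^{syst,syst} = \frac{[\sigma_i^{2\,syst}, X_i^{syst}]^C_T}{\sqrt{[X_i^{syst}, X_i^{syst}]^C_T}\, \sqrt{[\sigma_i^{2\,syst}, \sigma_i^{2\,syst}]^C_T}}$$

and similarly for $\rho_i^{syst,idio}$, $\rho_i^{idio,syst}$, $\rho_i^{idio,idio}$, $\rho_i^{syst,total}$, $\rho_i^{total,syst}$, $\rho_i^{total,total}$.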


Leverage Effect

Main Result: Componentwise leverage effect

[Figure: Componentwise leverage effect. Horizontal axis: 0 to 600; vertical axis: −0.6 to 0.4. Series: correlation of systematic return with total volatility; correlation of idiosyncratic return with total volatility; correlation of systematic return with idiosyncratic volatility.]

Cross-sectional distribution of componentwise leverage effect in 2012: 4 statistical return factors and 1 statistical volatility factor.


Leverage Effect

Leverage effect in 2012: Componentwise

[Figure: Componentwise leverage effect. Horizontal axis: 0 to 600; vertical axis: −0.8 to 0.6. Series: LEV(total,total), LEV(syst,total), LEV(idio,total), LEV(syst,syst), LEV(idio,idio), LEV(syst,idio), LEV(idio,syst).]

Cross-sectional distribution of componentwise leverage effect in 2012: 4 statistical return factors and 1 statistical volatility factor.


Conclusion

Conclusion

Methodology

Asymptotic statistics for a factor model of high dimension based on general continuous time processes

Quadratic variation of stochastic processes substitutes the covariance in the principal component analysis

Truncated quadratic variation estimates continuous and jump covariance matrix, which identifies the systematic jump risk factors

Empirical Results

Stable continuous factor structure ⇒ First three factors are market, oil and finance; 4th factor potentially electricity

1 stable jump market factor

Decomposition of leverage effect ⇒ negative correlation mainly due to systematic part


Differences to Long-Horizon Factor Models

After rescaling the increments, we can interpret the quadratic covariation estimator as a sample covariance estimator:

$$\sum_{j=1}^{M} (\Delta_j X_i)^2 = \frac{T}{M} \sum_{j=1}^{M} \left( \frac{\Delta_j X_i}{\sqrt{\Delta_M}} \right)^2$$

But the limiting object will be a random variable!

⇒ Path-wise instead of population arguments

Jumps lead to “heavy-tailed rescaled increments”

⇒ Cannot be accommodated in long-horizon model

Asymptotic distributions have a mixed Gaussian limit

  ⇒ Generally have heavier tails than a normal distribution.
  ⇒ Need stronger mode of convergence (stable convergence in law) for confidence intervals

Non-stationarity in stochastic volatility or stochastic intensity jump models


Page 41: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Data cleaning

Data

For each year take the intersection of stocks traded each day with the stocks that have been in the S&P500 index at any point during 1993-2012

Stock elimination criteria: a stock is dropped if any of the following conditions is true:

All of the first 10 5-min observations are missing on any day

There are in total more than 50 missing values before the first trade of each day

There are in total more than 500 missing values in the year

Year       2003    2004    2005    2006    2007    2008    2009    2010    2011    2012
Original   614     620     622     612     609     606     610     603     587     600
Cleaned    446     540     564     577     585     598     608     597     581     593
Dropped    27.36%  12.90%  9.32%   5.72%   3.94%   1.32%   0.33%   1.00%   1.02%   1.17%

Table: Observations after data cleaning

Page 42: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Data summary

         2003   2004   2005   2006   2007   2008   2009   2010   2011   2012

Percentage of increments identified as jumps
a=3      0.011  0.011  0.011  0.010  0.010  0.009  0.008  0.008  0.007  0.008
a=4      0.002  0.002  0.002  0.002  0.002  0.001  0.001  0.001  0.001  0.001
a=4.5    0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.000  0.001

Variation explained by jumps
a=3      0.19   0.19   0.19   0.16   0.21   0.16   0.16   0.15   0.12   0.15
a=4      0.07   0.07   0.07   0.05   0.10   0.06   0.06   0.06   0.03   0.05
a=4.5    0.05   0.04   0.05   0.04   0.08   0.04   0.05   0.05   0.02   0.04

Percentage of jump correlation explained by first 1 jump factor
a=3      0.05   0.03   0.03   0.03   0.06   0.07   0.08   0.19   0.12   0.06
a=4      0.03   0.02   0.02   0.04   0.08   0.06   0.08   0.25   0.09   0.08
a=4.5    0.03   0.03   0.02   0.05   0.09   0.06   0.08   0.22   0.12   0.09

Percentage of continuous correlation explained by first 4 continuous factors
         0.26   0.20   0.21   0.22   0.29   0.45   0.40   0.40   0.47   0.31

1 Fraction of increments identified as jumps for different thresholds.

2 Fraction of variation explained by jumps for different thresholds.

3 Systematic jump correlation for different thresholds.

4 Systematic continuous correlation

Page 43: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Stability of continuous factor structure for 6 years

2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00
1.00  1.00  1.00  1.00  1.00  1.00
0.99  0.99  1.00  1.00  1.00  0.98
0.97  0.96  0.98  0.99  0.99  0.98

Generalized correlations of yearly 4 continuous factors with 4 continuous factors for 2007-2012

⇒ Continuous factors are very stable over time

⇒ Same results for estimation with yearly or 6 year horizon

Page 44: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Stability of continuous factor structure for 10 years

2003  2004  2005  2006  2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.97  0.99  0.99  0.99  1.00  1.00  1.00  1.00  0.99  0.99
0.95  0.97  0.98  0.99  0.99  0.99  0.97  0.98  0.99  0.98
0.47  0.63  0.17  0.67  0.99  0.99  0.94  0.92  0.97  0.96

Generalized correlations of yearly 4 continuous factors with 4 continuous factors for 2003-2012

⇒ 4th continuous factor disappears in 2003-2006

Page 45: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Stability of continuous factor structure for 1 year

1     2     3     4     5     6     7     8     9     10    11    12
1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.99  0.99  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.99  0.99  0.99  0.99  0.99  1.00  1.00  0.99  0.99  1.00  0.99  0.99
0.98  0.93  0.99  0.97  0.98  0.98  0.98  0.99  0.99  0.96  0.90  0.96

Generalized correlation of monthly continuous factors with yearly continuous factors in 2011.

The yearly number of factors is K = 4.

⇒ Continuous factors are highly stable within a year horizon

⇒ Same results for estimation with yearly or monthly horizon

Page 46: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Identification of factors

Continuous factors

Intuition: Identify pattern in the loadings

4 economic candidate factors:

Market (equally weighted)

Oil and gas (40 equally weighted assets)

Banking and Insurance (60 equally weighted assets)

Electricity (24 equally weighted assets)

Page 47: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Identification of factors

2007-2012  2007  2008  2009  2010  2011  2012
1.00       1.00  1.00  1.00  1.00  1.00  1.00
0.98       0.98  0.97  0.99  0.97  0.98  0.93
0.95       0.91  0.95  0.95  0.93  0.94  0.90
0.80       0.87  0.78  0.75  0.75  0.80  0.76

Generalized correlation of market, oil, finance and energy factors with the four largest statistical factors for 2007-2012

⇒ Stable continuous factor structure

Page 48: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Identification of factors

2003  2004  2005  2006  2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.97  0.99  1.00  1.00  0.99  0.97  0.98  0.96  0.98  0.95
0.57  0.75  0.77  0.89  0.85  0.92  0.95  0.92  0.93  0.83
0.10  0.23  0.16  0.35  0.82  0.74  0.72  0.68  0.78  0.78

Generalized correlation of market, oil, finance and energy factors with the four largest statistical factors for 2003-2012

⇒ Finance factor disappears in 2003-2006

Page 49: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Robustness of identification factors: Daily data

Fama-French factors do not explain statistical factors

Use CRSP daily excess returns to form factor portfolios for 2007-2012:

4 statistical factors based on continuous loadings

4 economic industry portfolios

4 Fama-French Carhart Factors

Generalized correlations based on daily data

Generalized correlations with 4 economic industry factors: 1.00 0.97 0.92 0.79

Generalized correlations with 4 Fama-French Carhart Factors: 0.95 0.74 0.60 0.00

⇒ Fama-French factors do not explain statistical factors

Page 50: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Identification of factors: Statistical test

            4 statistical and 3 economic        4 statistical and 4 economic
            $\hat\rho$   SD      95% CI          $\hat\rho$   SD      95% CI
2007-2012   2.72   0.001   (2.71, 2.72)         3.31   0.003   (3.30, 3.31)
2007        2.55   0.06    (2.42, 2.67)         3.21   0.01    (3.19, 3.22)
2008        2.66   0.08    (2.51, 2.81)         3.18   0.29    (2.62, 3.75)
2009        2.86   0.10    (2.67, 3.05)         3.42   0.15    (3.14, 3.71)
2010        2.80   0.04    (2.72, 2.88)         3.38   0.01    (3.37, 3.39)
2011        2.82   0.00    (2.82, 2.82)         3.47   0.06    (3.35, 3.58)
2012        2.62   0.03    (2.56, 2.68)         3.25   0.01    (3.24, 3.26)

Total generalized correlations (= sum of squared generalized correlations) with standard deviations and confidence intervals

3 economic factors (market, oil and finance) and 4 economic factors (additional electricity factor).

Values of 3 and 4, respectively, indicate perfect replication

⇒ Reject hypothesis of perfect replication

Page 51: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of total factors

[Figure: perturbed eigenvalue ratio (vertical axis, 1 to 1.4) against k = 2, ..., 20 in three panels: 2010-2012, 2007-2009 and 2003-2006, each with the critical value.]

Number of total factors

Page 52: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Identification of jump factors

Discontinuous factors

Jump factors are different from continuous factors

Results for jump factors depend on jump threshold, while results for continuous factors are robust to this choice

Only 1 stable factor: equally weighted market jump factor

The higher the jump threshold, the less stable the jump factor structure

Page 53: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Interpretation of jump factors

        2007-2012  2007  2008  2009  2010  2011  2012

a=3     1.00       1.00  1.00  0.99  1.00  1.00  1.00
        0.85       0.95  0.62  0.86  0.81  0.86  0.83
        0.61       0.77  0.40  0.76  0.31  0.61  0.59
        0.21       0.10  0.22  0.50  0.10  0.20  0.28

a=4     0.99       0.99  0.95  0.94  1.00  0.99  0.99
        0.74       0.53  0.41  0.59  0.90  0.53  0.57
        0.31       0.35  0.29  0.44  0.39  0.35  0.42
        0.03       0.19  0.20  0.09  0.05  0.14  0.16

a=4.5   0.99       0.99  0.91  0.91  1.00  0.98  0.99
        0.75       0.54  0.41  0.56  0.93  0.55  0.75
        0.29       0.35  0.30  0.40  0.68  0.38  0.29
        0.05       0.18  0.22  0.04  0.08  0.03  0.05

Table: Generalized correlations of market, oil, finance and electricity jump factors with first 4 jump factors from 2007-2012

Page 54: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of jump factors

[Figure: perturbed eigenvalue ratio (vertical axis, 1 to 1.4) against k = 2, ..., 20 in three panels: 2010-2012, 2007-2009 and 2003-2006, each with the critical value.]

Number of jump factors with truncation level a = 3.

Page 55: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of jump factors

[Figure: perturbed eigenvalue ratio (vertical axis, 1 to 1.4) against k = 2, ..., 20 in three panels: 2010-2012, 2007-2009 and 2003-2006, each with the critical value.]

Number of jump factors with truncation level a = 4.

Page 56: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of jump factors

[Figure: perturbed eigenvalue ratio (vertical axis, 1 to 1.4) against k = 2, ..., 20 in three panels: 2010-2012, 2007-2009 and 2003-2006, each with the critical value.]

Number of jump factors with truncation level a = 4.5.

Page 57: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Stability of jump factor structure

        2007  2008  2009  2010  2011  2012

a=3     1.00  1.00  1.00  1.00  1.00  1.00
        0.96  1.00  0.95  0.98  0.76  0.85
        0.81  0.88  0.84  0.69  0.59  0.70
        0.12  0.81  0.14  0.25  0.05  0.17

a=4     1.00  1.00  0.99  1.00  0.99  0.99
        0.66  0.99  0.63  1.00  0.51  0.43
        0.34  0.52  0.09  0.97  0.43  0.20
        0.14  0.03  0.05  0.17  0.13  0.03

a=4.5   0.99  0.99  0.98  1.00  0.99  0.99
        0.79  0.97  0.77  1.00  0.40  0.49
        0.28  0.44  0.28  0.96  0.26  0.24
        0.05  0.16  0.01  0.53  0.11  0.07

Table: Generalized correlations of 4 largest yearly jump factors with 4 jump factors for 2007-2012

Page 58: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Comparison of factor portfolios based on HF data and daily returns

2003  2004  2005  2006  2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.99  0.97  0.99  0.99  0.99  1.00  0.99  0.98  0.99  0.99
0.98  0.94  0.94  0.97  0.95  0.98  0.98  0.98  0.98  0.97
0.55  0.65  0.83  0.17  0.47  0.76  0.98  0.96  0.93  0.93

Generalized correlations between continuous factors based on continuous data and daily data

Continuous factors are constructed based on loadings from HF and daily data

⇒ Daily returns yield loadings and factors that are similar to, but noisier than, those from continuous HF data.

Page 59: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of factors with daily CRSP returns

[Figure: perturbed eigenvalue ratio (vertical axis, 1 to 1.4) against k = 2, ..., 20 in three panels: 2010-2012, 2007-2009 and 2003-2006, each with the critical value.]

Number of factors based on daily CRSP returns

Page 60: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Estimating volatility

Estimating volatility with realized quadratic variation

Use quadratic variation for short horizon (e.g. a day) as estimator for real-world volatility

Estimating implied volatility with option data

Use at-the-money implied Black-Scholes volatility for short maturity options as estimator for risk-neutral volatility

Interpolate volatility surface to obtain theoretical at-the-money and short maturity implied volatility

Which volatility estimator?

For factor analysis and leverage effect, real-world and risk-neutral volatility are essentially equivalent

Implied volatility is more reliable for volatility of volatility and leverage effect estimation

Page 61: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Data

Data

Daily prices for standard call and put options from OptionMetrics

Same firms and time period as for HF data

Implied volatility for 30-day at-the-money options using interpolated volatility surface

Average the implied call and put volatilities

More robust than generalized implied volatility

Data cleaning: Remove volatilities greater than 200% of the average over a 31-day moving window centered at the day
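The cleaning rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `clean_implied_vol` is hypothetical, removed values are marked as NaN, the window is shortened at the sample edges, and the observation itself is included in the moving average. The slides do not specify these details.

```python
import numpy as np

def clean_implied_vol(iv, window=31, factor=2.0):
    """Mark as NaN every value above factor times its centered moving average."""
    iv = np.asarray(iv, dtype=float)
    half = window // 2
    cleaned = iv.copy()
    for t in range(len(iv)):
        lo, hi = max(0, t - half), min(len(iv), t + half + 1)
        local_mean = np.nanmean(iv[lo:hi])   # average over the centered window
        if iv[t] > factor * local_mean:
            cleaned[t] = np.nan
    return cleaned

# Toy series: flat 20% implied volatility with one spurious spike
iv = np.full(60, 0.2)
iv[30] = 5.0
cleaned = clean_implied_vol(iv)
print(np.flatnonzero(np.isnan(cleaned)))
```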

Year       2003   2004   2005   2006   2007   2008   2009   2010   2011   2012
Original   408    479    507    525    543    557    565    561    549    565
Cleaned    399    465    479    495    508    528    536    530    523    529
Dropped    2.21%  2.92%  5.52%  5.71%  6.45%  5.21%  5.13%  5.53%  4.74%  6.37%

Page 62: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of volatility factors

[Figure: perturbed eigenvalue ratio (vertical axis, 1 to 1.4) against k = 2, ..., 20 in three panels: 2010-2012, 2007-2009 and 2003-2006, each with the critical value.]

Number of volatility factors

Page 63: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Identification of volatility factors

Time-variation and identification of volatility factors

2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00
0.19  0.90  0.92  0.33  0.67  0.28
0.07  0.34  0.13  0.06  0.11  0.05
0.01  0.05  0.00  0.00  0.01  0.01

Table: Generalized correlation of market, oil and finance volatility factors with the first 4 largest statistical volatility factors

⇒ 1 stable market factor and 1 temporary finance factor

Page 64: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Leverage effect in 2012: Componentwise

[Figure: "Componentwise leverage effect"; horizontal axis 0 to 600, vertical axis −0.8 to 0.6. Series: LEV(total,total), LEV(syst,total), LEV(idio,total), LEV(total,total) (HF), LEV(syst,total) (HF), LEV(idio,total) (HF).]

Figure: Componentwise leverage effect in 2012 based on implied and high-frequency volatilities. 4 continuous asset factors.

Page 65: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Leverage effect in 2012: Componentwise

[Figure: "Componentwise leverage effect"; horizontal axis 0 to 600, vertical axis −0.8 to 0.6. Series: LEV(total,total) (cont), LEV(syst,total) (cont), LEV(idio,total) (cont), LEV(total,total) (day), LEV(syst,total) (day), LEV(idio,total) (day).]

Figure: Componentwise leverage effect in 2012 with daily continuous log price increments LEV (cont) and daily returns LEV (day) and 4 asset factors.

Page 66: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Leverage effect in 2012: Componentwise

[Figure: "Componentwise leverage effect"; horizontal axis 0 to 600, vertical axis −0.8 to 0.6. Series: LEV(total,total), LEV(syst,total) (stat), LEV(idio,total) (stat), LEV(syst,total) (FFC), LEV(idio,total) (FFC).]

Figure: Componentwise leverage effect in 2012 with 4 continuous daily factors LEV (stat) or 4 Fama-French-Carhart factors LEV (FFC).

Page 67: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Reversal Effect

[Figure: "Overnight reversal"; horizontal axis "Expected return" (−0.5 to 0.5), vertical axis "Expected intraday return" (−0.5 to 0.5).]

⇒ Expected overnight and intraday returns are strongly negatively correlated

Page 68: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Predicted Reversal Effect

[Figure: three panels (PCA, Fama-French 3, Market); horizontal axis "Predicted overnight return" (−0.5 to 0.5), vertical axis "Predicted intraday return" (−0.5 to 0.5).]

⇒ PCA and Fama-French factors predict reversal relationship in expected returns

Page 69: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Assumptions

Assumption 1: Weak dependence of error terms

The largest eigenvalue of the residual quadratic covariation matrix is bounded in probability, i.e.

$$\lambda_1([e, e]) = O_p(1).$$

Define the instantaneous predictable quadratic covariation as

$$\frac{d\langle e_i, e_k\rangle_t}{dt} =: G_{i,k}(t).$$

Largest eigenvalue of the matrix $G(t)$ is almost surely bounded for all t:

$$\lambda_1(G(t)) < C \text{ a.s. for all } t \text{ for some constant } C.$$

⇒ Approximate factor structure: Residual risk can be diversified

Page 70: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Factor structure assumptions

Assumption 2: Weaker dependence of error terms

The row sum of the quadratic covariation of the residuals is bounded in probability:

$$\sum_{i=1}^{N} \left\| [e_k, e_i] \right\| = O_p(1) \quad \forall k = 1, \dots, N$$

⇒ Stronger than bounded eigenvalues in Assumption 1.

Page 71: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Consistency of loadings

Theorem 1: Consistency of estimators

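Define the rate $\delta = \min(N, M)$. Then for $\delta \to \infty$:

1 Consistency of loadings estimator: Under Assumption 1

$$\hat\Lambda_i - H^\top \Lambda_i = O_p\left(\frac{1}{\sqrt{\delta}}\right).$$

2 Consistency of factor estimator and common component ($C_{T,i} = F_T \Lambda_i$ and $\hat C_{T,i} = \hat F_T \hat\Lambda_i$): Under Assumptions 1 and 2

$$\hat F_T - H^{-1} F_T = O_p\left(\frac{1}{\sqrt{\delta}}\right), \quad \hat C_{T,i} - C_{T,i} = O_p\left(\frac{1}{\sqrt{\delta}}\right).$$

⇒ Need both N and M to go to ∞

⇒ Curse of dimensionality turns into “blessing”.

⇒ Full-rank matrix H known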

Page 72: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Separating jumps from continuous movements

Separating continuous and jump part

Intuition: Large increments = jumps

Truncation estimator for continuous part:

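$$\hat X^C_{j,i} = X_{j,i}\, \mathbb{1}_{\{|X_{j,i}| \le a_{j,i} \Delta^{\omega}\}}$$

instead of $X_{j,i}$ with $a_{j,i} > 0$ and $\omega \in (0, \tfrac{1}{2})$

Truncation estimator for jump part:

$$\hat X^D_{j,i} = X_{j,i}\, \mathbb{1}_{\{|X_{j,i}| > a_{j,i} \Delta^{\omega}\}}$$

Set $a_{j,i} \Delta^{\omega} = a \cdot \sigma^C_{j,i} \cdot \Delta^{0.49}$ with $\sigma^C_{j,i}$ a local window estimator of volatility and a = 3, 4 and 4.5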
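The truncation rule can be sketched in a few lines. This is a hedged illustration, not the author's implementation: the function name `truncate_increments` is hypothetical, and the local window volatility estimator is replaced by a simpler per-asset bipower variation estimate over the whole sample, which is an assumption made to keep the example short.

```python
import numpy as np

def truncate_increments(dX, T=1.0, a=3.0, omega=0.49):
    """Split an M x N matrix of increments into continuous and jump parts."""
    M, N = dX.shape
    delta = T / M
    abs_dX = np.abs(dX)
    # Bipower variation per asset: a jump-robust estimate of integrated variance
    bipower = (np.pi / 2.0) * np.sum(abs_dX[1:] * abs_dX[:-1], axis=0) / T
    sigma_c = np.sqrt(bipower)                     # one volatility per asset
    threshold = a * sigma_c * delta ** omega       # a * sigma * Delta^omega
    is_continuous = abs_dX <= threshold            # broadcasts over rows
    XC = np.where(is_continuous, dX, 0.0)          # continuous increments
    XD = dX - XC                                   # increments flagged as jumps
    return XC, XD

rng = np.random.default_rng(2)
M, N, T, sigma = 1000, 5, 1.0, 0.2
dX = sigma * np.sqrt(T / M) * rng.standard_normal((M, N))
dX[100, 2] += 0.5                                  # inject one large jump
XC, XD = truncate_increments(dX, T=T, a=3.0)
print(XD[100, 2], np.mean(XD != 0))
```

With a = 3 a small share of purely continuous increments is also flagged, which is in line with the threshold sensitivity reported in the data summary.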

Page 73: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Estimation of jump and continuous risk

Estimate jump and continuous factors

$$X(t) = \Lambda^C F^C(t) + \Lambda^D F^D(t) + e(t).$$

Truncation estimator

Use truncation estimator to construct estimators for

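$[X, X]$ (total risk covariance), $[X, X]^C$ (continuous risk covariance), $[X, X]^D$ (jump risk covariance)

Apply PCA to $\frac{1}{N} \hat X^{C\top} \hat X^C$ ⇒ $\hat\Lambda^C$ and $\hat F^C$

Apply PCA to $\frac{1}{N} \hat X^{D\top} \hat X^D$ ⇒ $\hat\Lambda^D$ and $\hat F^D$

⇒ Estimators $\hat\Lambda^C$, $\hat\Lambda^D$, $\hat F^C$ and $\hat F^D$ are defined analogously to $\hat\Lambda$ and $\hat F$, but using $\hat X^C$ and $\hat X^D$ instead of $X$.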
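A minimal sketch of the PCA step follows. The function name `pca_factors`, the normalization (loadings equal to $\sqrt{N}$ times the leading eigenvectors, factors equal to $X\hat\Lambda/N$) and the simulated one-factor data are assumptions for illustration; the same function would be applied to the truncated continuous or jump increments.

```python
import numpy as np

def pca_factors(X, K):
    """X: M x N increments. Returns loadings (N x K) and factor increments (M x K)."""
    M, N = X.shape
    eigval, eigvec = np.linalg.eigh(X.T @ X / N)   # eigenvalues in ascending order
    top = np.argsort(eigval)[::-1][:K]             # indices of the K largest
    loadings = np.sqrt(N) * eigvec[:, top]         # loadings' loadings / N = I
    factors = X @ loadings / N
    return loadings, factors

rng = np.random.default_rng(3)
M, N = 300, 100
F = rng.standard_normal((M, 1)) / np.sqrt(M)       # true factor increments
Lam = 1.0 + 0.5 * rng.standard_normal((N, 1))      # true loadings
e = 0.5 * rng.standard_normal((M, N)) / np.sqrt(M) # idiosyncratic increments
X = F @ Lam.T + e

loadings, factors = pca_factors(X, K=1)
corr = abs(np.corrcoef(factors[:, 0], F[:, 0])[0, 1])
print(corr)
```

The absolute correlation is used because factors are identified only up to an invertible transformation, which for K = 1 is a sign and a scale.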

Page 74: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Assumptions

Assumption 3: Truncation identification

F and ei have only finite activity jumps

Factor jumps are not “hidden” by idiosyncratic jumps:

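$$P\left( \Delta X_i(t) = 0 \text{ if } \Delta(\Lambda_i^\top F(t)) \neq 0 \text{ and } \Delta e_i(t) \neq 0 \right) = 0.$$

(Always satisfied as soon as the Lévy measures of F and e have a density)

Continuous $[F^C, F^C]$ and jump $[F^D, F^D]$ covariation matrices and $\lim_{N\to\infty} \frac{\Lambda^{C\top} \Lambda^C}{N}$ and $\lim_{N\to\infty} \frac{\Lambda^{D\top} \Lambda^D}{N}$ are full rank.

(Systematic jump factors need to jump in [0, T] and affect many assets.)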

Page 75: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Separating jumps from continuous movements

Theorem 2: Separating continuous and jump factors

Assumptions 1 and 3 hold and $\delta \to \infty$:

1 The continuous and jump loadings can be estimated consistently:

$$\hat\Lambda^C_i = H^{C\top} \Lambda^C_i + o_p(1), \quad \hat\Lambda^D_i = H^{D\top} \Lambda^D_i + o_p(1).$$

2 Additionally Assumption 2. The continuous and jump factors can only be estimated up to a finite variation bias term:

$$\hat F^C_T = (H^C)^{-1} F^C_T + o_p(1) + \text{finite variation term}$$

$$\hat F^D_T = (H^D)^{-1} F^D_T + o_p(1) + \text{finite variation term}.$$

3 Additionally Assumption 2. Consistent estimation of the covariation with any Itô semimartingale $Y(t)$ for $\frac{\sqrt{M}}{N} \to 0$:

$$\sum_{j=1}^{M} \hat F^C_j Y_j = (H^C)^{-1} [F^C, Y]_T + o_p(1)$$

$$\sum_{j=1}^{M} \hat F^D_j Y_j = (H^D)^{-1} [F^D, Y]_T + o_p(1)$$

Page 76: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of factors

Estimating number of factors

Intuition: Number of factors = number of large eigenvalues

Denote the ordered eigenvalues of $X^\top X$ by $\lambda_1 \ge \dots \ge \lambda_N$.

Under Assumption 1 and for $O\left(\frac{N}{M}\right) \le O(1)$:

systematic eigenvalues $\lambda_1, \dots, \lambda_K = O_p(N)$

non-systematic eigenvalues $\lambda_k = O_p(1)$ for $k = K + 1, \dots, N$.

Possible estimation strategy: Look at eigenvalue ratios $\frac{\lambda_k}{\lambda_{k+1}}$

⇒ Problem: non-systematic eigenvalues not bounded from below

Popular estimators with good performance use arguments of random matrix theory to obtain

1 lower bounds on non-systematic eigenvalues

2 clustering of non-systematic eigenvalues

Random matrix theory assumptions too restrictive for our model

Page 77: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of factors

Theorem 3: Number of factors

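Assumption 1 and $O\left(\frac{N}{M}\right) \le O(1)$:

Choose sequence $g(N, M)$ s.t. $\frac{g(N,M)}{N} \to 0$ and $g(N, M) \to \infty$.

Define perturbed eigenvalues and eigenvalue ratio statistics:

$$\hat\lambda_k = \lambda_k + g(N, M), \qquad ER_k = \frac{\hat\lambda_k}{\hat\lambda_{k+1}} \quad \text{for } k = 1, \dots, N - 1$$

⇒ $ER_k$ clusters around 1 for $k = K + 1, \dots, N - 1$ and explodes for $k = K$

The estimator for the number of factors is for any $\gamma > 0$:

$$\hat K(\gamma) = \max\{k \le N - 1 : ER_k > 1 + \gamma\}$$

⇒ $\hat K(\gamma) \xrightarrow{p} K$ for any $\gamma > 0$.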

Additionally Assumption 3: Consistent estimation of KC and KD

Page 78: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Number of factors

Number of factors

Two tuning parameters

Perturbation: I recommend $g(N, M) = \sqrt{N} \cdot \operatorname{median}(\lambda_1, \dots, \lambda_N)$. (Results very robust to choice of perturbation)

Cut-off: γ between 0.05 and 0.2

My estimator focuses on the residual spectrum:

⇒ Robust to strong and weak factors

Perturbation avoids random matrix theory

⇒ Weaker assumptions and estimation of continuous and jump factors possible
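The perturbed eigenvalue ratio estimator with the recommended tuning parameters can be sketched as follows. The function name `perturbed_er_num_factors` and the simulated three-factor data are assumptions for illustration; the perturbation and cut-off follow the choices stated above.

```python
import numpy as np

def perturbed_er_num_factors(X, gamma=0.2):
    """X: M x N increments. Returns the estimated number of factors and the ER statistics."""
    M, N = X.shape
    eig = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]   # descending eigenvalues
    eig = np.maximum(eig, 0.0)                         # remove rounding negatives
    g = np.sqrt(N) * np.median(eig)                    # perturbation g(N, M)
    perturbed = eig + g
    er = perturbed[:-1] / perturbed[1:]                # ER_k for k = 1, ..., N - 1
    above = np.flatnonzero(er > 1.0 + gamma)
    k_hat = int(above[-1]) + 1 if above.size else 0    # largest k with ER_k > 1 + gamma
    return k_hat, er

rng = np.random.default_rng(4)
M, N, K = 200, 100, 3
F = rng.standard_normal((M, K)) / np.sqrt(M)           # factor increments
Lam = rng.standard_normal((N, K))                      # loadings
e = 0.5 * rng.standard_normal((M, N)) / np.sqrt(M)     # idiosyncratic increments
X = F @ Lam.T + e

k_hat, er = perturbed_er_num_factors(X, gamma=0.2)
print(k_hat)
```

Returning 0 when no ratio exceeds the cut-off is a convention chosen for this sketch; the slides do not discuss that case.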

Page 79: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Simulations

Number of factors

K = 3 factors

Dominant first factor: $\sigma_{dominant} = \sqrt{10}$

Large signal-to-noise ratio θ = 6

Cross-sectional correlation: Toeplitz matrix A with parameters $\{1, 0.5, 0.5, 0.5, 0.5^2\}$.

Toy model to make it comparable with other estimators that require stronger assumptions:

$$X(t) = \Lambda W_F(t) + \theta A W_e(t)$$

Page 80: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Simulations

Number of factors

ERP1: Perturbation $g(N, M) = \sqrt{N} \cdot \operatorname{median}\{\lambda_1, \dots, \lambda_N\}$, cutoff γ = 0.2

ERP2: Perturbation $g(N, M) = \log N \cdot \operatorname{median}\{\lambda_1, \dots, \lambda_N\}$, cutoff γ = 0.2

Onatski (2010): Clustering in eigenvalue-difference

Ahn and Horenstein (2013): Maximum in eigenvalue ratio

Bai and Ng (2002): Information criterion

Page 81: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Simulations: Number of Factors

[Figure: "Error in estimating the number of factors"; RMSE (vertical axis, 0 to 4) against N, M (horizontal axis, 20 to 220) for ER perturbed 1, ER perturbed 2, Onatski, Ahn and Bai.]

Figure: RMSE (root-mean squared error) for the number of factors for different estimators with N = M.

Page 82: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Simulations: Number of Factors

         ERP1   ERP2   Onatski   Ahn    Bai
RMSE     0.32   0.18   0.49      4.00   3.74
Mean     2.79   2.88   2.76      1.00   1.09
Median   3      3      3         1      1
SD       0.52   0.41   0.66      0.00   0.28
Min      1      1      1         1      1
Max      3      4      5         1      2

Table: N = M = 125.

Page 83: Large Dimensional Factor Modeling Based on High-Frequency … · 2018-10-01 · Large Dimensional Factor Modeling Based on High-Frequency Observations. Markus Pelger Stanford University


Simulations

Simulation parameters

Heston-type stochastic volatility model with jumps.

K factors are modeled as

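$$dF_k(t) = \left(\mu - \sigma_{F_k}^2(t)\right)dt + \rho_F\,\sigma_{F_k}(t)\,dW_{F_k}(t) + \sqrt{1-\rho_F^2}\,\sigma_{F_k}(t)\,d\tilde{W}_{F_k}(t) + J_{F_k}\,dN_{F_k}(t)$$

$$d\sigma_{F_k}^2(t) = \kappa_F\left(\alpha_F - \sigma_{F_k}^2(t)\right)dt + \gamma_F\,\sigma_{F_k}(t)\,dW_{F_k}(t)$$

N residual processes follow similar dynamics.

Brownian motions $W_F$, $\tilde{W}_F$, $W_e$, $\tilde{W}_e$ are independent.

Standard parameter values: $\kappa_F = \kappa_e = 5$, $\gamma_F = \gamma_e = 0.5$, $\rho_F = -0.8$, $\rho_e = -0.3$, $\mu = 0.05$, $\alpha_F = \alpha_e = 0.1$, $T = 1$.

Compound Poisson process with intensity $\nu_F = \nu_e = 6$ and normally distributed jumps with $J_{F_k} \sim N(-0.1, 0.5)$ and $J_{e_i} \sim N(0, 0.5)$.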
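A minimal Euler-scheme sketch of one factor path under these dynamics may help fix ideas. Several choices below are assumptions rather than part of the slide: the Euler discretization with truncation of the variance at zero, starting the variance at its long-run level α, reading the second parameter of the jump distribution as a standard deviation, and letting the variance be driven by the Brownian motion that enters the price with weight ρ. The function name is hypothetical.

```python
import numpy as np

def simulate_heston_jump_factor(M=250, T=1.0, mu=0.05, kappa=5.0, alpha=0.1,
                                gamma=0.5, rho=-0.8, nu=6.0,
                                jump_mean=-0.1, jump_sd=0.5, seed=0):
    """Euler sketch of one Heston-type factor with compound Poisson jumps."""
    rng = np.random.default_rng(seed)
    dt = T / M
    F = np.zeros(M + 1)
    var = np.full(M + 1, alpha)  # assumption: start at the long-run variance
    for j in range(M):
        v = max(var[j], 0.0)  # truncate at zero so the square root is defined
        dW = rng.normal(0.0, np.sqrt(dt))        # shared with the variance (assumed)
        dW_tilde = rng.normal(0.0, np.sqrt(dt))  # independent second Brownian motion
        n_jumps = rng.poisson(nu * dt)
        # Assumption: 0.5 is treated as the standard deviation of the jump size.
        jump = rng.normal(jump_mean, jump_sd, size=n_jumps).sum()
        F[j + 1] = (F[j] + (mu - v) * dt
                    + np.sqrt(v) * (rho * dW + np.sqrt(1.0 - rho**2) * dW_tilde)
                    + jump)
        var[j + 1] = var[j] + kappa * (alpha - v) * dt + gamma * np.sqrt(v) * dW
    return F, np.maximum(var, 0.0)
```

Calling `simulate_heston_jump_factor(M=250)` returns the log-factor path and its variance path on a grid of 251 points; the residual processes could be generated the same way with the e-parameters.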


Simulations

Simulation parameters

Factors and residuals follow a Heston-type stochastic volatility model with jumps

Standard parameters for dynamics

$X(t) = \Lambda F(t) + \theta A e(t)$

Toeplitz matrix $A$ captures cross-sectional dependence

$\theta$ measures signal-to-noise ratio

Strong and weak factors: First factor scaled up by $\sigma_{dominant}$

Consistency: Measure correlation of estimated factor with true factor for K = 1
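The consistency check for K = 1 can be illustrated with a stripped-down sketch. To keep it short, factor and residual increments are plain Gaussian (no stochastic volatility or jumps), and the Toeplitz matrix is taken to have entries decay^|i-j|; both are illustrative assumptions, since the slide does not specify A. The principal-component normalization (loadings equal to sqrt(N) times the leading eigenvector of X'X, factors equal to X times the loadings divided by N) is likewise assumed, and the function name is hypothetical.

```python
import numpy as np

def factor_correlation(N=200, M=250, theta=1.0, decay=0.5, seed=0):
    """Correlation between a PCA factor estimate and the true factor (K = 1)."""
    rng = np.random.default_rng(seed)
    dF = rng.normal(0.0, np.sqrt(1.0 / M), size=(M, 1))   # factor increments
    Lam = rng.normal(0.0, 1.0, size=(N, 1))               # loadings
    de = rng.normal(0.0, np.sqrt(1.0 / M), size=(M, N))   # residual increments
    idx = np.arange(N)
    A = decay ** np.abs(idx[:, None] - idx[None, :])      # assumed Toeplitz form
    # Increments of X(t) = Lambda F(t) + theta A e(t), one row per time step.
    dX = dF @ Lam.T + theta * de @ A.T
    # Leading eigenvector of X'X gives the loading estimate (eigh sorts ascending).
    _, eigvecs = np.linalg.eigh(dX.T @ dX)
    Lam_hat = np.sqrt(N) * eigvecs[:, -1:]
    dF_hat = dX @ Lam_hat / N
    # The sign of the factor is not identified, so report the absolute correlation.
    return abs(np.corrcoef(dF_hat[:, 0], dF[:, 0])[0, 1])
```

Increasing `theta` lowers the signal-to-noise ratio and should pull the reported correlation away from 1.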


Simulations

Simulation parameters

Asymptotic distribution:

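$$CLT_C = \left(\frac{1}{N} V_{T,i} + \frac{1}{M} W_{T,i}\right)^{-1/2}\left(\hat{C}_{T,i} - C_{T,i}\right)$$

$$CLT_F = \sqrt{N}\,\Theta_F^{-1/2}\left(\hat{F}_T - H^{-1} F_T\right)$$

$$CLT_\Lambda = \sqrt{M}\,\Theta_{\Lambda,i}^{-1/2}\left(\hat{\Lambda}_i - H^\top \Lambda_i\right)$$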

Theory predicts standard normal distribution N(0, 1)


Simulations: Consistency

N=200, M=250           Case 1                        Case 2   Case 3
                       Total   Continuous   Jump
Correlation Factors    0.994   0.944        0.972    0.997    0.997
SD Factors             0.012   0.065        0.130    0.001    0.000
Correlation Loadings   0.995   0.994        0.975    0.998    0.998
SD Loadings            0.010   0.008        0.127    0.001    0.000

Case 1: Stochastic volatility with jumps and residual cross-sectional correlation

Case 2: Stochastic volatility and residual cross-sectional correlation

Case 3: Brownian motions only


Simulations: Asymptotic Distribution

Figure: N = 200 and M = 250. Histograms of standardized common components $CLT_C$, factors $CLT_F$ and loadings $CLT_\Lambda$ (panels: Common components, Factors, Loadings; horizontal axis from −5 to 5). The normal density function is superimposed on the histograms. Stochastic volatility with jumps and residual cross-sectional correlation.


Simulations: Number of Factors

Figure: Perturbed eigenvalue ratios (ERP1) for Heston-type stochastic volatility model with $K = 3$, $K_C = 3$, $K_D = 1$, $\sigma_{dominant} = 3$, N = 200 and M = 250 for 100 simulated paths. Panels: Total number of factors, Total number of continuous factors, Total number of jump factors. Horizontal axis: k (2 to 14). Vertical axis: ER perturbed.


Asymptotic distribution

Key idea: Asymptotic expansion

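Loadings:

$$\sqrt{M}\left(\hat{\Lambda}_i - H^\top \Lambda_i\right) = V_{MN}^{-1}\left(\frac{\hat{\Lambda}^\top \Lambda}{N}\right)\sqrt{M}\,F^\top e_i + O_p\!\left(\frac{\sqrt{M}}{\delta}\right)$$

⇒ Apply asymptotic distribution results of quadratic covariation estimator to $\sqrt{M}\,F^\top e_i$

Factors:

$$\sqrt{N}\left(\hat{F}_T - H^{-1} F_T\right) = \frac{1}{\sqrt{N}}\, e_T \Lambda H + O_p\!\left(\frac{\sqrt{N}}{\sqrt{M}}\right) + O_p\!\left(\frac{\sqrt{N}}{\delta}\right)$$

⇒ Apply martingale central limit theorem to $\frac{1}{\sqrt{N}}\, e_T \Lambda H$

We obtain stable convergence in law (stronger than convergence in distribution) and a mixed-Gaussian limit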


Asymptotic distribution of loadings

Theorem 4: Asymptotic distribution of loadings

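Assumptions 1 and 2 hold. Then for $\delta = \min(N,M) \to \infty$:

$$\sqrt{M}\left(\hat{\Lambda}_i - H^\top \Lambda_i\right) = V_{MN}^{-1}\left(\frac{\hat{\Lambda}^\top \Lambda}{N}\right)\sqrt{M}\,F^\top e_i + O_p\!\left(\frac{\sqrt{M}}{\delta}\right)$$

If $\frac{\sqrt{M}}{N} \to 0$:

$$\sqrt{M}\left(\hat{\Lambda}_i - H^\top \Lambda_i\right) \xrightarrow{L\text{-}s} N\!\left(0,\; V^{-1} Q\, \Gamma_i\, Q^\top V^{-1}\right)$$

$$\Gamma_i = \int_0^T \sigma_F^2 \sigma_{e_i}^2\, ds + \sum_{s \le T} \Delta F^2(s)\, \sigma_{e_i}^2(s) + \sum_{s' \le T} \Delta e_i^2(s')\, \sigma_F^2(s')$$

$V$ is the diagonal matrix of eigenvalues of $\Sigma_\Lambda^{1/2}\, \Sigma_F\, \Sigma_\Lambda^{1/2}$

$$\operatorname*{plim}_{N,M \to \infty} \frac{\hat{\Lambda}^\top \Lambda}{N} = Q$$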


Assumptions for asymptotic distribution of factors:

Assumption 4: Asymptotically negligible jumps of error terms

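Jumps of the martingale $\frac{1}{\sqrt{N}} \sum_{i=1}^N e_i(t)$ are asymptotically negligible:

$$\frac{\Lambda^\top [e,e]_t\, \Lambda}{N} \xrightarrow{p} \langle Z, Z \rangle_t, \qquad \frac{\Lambda^\top \langle e^D, e^D \rangle_t\, \Lambda}{N} \xrightarrow{p} 0 \quad \forall t > 0,$$

with $Z$ some continuous square integrable martingale with quadratic variation $\langle Z, Z \rangle_t$.

Assumption 5: Even weaker dependence of error terms

Weak serial dependence of error terms

Weaker cross-sectional dependence of error terms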


Asymptotic distribution of the factors

Theorem 5: Asymptotic distribution of the factors

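Assumptions 1-2 hold. Then for $\delta = \min(N,M) \to \infty$:

$$\sqrt{N}\left(\hat{F}_T - H^{-1} F_T\right) = \frac{1}{\sqrt{N}}\, e_T \Lambda H + O_p\!\left(\frac{\sqrt{N}}{\sqrt{M}}\right) + O_p\!\left(\frac{\sqrt{N}}{\delta}\right)$$

If Assumptions 4 and 5 hold and $\frac{\sqrt{N}}{M} \to 0$, or only Assumption 4 holds and $\frac{N}{M} \to 0$:

$$\sqrt{N}\left(\hat{F}_T - H^{-1} F_T\right) \xrightarrow{L\text{-}s} N\!\left(0,\; (Q^{-1})^\top \Phi_T\, Q^{-1}\right)$$

$$\Phi_T = \operatorname*{plim}_{N \to \infty} \frac{\Lambda^\top [e,e]\, \Lambda}{N} \quad \text{and} \quad Q^{-1} = \lim_{\delta \to \infty} H$$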


Asymptotic distribution of the common components

Theorem 6: Asymptotic distribution of the common components

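Define $C_{T,i} = \Lambda_i^\top F_T$ and $\hat{C}_{T,i} = \hat{\Lambda}_i^\top \hat{F}_T$. Assumptions 1-4 hold.

1. If Assumption 5 holds, then for any sequence $N, M$:

$$\frac{\sqrt{\delta}\left(\hat{C}_{T,i} - C_{T,i}\right)}{\sqrt{\frac{\delta}{N} W_{T,i} + \frac{\delta}{M} V_{T,i}}} \xrightarrow{D} N(0,1)$$

2. Assume $\frac{N}{M} \to 0$ (but we do not require Assumption 5):

$$\frac{\sqrt{N}\left(\hat{C}_{T,i} - C_{T,i}\right)}{\sqrt{W_{T,i}}} \xrightarrow{D} N(0,1)$$

with $W_{T,i} = \Lambda_i^\top \Sigma_\Lambda^{-1} \Phi_T \Sigma_\Lambda^{-1} \Lambda_i$ and $V_{T,i} = F_T^\top \Sigma_F^{-1} \Gamma_i \Sigma_F^{-1} F_T$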


Estimation of Covariance Matrices

Feasible estimators of asymptotic covariance matrices

Loading estimator

Feasible estimation possible under Assumptions 1 and 2

Estimation based on local volatility estimation

Factors

Dimensionality problem: Need estimator of the $N \times N$ matrix $[e, e]$

Stronger assumptions necessary, e.g. cross-sectional independence, parametric dependency or sparsity


Identifying factors

Problem: Factors only identified up to invertible linear transformations

Need a measure for how close vector spaces are to each other

Test for comparing two sets of factors

Set of KG economic candidate factors G

Generalized correlations are the $\min(K, K_G)$ largest eigenvalues of the matrix $[F,F]^{-1}[F,G][G,G]^{-1}[G,F]$

F and G describe the same factor model if the generalized correlations are all equal to 1

Test H0: Sum of squared generalized correlations equal to $\min(K, K_G)$

Example: Generalized correlations {1, 1, 1} for $K = K_G = 3$ ⇒ same factor space
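The eigenvalue definition above can be sketched numerically. The quadratic covariations [F,F], [F,G] and [G,G] are replaced here by sums of products of observed increments, which is an assumption of this sketch; the function name and the (M, K) increment layout are hypothetical.

```python
import numpy as np

def generalized_correlations(dF, dG):
    """Eigenvalue-based generalized correlations between two factor sets.

    dF is an (M, K) array and dG an (M, K_G) array of increments; realized
    sums of products stand in for the quadratic covariations (assumption).
    """
    FF = dF.T @ dF
    FG = dF.T @ dG
    GG = dG.T @ dG
    # [F,F]^{-1}[F,G][G,G]^{-1}[G,F], computed without explicit inverses.
    mat = np.linalg.solve(FF, FG) @ np.linalg.solve(GG, FG.T)
    # Eigenvalues are real in exact arithmetic; drop rounding-level imaginary parts.
    eig = np.sort(np.linalg.eigvals(mat).real)[::-1]
    k = min(dF.shape[1], dG.shape[1])
    return eig[:k]
```

If `dG = dF @ H` for an invertible `H`, all returned values are 1 up to rounding, which matches the {1, 1, 1} example: an invertible linear transformation leaves the factor space unchanged.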


Identifying factors

Total generalized correlation

Total generalized correlation (sum of squared correlations)

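$$\rho = \operatorname{trace}\left([F,F]^{-1}[F,G][G,G]^{-1}[G,F]\right).$$

Estimator for the total generalized correlation

$$\hat{\rho} = \operatorname{trace}\left((\hat{F}^\top \hat{F})^{-1}(\hat{F}^\top G)(G^\top G)^{-1}(G^\top \hat{F})\right).$$

Trace operator is a differentiable function and the quadratic covariation estimator is asymptotically mixed-normally distributed ⇒ Delta-method argument.

Theorem 7: Asymptotic distribution of $\hat{\rho}$

Assumption 1 and $\frac{\sqrt{M}}{N} \to 0$. Weak assumptions on G. Then for $\delta \to \infty$:

$$\sqrt{M}\left(\hat{\rho} - \rho\right) \xrightarrow{L\text{-}s} N(0, \Xi) \quad \text{and} \quad \frac{\sqrt{M}}{\sqrt{\Xi}}\left(\hat{\rho} - \rho\right) \xrightarrow{D} N(0,1)$$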


Microstructure noise

Assumption 6: Microstructure noise

We observe the true asset price with noise $\varepsilon_{j,i}$:

$$Y_i(j) = X_i(t_j) + \varepsilon_{j,i}$$

Assume that the noise $\varepsilon$ is i.i.d., independent of $X$ and has bounded 4th moments.

The noise creates serial correlation in increments

The noise does not vanish for M →∞

Solution: Sparse sampling (e.g. 5 min)

Denote increments of noise by $\tilde{\varepsilon}_{j,i} = \varepsilon_{j+1,i} - \varepsilon_{j,i}$
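A short calculation shows why i.i.d. noise produces serial correlation in increments and why its contribution does not vanish as M grows. It uses only the i.i.d. part of Assumption 6 and writes $\sigma_\varepsilon^2$ for the variance of the noise; the tilde marks increments.

```latex
\begin{aligned}
\operatorname{Var}(\tilde{\varepsilon}_{j,i})
  &= \operatorname{Var}(\varepsilon_{j+1,i}) + \operatorname{Var}(\varepsilon_{j,i})
   = 2\sigma_\varepsilon^2, \\
\operatorname{Cov}(\tilde{\varepsilon}_{j,i}, \tilde{\varepsilon}_{j+1,i})
  &= \operatorname{Cov}(\varepsilon_{j+1,i} - \varepsilon_{j,i},\;
                        \varepsilon_{j+2,i} - \varepsilon_{j+1,i})
   = -\sigma_\varepsilon^2, \\
\operatorname{Corr}(\tilde{\varepsilon}_{j,i}, \tilde{\varepsilon}_{j+1,i})
  &= -\tfrac{1}{2}, \\
\mathbb{E}\Big[\sum_{j=1}^{M} \tilde{\varepsilon}_{j,i}^{\,2}\Big]
  &= 2M\sigma_\varepsilon^2 .
\end{aligned}
```

The last line grows linearly in M while the quadratic variation of X over [0, T] stays bounded, so sampling more finely makes the noise share larger. Sparse sampling keeps M, and with it the noise contribution, small.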


Microstructure noise

Theorem 8: Upper bound on impact of noise

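Assumption 1 and $\frac{N}{M} \to c < 1$. Then for $N, M \to \infty$:

Upper bound on the impact of noise on largest residual eigenvalue:

$$\lambda_1\!\left(\frac{(e + \tilde{\varepsilon})^\top (e + \tilde{\varepsilon})}{N}\right) - \lambda_1\!\left(\frac{e^\top e}{N}\right) \le 2\lambda_{median}\left(\frac{1 + \sqrt{c}}{1 - \sqrt{c}}\right)^2 + o_p(1)$$

Variance of microstructure noise is bounded by

$$\sigma_\varepsilon^2 \le \frac{c}{2(1 - \sqrt{c})^2} \cdot \lambda_{median} + o_p(1),$$

where $\lambda_{median}$ denotes the median eigenvalue of $\frac{Y^\top Y}{N}$.

Empirical upper bound on noise impact is small ⇒ noise does not affect the spectrum with 5 min sampling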
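The two bounds are simple to evaluate from data. The sketch below drops the o_p(1) terms, replaces the limit c by the finite-sample ratio N/M, and takes Y to be an (M, N) array of observed increments; all three are assumptions, and the function names are hypothetical.

```python
import numpy as np

def noise_bounds_from(c, lam_median):
    """Evaluate both Theorem 8 bounds for given c < 1 and median eigenvalue."""
    if not 0.0 < c < 1.0:
        raise ValueError("the bounds require 0 < c < 1")
    root_c = np.sqrt(c)
    # Bound on the noise variance: c / (2 (1 - sqrt(c))^2) * lambda_median.
    variance_bound = c / (2.0 * (1.0 - root_c) ** 2) * lam_median
    # Bound on the shift of the largest residual eigenvalue.
    eigenvalue_bound = 2.0 * lam_median * ((1.0 + root_c) / (1.0 - root_c)) ** 2
    return variance_bound, eigenvalue_bound

def noise_bounds(Y):
    """Plug-in version for an (M, N) array Y of observed increments (assumed layout)."""
    M, N = Y.shape
    lam_median = np.median(np.linalg.eigvalsh(Y.T @ Y / N))
    return noise_bounds_from(N / M, lam_median)
```

For example, `noise_bounds_from(0.25, 1.0)` evaluates the formulas with c = 0.25 and a median eigenvalue of 1. Both bounds scale linearly in the median eigenvalue and blow up as c approaches 1.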