Large Dimensional Factor Modeling Based on High-Frequency Observations
Markus Pelger
Stanford University, Department of Management Science & Engineering
Market Microstructure and High-Frequency Data, June 3, 2017
Introduction Factor model Empirical Results Asset Pricing Leverage Effect Conclusion Appendix
Motivation
Systematic pattern in high-frequency data
[Figure: Log-prices of financial firms. x-axis: time in days; y-axis: log-price.]
Log-prices of financial institutions in the S&P 500 for the 2 months after September 16th, 2008
Systematic pattern in jumps of high-frequency data
[Figure: Jumps in log-prices of financial firms. x-axis: time in days; y-axis: log-price.]
Estimated jumps in log-prices of financial institutions in the S&P 500 for the 2 months after September 16th, 2008
Systematic pattern for unobservable spot volatility
[Figure: Spot volatility. x-axis: days; y-axis: spot volatility.]
Estimated spot volatility for firms in the S&P 500 in 2008
Motivation: Research questions
Understanding systematic risk with factors
Goal of this paper: Use high-frequency data to better understand systematic factor risk
Key elements of this paper:
1 Statistical factors instead of pre-specified (and potentially misspecified) factors
2 Analyze time variation in systematic factors without imposing restrictions on time dependency
Research questions:
1 How many systematic factors?
2 What are the factors?
3 How stable is the factor structure?
4 Are continuous factors different from jump factors?
5 Do factors explain the cross-section of asset returns?
6 Leverage effect due to systematic or idiosyncratic risk?
This paper: Statistical theory for answering questions
Large-dimensional high-frequency factor analysis
Estimating factors, loadings and the number of factors for a large number of cross-sectional and high-frequency observations
Estimate unknown factors with principal component analysis
High-frequency data: Estimate factors for different short time horizons independently
⇒ analyze time variation in factors
Approximate factor structure imposes only weak assumptions
Empirical application to U.S. equity market
5-minute prices for the S&P 500 firms from 2003 to 2012
Daily implied volatilities for the S&P 500 firms from 2003 to 2012
Contribution of the Estimation Theory
Contributions (Theory)
1 Approximate factor analysis: Extend the distribution theory of large-dimensional factor analysis to high-frequency observations (apply PCA to the quadratic covariation instead of the covariance)
2 Develop an estimation strategy for continuous and jump factors (truncated quadratic covariation matrix)
3 New estimator for the number of factors (perturbed eigenvalue ratio statistic)
4 New test for comparing different sets of factors (sum of generalized correlations statistic)
⇒ Combining high-frequency econometrics, principal component analysis and large-dimensional factor modeling
Empirical results
Contribution of the Empirical Application
Contributions (empirical)
1 Continuous factors: Stable factor structure
4 continuous factors for 2007-2012: Market, oil, finance and electricity
3 continuous factors for 2003-2006: Market, oil and electricity
2 Jump factors
1 stable jump factor: Market
3 Cross-sectional asset pricing:
Intraday factors explain intraday expected returns
4 Decomposing the leverage effect:
Negative correlation of return and volatility due to systematic risk
⇒ Rules out financial leverage story
Literature
Literature (partial list)
Factor models for unknown factors for long horizons
Bai et al. (2002, 2003): High-dimensional factor models
Onatski (2010): Determining the number of factors
Fan et al. (2013): Sparse matrices in factor modeling
High-frequency econometrics
Bollerslev and Todorov et al. (2010, 2016): Continuous and jump market risk are different
Jacod (2008): Asymptotic properties of functionals of semimartingales
Aït-Sahalia and Jacod (2009), Lee and Mykland (2008), Mancini (2009): Estimating jumps
Large dimensional high-frequency factor modeling
Aït-Sahalia and Xiu (2016): Sparsity assumption for continuous risk
Fan, Furger and Xiu (2014): Known factors
High-frequency factor analysis
High-frequency factor analysis: Setup
True log-price process in continuous time:
$$X_i(t) = \underbrace{\Lambda_i}_{\text{loadings } (1 \times K)} \, \underbrace{F(t)}_{\text{factors } (K \times 1)} + \underbrace{e_i(t)}_{\text{idiosyncratic}}, \qquad i = 1, \dots, N, \quad t \in [0, T]$$
Observed log-price process at discrete time points:
$$X_i(t_j) = \underbrace{\Lambda_i F(t_j)}_{\text{systematic}} + \underbrace{e_i(t_j)}_{\text{non-systematic}}, \qquad j = 1, \dots, M, \quad \text{with } \frac{T}{M} = t_{j+1} - t_j$$
N assets (large)
time horizon T (fixed)
M high-frequency observations (large)
K systematic factors (fixed)
Λ, F(t) and e(t) are unknown
Quadratic covariation as a replacement for covariance
Conventional factor analysis
analyze principal components (eigenvalues) of the covariance matrix
$$\operatorname{Cov}(X) = \frac{1}{T} \sum_{t=1}^{T} \left(X(t) - \bar{X}\right)^2 \quad \text{with} \quad \bar{X} = \frac{1}{T} \sum_{t=1}^{T} X(t)$$
covariance matrix cannot be estimated for fixed T
⇒ replace covariance by quadratic covariation
Quadratic covariation process: Sum of squared increments
X(t) is a semimartingale.
Partition [0, T] into M subintervals with mesh size $\Delta_M = \frac{T}{M} \to 0$:
$$\sum_{j=1}^{M} \left(X(t_{j+1}) - X(t_j)\right)^2 \xrightarrow{p} \underbrace{[X, X]_T}_{\text{quadratic variation}}$$
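The sum of squared increments can be illustrated numerically. The following is a minimal sketch in Python with NumPy; the function name `realized_covariation` and the array layout (rows are observation times, columns are assets) are assumptions made for illustration and do not come from the slides.

```python
import numpy as np

def realized_covariation(log_prices):
    """Realized quadratic covariation matrix from discretely observed log-prices.

    log_prices: array of shape (M + 1, N); row j holds X(t_j) for N assets.
    Returns the (N x N) matrix sum_j dX_j dX_j^T. Its diagonal entries are
    the sums of squared increments for each asset.
    """
    increments = np.diff(np.asarray(log_prices, dtype=float), axis=0)  # (M, N)
    return increments.T @ increments

# Tiny worked example: two assets observed at three time points.
prices = np.array([[0.0, 0.0],
                   [1.0, 2.0],
                   [3.0, 1.0]])
qc = realized_covariation(prices)
```

The off-diagonal entries give the realized covariation between assets, which is the object that replaces the covariance matrix in the principal component analysis.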
Estimation approach
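M observations of the process X in the time interval [0, T]
M → ∞ and N → ∞ and T fixed
sampling frequency $\Delta_M = \frac{T}{M} = t_{j+1} - t_j \to 0$
$$X_{j,i} = X_{t_{j+1},i} - X_{t_j,i}, \qquad F_j = F_{t_{j+1}} - F_{t_j}, \qquad e_{j,i} = e_{t_{j+1},i} - e_{t_j,i}$$
$$\underbrace{X}_{M \times N} = \underbrace{F}_{M \times K} \, \underbrace{\Lambda^\top}_{K \times N} + \underbrace{e}_{M \times N}$$
We want to analyze Λ and F:
$$\hat{V}_{NM} := K \text{ largest eigenvalues of } \frac{X^\top X}{N}$$
$$\hat{\Lambda} := \sqrt{N} \cdot \text{eigenvectors associated with } \hat{V}_{NM}$$
$$\hat{F} := \frac{X \hat{\Lambda}}{N}$$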
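The three definitions above translate almost line by line into NumPy. This is a minimal sketch, assuming `X` already holds the (M × N) matrix of increments; the function name `pca_factors` is hypothetical, and the normalization follows the formulas on this slide (loadings scaled so that Λ̂ᵀΛ̂/N is the identity).

```python
import numpy as np

def pca_factors(X, K):
    """PCA estimator of loadings and factor increments.

    X: (M x N) matrix of increments, K: number of factors.
    Returns (Lambda_hat, F_hat, V_hat) with shapes (N, K), (M, K), (K,).
    """
    X = np.asarray(X, dtype=float)
    M, N = X.shape
    # eigh returns eigenvalues of the symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(X.T @ X / N)
    order = np.argsort(eigvals)[::-1][:K]        # indices of the K largest
    V_hat = eigvals[order]
    Lambda_hat = np.sqrt(N) * eigvecs[:, order]  # sqrt(N) times eigenvectors
    F_hat = X @ Lambda_hat / N                   # factor increments
    return Lambda_hat, F_hat, V_hat
```

The estimated common component is `F_hat @ Lambda_hat.T`. When X has an exact rank-K factor structure without idiosyncratic noise, this product reproduces X, which is a convenient sanity check.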
Approximate factor model
Assumptions for approximate factor model
Factor structure:
$$\underbrace{X(t)}_{N \times 1} = \underbrace{\Lambda}_{N \times K} \, \underbrace{F(t)}_{K \times 1} + \underbrace{e(t)}_{N \times 1}$$
F and $e_i$ are Itô semimartingales with weak integrability conditions; F and $e_i$ are independent
$e_i$ are local martingales
Weak dependence of diversifiable risk: $[e, e]_T$ and $\langle e, e \rangle_T$ have bounded eigenvalues.
Identification criterion for $\Sigma_F = [F, F]_T$ and $\Sigma_\Lambda = \lim_{N \to \infty} \frac{\Lambda^\top \Lambda}{N}$: full rank and eigenvalues of $\Sigma_\Lambda \Sigma_F$ are distinct.
For some stronger results we need weaker serial and cross-sectional dependence of the idiosyncratic terms
Results
Main results
Assume M,N →∞
Consistent estimation of the number of factors K
Consistent estimation of Λ
Consistent estimation of F
Consistent estimation of common component $C_{j,i} = F_j \Lambda_i^\top$
Under additional weak technical assumptions: Asymptotic mixed-normality of estimators
⇒ Curse of dimensionality turns into a “blessing”
Separating jumps from continuous movements
Separating continuous and jump part
Intuition: Large increments = jumps
Truncation estimator for continuous part:
$$\hat{X}^C_{j,i} = X_{j,i} \, \mathbb{1}_{\{|X_{j,i}| \le a_{j,i} \Delta^{\omega}\}}$$
instead of $X_{j,i}$, with $a_{j,i} > 0$ and $\omega \in (0, \tfrac{1}{2})$
Truncation estimator for jump part:
$$\hat{X}^D_{j,i} = X_{j,i} \, \mathbb{1}_{\{|X_{j,i}| > a_{j,i} \Delta^{\omega}\}}$$
Set $a_{j,i} \Delta^{\omega} = a \cdot \hat{\sigma}^C_{j,i} \cdot \Delta^{0.49}$ with $\hat{\sigma}^C_{j,i}$ a local window estimator of volatility and a = 3, 4 and 4.5
⇒ Asymptotic results for loadings and number of factors hold for continuous and jump part separately for finite activity jumps
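A minimal sketch of the truncation step in Python with NumPy. The slides only say that the volatility comes from a "local window estimator"; the rolling bipower-variation estimate used here, the window length of 20 and the function names are assumptions chosen for illustration, not the author's implementation.

```python
import numpy as np

def local_volatility(X, delta, window=20):
    """Rolling bipower-variation estimate of spot volatility (an assumed choice).

    X: (M x N) increments, delta: length of one sampling interval.
    """
    absX = np.abs(np.asarray(X, dtype=float))
    # Products of adjacent absolute increments are robust to isolated jumps.
    bp = (np.pi / 2.0) * absX[1:] * absX[:-1] / delta   # (M - 1, N)
    M = absX.shape[0]
    sigma = np.empty_like(absX)
    for j in range(M):
        lo = max(0, j - window)
        hi = min(M - 1, j + window)
        sigma[j] = np.sqrt(bp[lo:hi].mean(axis=0))
    return sigma

def truncate_increments(X, delta, a=3.0, omega=0.49, window=20):
    """Split increments into continuous and jump parts with threshold a * sigma * delta**omega."""
    X = np.asarray(X, dtype=float)
    threshold = a * local_volatility(X, delta, window) * delta ** omega
    keep = np.abs(X) <= threshold
    X_cont = np.where(keep, X, 0.0)   # increments classified as continuous
    X_jump = np.where(keep, 0.0, X)   # increments classified as jumps
    return X_cont, X_jump
```

By construction the two parts add up to the original increments, and each increment is assigned to exactly one of them.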
Estimation of jump and continuous risk
Estimate jump and continuous factors
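$$X(t) = \Lambda^C F^C(t) + \Lambda^D F^D(t) + e(t)$$
Truncation estimator
Use truncation estimator to construct estimators for
$$\underbrace{[X, X]}_{\text{total risk covariance}}, \qquad \underbrace{[X, X]^C}_{\text{continuous risk covariance}}, \qquad \underbrace{[X, X]^D}_{\text{jump risk covariance}}$$
Apply PCA to $\frac{\hat{X}^{C\top} \hat{X}^C}{N}$ ⇒ $\hat{\Lambda}^C$ and $\hat{F}^C$
Apply PCA to $\frac{\hat{X}^{D\top} \hat{X}^D}{N}$ ⇒ $\hat{\Lambda}^D$ and $\hat{F}^D$
⇒ Estimators $\hat{\Lambda}^C$, $\hat{\Lambda}^D$, $\hat{F}^C$ and $\hat{F}^D$ are defined analogously to $\hat{\Lambda}$ and $\hat{F}$, but using $\hat{X}^C$ and $\hat{X}^D$ instead of X.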
Number of factors
Theorem: Number of factors
Intuition: Number of factors = number of large eigenvalues
Denote the ordered eigenvalues of $X^\top X$ by $\lambda_1 \ge \dots \ge \lambda_N$.
Choose a sequence g(N, M) such that $\frac{g(N,M)}{N} \to 0$ and $g(N,M) \to \infty$.
I recommend $g(N,M) = \sqrt{N} \cdot \operatorname{median}(\lambda_1, \dots, \lambda_N)$.
Define perturbed eigenvalues and eigenvalue ratio statistics:
$$\hat{\lambda}_k = \lambda_k + g(N,M), \qquad ER_k = \frac{\hat{\lambda}_k}{\hat{\lambda}_{k+1}} \qquad \text{for } k = 1, \dots, N-1$$
The estimator for the number of factors is, for any c > 0:
$$\hat{K}(c) = \max\{k \le N - 1 : ER_k > 1 + c\}$$
⇒ Under weak assumptions it holds that $\hat{K}(c) \xrightarrow{p} K$ for any c > 0.
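The perturbed eigenvalue ratio estimator can be sketched in a few lines of NumPy. The function name and the default cutoff `c=0.2` are illustrative assumptions (the slides leave c as a free constant), and the sketch assumes M ≥ N so that the median eigenvalue is strictly positive.

```python
import numpy as np

def estimate_num_factors(X, c=0.2):
    """Perturbed eigenvalue ratio estimator for the number of factors.

    X: (M x N) matrix of increments with M >= N (assumed here so that the
    median eigenvalue, and hence the perturbation g, is positive).
    Returns (K_hat, ER) where ER[k-1] is the ratio statistic ER_k.
    """
    X = np.asarray(X, dtype=float)
    M, N = X.shape
    lam = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]  # lambda_1 >= ... >= lambda_N
    lam = np.clip(lam, 0.0, None)                     # guard against tiny negatives
    g = np.sqrt(N) * np.median(lam)                   # recommended perturbation
    lam_hat = lam + g
    er = lam_hat[:-1] / lam_hat[1:]                   # ER_k for k = 1, ..., N - 1
    above = np.nonzero(er > 1.0 + c)[0]
    k_hat = int(above[-1]) + 1 if above.size else 0   # largest k with ER_k > 1 + c
    return k_hat, er
```

The perturbation g pushes the ratios of the small, idiosyncratic eigenvalues towards 1, so that only the gap between the last systematic and the first idiosyncratic eigenvalue stands out.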
Identifying factors
Problem: Factors are only identified up to invertible linear transformations
Need a measure for how close vector spaces are to each other
Generalized correlations
Set of $K_G$ economic candidate factors G
Generalized correlations are the square roots of the min(K, $K_G$) largest eigenvalues of the matrix $[F,F]^{-1}[F,G][G,G]^{-1}[G,F]$
Example: Generalized correlations {1, 1, 0} for K = $K_G$ = 3 ⇒ linear combinations of G can replicate 2 factors in F
⇒ Use $\hat{F}$ for calculating generalized correlations
⇒ Asymptotic confidence intervals for the sum of squared correlations:
Total generalized correlation: $\operatorname{trace}\left([F,F]^{-1}[F,G][G,G]^{-1}[G,F]\right)$.
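A minimal sketch of the generalized correlation computation in Python with NumPy, with the quadratic covariations replaced by their realized counterparts (sums of products of increments). The function name is hypothetical, and the sketch takes the square roots of the eigenvalues so that the total generalized correlation equals the sum of squared generalized correlations.

```python
import numpy as np

def generalized_correlations(F, G):
    """Generalized correlations between two sets of factor increments.

    F: (M x K) and G: (M x K_G) increments observed on the same time grid.
    Returns the min(K, K_G) generalized correlations in descending order.
    """
    F = np.asarray(F, dtype=float)
    G = np.asarray(G, dtype=float)
    FF, FG, GG = F.T @ F, F.T @ G, G.T @ G
    # [F,F]^{-1} [F,G] [G,G]^{-1} [G,F], computed without explicit inverses.
    mat = np.linalg.solve(FF, FG) @ np.linalg.solve(GG, FG.T)
    eig = np.linalg.eigvals(mat).real
    k = min(F.shape[1], G.shape[1])
    eig = np.sort(np.clip(eig, 0.0, 1.0))[::-1][:k]
    return np.sqrt(eig)

def total_generalized_correlation(F, G):
    """Sum of squared generalized correlations (the trace statistic)."""
    return float(np.sum(generalized_correlations(F, G) ** 2))
```

If G is an invertible linear transformation of F, all generalized correlations equal 1 and the total generalized correlation equals K.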
Data
Price data
Data
Time period: 2003 to 2012
Xi (t) is the log-price from the TAQ database
N between 500 and 600 firms from the S&P 500
5-min sampling: on average 250 days with 77 increments each
Data cleaning:
Delete all entries with a time stamp outside 9:30am-4pm
Delete entries with a transaction price equal to zero
Retain entries originating from a single exchange
Delete entries with corrected trades and abnormal sale condition
Aggregate data with identical time stamp using volume-weighted average prices
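The last cleaning step, aggregating trades with identical time stamps into a volume-weighted average price, can be sketched as follows. This is an illustrative NumPy sketch only; the function name and the plain-array inputs are assumptions, and it presumes the earlier filters have already removed zero prices and that every time stamp has positive total volume.

```python
import numpy as np

def vwap_by_timestamp(timestamps, prices, volumes):
    """Aggregate trades sharing a time stamp into one volume-weighted average price.

    Assumes every time stamp has positive total volume.
    Returns (unique_timestamps, vwap), both sorted by time stamp.
    """
    timestamps = np.asarray(timestamps)
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    unique_ts, group = np.unique(timestamps, return_inverse=True)
    dollar_volume = np.bincount(group, weights=prices * volumes)
    total_volume = np.bincount(group, weights=volumes)
    return unique_ts, dollar_volume / total_volume

# Two trades at time 1 (prices 10 and 20, volumes 1 and 3) and one trade at time 2.
ts, vwap = vwap_by_timestamp([1, 1, 2], [10.0, 20.0, 30.0], [1.0, 3.0, 2.0])
```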
Log price of S&P 500 firms in 2012.
[Figure, three panels: Log stock price in 2012; Systematic component; Idiosyncratic component. x-axis: days; y-axis: log price.]
Decomposing prices into systematic and idiosyncratic components, 2012, K = 4.
Continuous part of log price of S&P 500 firms in 2012.
[Figure, three panels: Log stock price in 2012; Systematic component; Idiosyncratic component. x-axis: days; y-axis: log price.]
Decomposing continuous movements in 2012 for K = 4 and a = 3.
Continuous movements are 85% of total variation.
Systematic part is 32% of continuous variation.
99.2% of movements continuous.
Discontinuous part of log price of S&P 500 firms in 2012.
[Figure, three panels: Log stock price in 2012; Systematic component; Idiosyncratic component. x-axis: days; y-axis: log price.]
Decomposing discontinuous movements in 2012 for K = 1 and a = 3
Jump movements are 15% of total variation.
Systematic part is 25% of jump variation.
0.8% of movements jumps.
Number of continuous factors
Number of continuous factors
[Figure, three panels: Perturbed Eigenvalue Ratio. x-axis: 2 to 20; y-axis: perturbed ER (1 to 1.4). Legends: 2012, 2011, 2010, critical value; 2009, 2008, 2007, critical value; 2006, 2005, 2004, 2003, critical value.]
Number of continuous factors
Identification of factors
Continuous factors
Intuition: Identify pattern in the loadings
4 economic candidate factors:
Market (equally weighted)
Oil and gas (40 equally weighted assets)
Banking and Insurance (60 equally weighted assets)
Electricity (24 equally weighted assets)
Main result: Identification of factors
4 continuous factors with industry continuous factors    1.00  0.98  0.95  0.80
4 jump factors with industry jump factors                0.99  0.75  0.29  0.05
4 continuous factors with Fama-French Carhart factors    0.95  0.74  0.60  0.00
Table: Generalized correlations of the four largest statistical factors for 2007-2012 with economic factors
The number of generalized correlations close to 1 measures how many factors two sets have in common
Economic industry factors: Market, oil, finance, electricity
⇒ Jump structure different from continuous structure
⇒ Size, value, momentum do not explain factors
Identification of factors
2007-2012  2007  2008  2009  2010  2011  2012
1.00       1.00  1.00  1.00  1.00  1.00  1.00
0.98       0.98  0.97  0.99  0.97  0.98  0.93
0.95       0.91  0.95  0.95  0.93  0.94  0.90
0.80       0.87  0.78  0.75  0.75  0.80  0.76
Generalized correlations of market, oil, finance and electricity factors with the four largest statistical factors for 2007-2012
⇒ Stable continuous factor structure
Identification of factors
2003  2004  2005  2006  2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.97  0.99  1.00  1.00  0.99  0.97  0.98  0.96  0.98  0.95
0.57  0.75  0.77  0.89  0.85  0.92  0.95  0.92  0.93  0.83
0.10  0.23  0.16  0.35  0.82  0.74  0.72  0.68  0.78  0.78
Generalized correlations of market, oil, finance and electricity factors with the four largest statistical factors for 2003-2012
⇒ Finance factor disappears in 2003-2006
Cross-sectional Asset Pricing
Cross-sectional Asset Pricing
Arbitrage Pricing Theory (APT)
APT (Ross (1976), Chamberlain (1988), Reisman (1988)): Expected excess return is explained by systematic risk:
$$E[X_i] = E[F] \Lambda_i$$
Dynamic APT implies that expected intraday returns should be explained by systematic intraday risk (same for overnight)
Dynamic APT allows for time-varying loadings
Data
Expected returns from 2003 to 2012 for N = 304 stocks
Intraday, overnight and daily factors based on 4 continuous factors
Intraday, overnight and daily 3 Fama-French (market, size, value)
Weekly estimation of loadings
Time-variation in loadings
PCA            0.983  0.953  0.924  0.827
Fama-French 3  0.979  0.789  0.569
Average generalized correlations of loadings estimated over weekly and over 10-year horizons
⇒ Time-varying loadings for Fama-French factors
Difference to Fama-French factors
Intraday   0.977  0.617  0.292
Daily      0.992  0.757  0.396
Overnight  0.922  0.527  0.097
Generalized correlations between PCA and Fama-French factors
⇒ PCA factors and Fama-French factors have only the market in common
Risk-Premium
           PCA    Fama-French 3  Market
Intraday   1.595  0.356          0.297
Overnight  1.649  2.001          1.362
Daily      0.723  0.536          0.388
Table: Maximum Sharpe ratio attainable by a linear combination of the factors.
Value factor earns risk premium mainly overnight
Intraday and overnight risk premia for PCA factors have opposite signs ⇒ Lower daily risk premium
Intraday minus overnight (long-short strategy) of PCA factors has a maximum Sharpe ratio of 2.13
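The slides do not state how the maximum Sharpe ratio is computed. A common approach, assumed here, is the tangency-portfolio formula: for factor returns with mean vector μ and covariance matrix Σ, the largest Sharpe ratio over all linear combinations is √(μᵀΣ⁻¹μ), attained by weights proportional to Σ⁻¹μ. The sketch below uses NumPy; the function name and the optional annualization argument are hypothetical.

```python
import numpy as np

def max_sharpe_ratio(factor_returns, periods_per_year=1.0):
    """Maximum Sharpe ratio over linear combinations of factor excess returns.

    factor_returns: (T x K) array of excess returns.
    Returns (sharpe, weights) with weights proportional to Sigma^{-1} mu.
    Set periods_per_year (e.g. 252 for daily data) to annualize.
    """
    R = np.asarray(factor_returns, dtype=float)
    mu = R.mean(axis=0)
    Sigma = np.atleast_2d(np.cov(R, rowvar=False))  # sample covariance (ddof = 1)
    weights = np.linalg.solve(Sigma, mu)
    sharpe = np.sqrt(mu @ weights)                  # sqrt(mu' Sigma^{-1} mu)
    return float(sharpe * np.sqrt(periods_per_year)), weights
```

By construction the result is at least as large as the absolute Sharpe ratio of any single factor, which is why adding factors can only raise the in-sample maximum.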
Cross-sectional asset pricing: Market factor
[Figure, three panels: Intraday (Market); Overnight (Market); Daily (Market). x-axis: predicted return; y-axis: expected return; both from -0.4 to 0.4.]
⇒ Market factor fails to explain expected returns even with time variation
Cross-sectional asset pricing: 3 Fama-French factors
[Figure, three panels: Intraday (Fama-French 3); Overnight (Fama-French 3); Daily (Fama-French 3). x-axis: predicted return; y-axis: expected return; both from -0.4 to 0.4.]
⇒ The 3 Fama-French factors explain a larger fraction than the market factor
Cross-sectional asset pricing: 4 PCA factors
[Figure, three panels: Intraday (PCA); Overnight (PCA); Daily (PCA). x-axis: predicted return; y-axis: expected return; both from -0.4 to 0.4.]
⇒ 4 PCA factors explain intraday and overnight expected returns better than Fama-French factors
⇒ Asset pricing model still rejected
Leverage Effect
Leverage effect
Leverage effect = correlation between asset return and volatility
Nonparametric estimation of systematic and idiosyncratic leverageeffect
Two estimation approaches:
Daily implied volatilities and returns
Spot volatilities based on HF data
Main finding: systematic risk drives the leverage effect
Important consequence: Argument for the risk premium explanation and against the leverage story
Average leverage effect
I measure the leverage effect with the continuous quadratic covariation:
$$\rho = \frac{[\sigma^2, X]^C_T}{\sqrt{[X, X]^C_T} \, \sqrt{[\sigma^2, \sigma^2]^C_T}}$$
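A realized version of this correlation replaces each quadratic covariation by a sum of products of increments. The following NumPy sketch assumes the inputs are increments of the log-price and of the (estimated or implied) variance on a common time grid, already truncated to their continuous parts; the function name is hypothetical.

```python
import numpy as np

def realized_leverage(price_increments, variance_increments):
    """Realized leverage effect: correlation based on continuous quadratic covariation.

    Both inputs are 1-D arrays of increments on the same time grid, assumed
    to contain only continuous movements (jumps already truncated).
    """
    x = np.asarray(price_increments, dtype=float)
    v = np.asarray(variance_increments, dtype=float)
    cov_vx = v @ x    # realized [sigma^2, X]
    var_x = x @ x     # realized [X, X]
    var_v = v @ v     # realized [sigma^2, sigma^2]
    return float(cov_vx / np.sqrt(var_x * var_v))

# Variance moving exactly opposite to the price gives a leverage effect of -1.
x = np.array([1.0, -1.0, 2.0])
rho = realized_leverage(x, -2.0 * x)
```

The same function applies to the componentwise versions once returns and variances have been split into systematic and idiosyncratic parts.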
Leverage effect: Componentwise
Componentwise leverage effect:
Decompose $X_i$ and $\sigma_i^2$ into systematic and idiosyncratic parts: $X_i^{syst}$, $X_i^{idio}$, $\sigma_i^{2,syst}$ and $\sigma_i^{2,idio}$
Calculate the correlation for each component:
$$\rho_i^{syst,syst} = \frac{[\sigma_i^{2,syst}, X_i^{syst}]^C_T}{\sqrt{[X_i^{syst}, X_i^{syst}]^C_T} \, \sqrt{[\sigma_i^{2,syst}, \sigma_i^{2,syst}]^C_T}}$$
and similarly for $\rho_i^{syst,idio}$, $\rho_i^{idio,syst}$, $\rho_i^{idio,idio}$, $\rho_i^{syst,total}$, $\rho_i^{total,syst}$, $\rho_i^{total,total}$.
Main Result: Componentwise leverage effect
[Figure: Componentwise leverage effect. x-axis: 0 to 600; y-axis: -0.6 to 0.4. Series: correlation of systematic return with total volatility; correlation of idiosyncratic return with total volatility; correlation of systematic return with idiosyncratic volatility.]
Cross-sectional distribution of componentwise leverage effect in 2012: 4 statistical return factors and 1 statistical volatility factor.
Leverage effect in 2012: Componentwise
[Figure: Componentwise leverage effect. x-axis: 0 to 600; y-axis: -0.8 to 0.6. Series: LEV(total,total), LEV(syst,total), LEV(idio,total), LEV(syst,syst), LEV(idio,idio), LEV(syst,idio), LEV(idio,syst).]
Cross-sectional distribution of componentwise leverage effect in 2012: 4 statistical return factors and 1 statistical volatility factor.
Conclusion
Conclusion
Methodology
Asymptotic statistics for a factor model of high dimension based on general continuous-time processes
Quadratic variation of stochastic processes substitutes for the covariance in the principal component analysis
Truncated quadratic variation estimates the continuous and jump covariance matrices, which identifies the systematic jump risk factors
Empirical Results
Stable continuous factor structure ⇒ First three factors are market, oil and finance; 4th factor potentially electricity
1 stable jump market factor
Decomposition of leverage effect ⇒ negative correlation mainly due to systematic part
Differences to Long-Horizon Factor Models
After rescaling the increments, we can interpret the quadratic covariation estimator as a sample covariance estimator:
$$\sum_{j=1}^{M} (\Delta_j X_i)^2 = \frac{T}{M} \sum_{j=1}^{M} \left(\frac{\Delta_j X_i}{\sqrt{\Delta_M}}\right)^2$$
But the limiting object will be a random variable!
⇒ Path-wise instead of population arguments
Jumps lead to "heavy-tailed rescaled increments"
⇒ Cannot be accommodated in long-horizon model
Asymptotic distributions have a mixed Gaussian limit
⇒ Generally have heavier tails than a normal distribution.
⇒ Need stronger mode of convergence (stable convergence in law) for confidence intervals
Non-stationarity in stochastic volatility or stochastic intensity jump models
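The rescaling identity at the top of this slide follows in one step from the definition of the mesh size, $\Delta_M = T/M$:

$$\frac{T}{M} \sum_{j=1}^{M} \left(\frac{\Delta_j X_i}{\sqrt{\Delta_M}}\right)^2 = \frac{T}{M} \cdot \frac{1}{\Delta_M} \sum_{j=1}^{M} (\Delta_j X_i)^2 = \frac{T}{M} \cdot \frac{M}{T} \sum_{j=1}^{M} (\Delta_j X_i)^2 = \sum_{j=1}^{M} (\Delta_j X_i)^2$$

The right-hand side of the slide's identity is T times the sample second moment of the M rescaled increments, which is why the estimator resembles a sample covariance.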
Data cleaning
Data
For each year take the intersection of stocks traded each day with the stocks that have been in the S&P 500 index at any point during 1993-2012
Stock elimination criteria: A stock is dropped if any of the following conditions is true:
All first 10 5-min observations are missing on any of the days
There are in total more than 50 missing values before the first trade of each day
There are in total more than 500 missing values in the year
Year      2003    2004    2005   2006   2007   2008   2009   2010   2011   2012
Original  614     620     622    612    609    606    610    603    587    600
Cleaned   446     540     564    577    585    598    608    597    581    593
Dropped   27.36%  12.90%  9.32%  5.72%  3.94%  1.32%  0.33%  1.00%  1.02%  1.17%
Table: Observations after data cleaning
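The three elimination criteria can be sketched as a filter on a per-stock missing-value indicator. This is an illustrative NumPy sketch; the function name, the boolean input layout (one row per trading day, one column per 5-minute interval) and the reading of the second criterion as the yearly total of missing values before each day's first trade are assumptions, not details given in the slides.

```python
import numpy as np

def drop_stock(missing):
    """Return True if a stock should be dropped for the year.

    missing: boolean array of shape (days, intervals); True marks a missing
    5-minute observation for this stock.
    """
    missing = np.asarray(missing, dtype=bool)
    n_intervals = missing.shape[1]
    # Criterion 1: all of the first 10 observations are missing on some day.
    first10_missing = missing[:, :10].all(axis=1).any()
    # Criterion 2: more than 50 missing values before the first trade of the day,
    # summed over the year (assumed interpretation).
    observed = ~missing
    first_trade = np.where(observed.any(axis=1), observed.argmax(axis=1), n_intervals)
    leading_missing = int(first_trade.sum())
    # Criterion 3: more than 500 missing values in the year.
    total_missing = int(missing.sum())
    return bool(first10_missing or leading_missing > 50 or total_missing > 500)
```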
Data summary
         2003   2004   2005   2006   2007   2008   2009   2010   2011   2012
Fraction of increments identified as jumps
a=3      0.011  0.011  0.011  0.010  0.010  0.009  0.008  0.008  0.007  0.008
a=4      0.002  0.002  0.002  0.002  0.002  0.001  0.001  0.001  0.001  0.001
a=4.5    0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.000  0.001
Fraction of variation explained by jumps
a=3      0.19   0.19   0.19   0.16   0.21   0.16   0.16   0.15   0.12   0.15
a=4      0.07   0.07   0.07   0.05   0.10   0.06   0.06   0.06   0.03   0.05
a=4.5    0.05   0.04   0.05   0.04   0.08   0.04   0.05   0.05   0.02   0.04
Fraction of jump correlation explained by the first jump factor
a=3      0.05   0.03   0.03   0.03   0.06   0.07   0.08   0.19   0.12   0.06
a=4      0.03   0.02   0.02   0.04   0.08   0.06   0.08   0.25   0.09   0.08
a=4.5    0.03   0.03   0.02   0.05   0.09   0.06   0.08   0.22   0.12   0.09
Fraction of continuous correlation explained by the first 4 continuous factors
         0.26   0.20   0.21   0.22   0.29   0.45   0.40   0.40   0.47   0.31
1 Fraction of increments identified as jumps for different thresholds.
2 Fraction of variation explained by jumps for different thresholds.
3 Systematic jump correlation for different thresholds.
4 Systematic continuous correlation
Stability of continuous factor structure for 6 years
2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00
1.00  1.00  1.00  1.00  1.00  1.00
0.99  0.99  1.00  1.00  1.00  0.98
0.97  0.96  0.98  0.99  0.99  0.98
Generalized correlations of yearly 4 continuous factors with 4 continuous factors for 2007-2012
⇒ Continuous factors are very stable over time
⇒ Same results for estimation with yearly or 6 year horizon
Stability of continuous factor structure for 10 years
2003  2004  2005  2006  2007  2008  2009  2010  2011  2012
1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.97  0.99  0.99  0.99  1.00  1.00  1.00  1.00  0.99  0.99
0.95  0.97  0.98  0.99  0.99  0.99  0.97  0.98  0.99  0.98
0.47  0.63  0.17  0.67  0.99  0.99  0.94  0.92  0.97  0.96
Generalized correlations of yearly 4 continuous factors with 4 continuous factors for 2003-2012
⇒ 4th continuous factor disappears in 2003-2006
Stability of continuous factor structure for 1 year
1     2     3     4     5     6     7     8     9     10    11    12
1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.99  0.99  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
0.99  0.99  0.99  0.99  0.99  1.00  1.00  0.99  0.99  1.00  0.99  0.99
0.98  0.93  0.99  0.97  0.98  0.98  0.98  0.99  0.99  0.96  0.90  0.96
Generalized correlations of monthly continuous factors with yearly continuous factors in 2011.
The yearly number of factors is K = 4.
⇒ Continuous factors are highly stable within a year horizon
⇒ Same results for estimation with yearly or monthly horizon
Robustness of identification factors: Daily data
Fama-French factors do not explain statistical factors
Use CRSP daily excess returns to form factor portfolios for 2007-2012:
4 statistical factors based on continuous loadings
4 economic industry portfolios
4 Fama-French Carhart factors
Generalized correlations based on daily data
Generalized correlations with 4 economic industry factors: 1.00 0.97 0.92 0.79
Generalized correlations with 4 Fama-French Carhart factors: 0.95 0.74 0.60 0.00
⇒ Fama-French factors do not explain statistical factors
Identification of factors: Statistical test
           4 statistical and 3 economic     4 statistical and 4 economic
           ρ̂     SD     95% CI              ρ̂     SD     95% CI
2007-2012  2.72  0.001  (2.71, 2.72)        3.31  0.003  (3.30, 3.31)
2007       2.55  0.06   (2.42, 2.67)        3.21  0.01   (3.19, 3.22)
2008       2.66  0.08   (2.51, 2.81)        3.18  0.29   (2.62, 3.75)
2009       2.86  0.10   (2.67, 3.05)        3.42  0.15   (3.14, 3.71)
2010       2.80  0.04   (2.72, 2.88)        3.38  0.01   (3.37, 3.39)
2011       2.82  0.00   (2.82, 2.82)        3.47  0.06   (3.35, 3.58)
2012       2.62  0.03   (2.56, 2.68)        3.25  0.01   (3.24, 3.26)
Total generalized correlations (= sum of squared generalized correlations) with standard deviations and confidence intervals
3 economic factors (market, oil and finance) and 4 economic factors (additional electricity factor).
Values of 3 and 4, respectively, mean perfect replication
⇒ Reject hypothesis of perfect replication
Number of total factors
[Figure, three panels: Perturbed Eigenvalue Ratio. x-axis: 2 to 20; y-axis: perturbed ER (1 to 1.4). Legends: 2012, 2011, 2010, critical value; 2009, 2008, 2007, critical value; 2006, 2005, 2004, 2003, critical value.]
Number of total factors
Identification of jump factors
Discontinuous factors
Jump factors are different from continuous factors
Results for jump factors depend on the jump threshold, while results for continuous factors are robust to this choice
Only 1 stable factor: equally weighted market jump factor
The higher the jump threshold, the less stable the jump factor structure
Interpretation of jump factors
       2007-2012  2007  2008  2009  2010  2011  2012
a=3    1.00       1.00  1.00  0.99  1.00  1.00  1.00
       0.85       0.95  0.62  0.86  0.81  0.86  0.83
       0.61       0.77  0.40  0.76  0.31  0.61  0.59
       0.21       0.10  0.22  0.50  0.10  0.20  0.28
a=4    0.99       0.99  0.95  0.94  1.00  0.99  0.99
       0.74       0.53  0.41  0.59  0.90  0.53  0.57
       0.31       0.35  0.29  0.44  0.39  0.35  0.42
       0.03       0.19  0.20  0.09  0.05  0.14  0.16
a=4.5  0.99       0.99  0.91  0.91  1.00  0.98  0.99
       0.75       0.54  0.41  0.56  0.93  0.55  0.75
       0.29       0.35  0.30  0.40  0.68  0.38  0.29
       0.05       0.18  0.22  0.04  0.08  0.03  0.05
Table: Generalized correlations of market, oil, finance and electricity jump factors with the first 4 jump factors from 2007-2012
Number of jump factors
[Figure, three panels: Perturbed Eigenvalue Ratio. x-axis: 2 to 20; y-axis: perturbed ER (1 to 1.4). Legends: 2012, 2011, 2010, critical value; 2009, 2008, 2007, critical value; 2006, 2005, 2004, 2003, critical value.]
Number of jump factors with truncation level a = 3.
Number of jump factors
Figure: Perturbed eigenvalue ratios (vertical axis "Perturbed ER", 1 to 1.4; horizontal axis 2 to 20) in three panels: 2010-2012, 2007-2009 and 2003-2006, each with its critical value.
Number of jump factors with truncation level a = 4.
Number of jump factors
Figure: Perturbed eigenvalue ratios (vertical axis "Perturbed ER", 1 to 1.4; horizontal axis 2 to 20) in three panels: 2010-2012, 2007-2009 and 2003-2006, each with its critical value.
Number of jump factors with truncation level a = 4.5.
Stability of jump factor structure
         2007   2008   2009   2010   2011   2012
a=3      1.00   1.00   1.00   1.00   1.00   1.00
         0.96   1.00   0.95   0.98   0.76   0.85
         0.81   0.88   0.84   0.69   0.59   0.70
         0.12   0.81   0.14   0.25   0.05   0.17
a=4      1.00   1.00   0.99   1.00   0.99   0.99
         0.66   0.99   0.63   1.00   0.51   0.43
         0.34   0.52   0.09   0.97   0.43   0.20
         0.14   0.03   0.05   0.17   0.13   0.03
a=4.5    0.99   0.99   0.98   1.00   0.99   0.99
         0.79   0.97   0.77   1.00   0.40   0.49
         0.28   0.44   0.28   0.96   0.26   0.24
         0.05   0.16   0.01   0.53   0.11   0.07
Table: Generalized correlations of 4 largest yearly jump factors with 4 jump factors for 2007-2012
Comparison of factor portfolios based on HF data and daily returns
2003   2004   2005   2006   2007   2008   2009   2010   2011   2012
1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
0.99   0.97   0.99   0.99   0.99   1.00   0.99   0.98   0.99   0.99
0.98   0.94   0.94   0.97   0.95   0.98   0.98   0.98   0.98   0.97
0.55   0.65   0.83   0.17   0.47   0.76   0.98   0.96   0.93   0.93
Generalized correlations between continuous factors based on continuous data and daily data
Continuous factors are constructed based on loadings from HF and daily data
⇒ Daily returns give similar but noisier estimates of loadings and factors than continuous HF data.
Number of factors with daily CRSP returns
Figure: Perturbed eigenvalue ratios (vertical axis "Perturbed ER", 1 to 1.4; horizontal axis 2 to 20) in three panels: 2010-2012, 2007-2009 and 2003-2006, each with its critical value.
Number of factors based on daily CRSP returns
Estimating volatility
Estimating volatility with realized quadratic variation
Use quadratic variation for a short horizon (e.g. a day) as estimator for real-world volatility
Estimating implied volatility with option data
Use at-the-money implied Black-Scholes volatility for short-maturity options as estimator for risk-neutral volatility
Interpolate volatility surface to obtain theoretical at-the-money and short-maturity implied volatility
Which volatility estimator?
For factor analysis and leverage effect, real-world and risk-neutral volatility are essentially equivalent
Implied volatility is more reliable for volatility of volatility and leverage effect estimation
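The realized quadratic variation estimator mentioned above can be sketched in a few lines. This is only an illustration: the function names and the sample prices are hypothetical, and no annualization or noise correction is applied.

```python
import numpy as np

def realized_variance(log_prices):
    """Sum of squared log-price increments over one horizon (e.g. one day)."""
    increments = np.diff(np.asarray(log_prices, dtype=float))
    return float(np.sum(increments ** 2))

def realized_volatility(log_prices):
    """Square root of the realized variance, in the units of the horizon."""
    return float(np.sqrt(realized_variance(log_prices)))

# Hypothetical intraday log-prices for a single day
day = [0.00, 0.01, 0.00, 0.02]
print(realized_variance(day), realized_volatility(day))
```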
Data
Data
Daily prices for standard call and put options from OptionMetrics
Same firms and time period as for HF data
Implied volatility for 30-day at-the-money options using interpolated volatility surface
Average the implied call and put volatilities
More robust than generalized implied volatility
Data cleaning: Remove the volatilities greater than 200% of the average of a 31-day moving window centered at the day
Year       2003    2004    2005    2006    2007    2008    2009    2010    2011    2012
Original   408     479     507     525     543     557     565     561     549     565
Cleaned    399     465     479     495     508     528     536     530     523     529
Dropped    2.21%   2.92%   5.52%   5.71%   6.45%   5.21%   5.13%   5.53%   4.74%   6.37%
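A minimal sketch of the moving-window cleaning rule stated above, applied to one firm's implied volatility series. Several details are assumptions not given on the slide: the window average includes the day itself, the window is shortened at the start and end of the sample, and flagged observations are set to NaN. The function name is hypothetical.

```python
import numpy as np

def clean_implied_vol(iv, window=31, max_ratio=2.0):
    """Flag observations above max_ratio times the centered moving average."""
    iv = np.asarray(iv, dtype=float)
    half = window // 2
    keep = np.ones(iv.shape, dtype=bool)
    for t in range(iv.size):
        lo, hi = max(0, t - half), min(iv.size, t + half + 1)
        # Assumption: the average includes day t and ignores missing values
        keep[t] = iv[t] <= max_ratio * np.nanmean(iv[lo:hi])
    return np.where(keep, iv, np.nan), keep

# Hypothetical series: flat 20% volatility with one spike
series = np.full(61, 0.2)
series[30] = 5.0
cleaned, keep = clean_implied_vol(series)
print(int(keep.sum()), "of", keep.size, "observations kept")
```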
Number of volatility factors
Figure: Perturbed eigenvalue ratios (vertical axis "Perturbed ER", 1 to 1.4; horizontal axis 2 to 20) in three panels: 2010-2012, 2007-2009 and 2003-2006, each with its critical value.
Number of volatility factors
Identification of volatility factors
Time-variation and identification of volatility factors
2007   2008   2009   2010   2011   2012
1.00   1.00   1.00   1.00   1.00   1.00
0.19   0.90   0.92   0.33   0.67   0.28
0.07   0.34   0.13   0.06   0.11   0.05
0.01   0.05   0.00   0.00   0.01   0.01
Table: Generalized correlation of market, oil and finance volatility factors with the first 4 largest statistical volatility factors
⇒ 1 stable market factor and 1 temporary finance factor
Leverage effect in 2012: Componentwise
Figure: Componentwise leverage effect in 2012 based on implied and high-frequency volatilities. 4 continuous asset factors. Series: LEV(total,total), LEV(syst,total), LEV(idio,total) and their (HF) counterparts; horizontal axis 0 to 600, vertical axis −0.8 to 0.6.
Leverage effect in 2012: Componentwise
Figure: Componentwise leverage effect in 2012 with daily continuous log price increments LEV (cont) and daily returns LEV (day) and 4 asset factors. Series: LEV(total,total), LEV(syst,total), LEV(idio,total), each for (cont) and (day); horizontal axis 0 to 600, vertical axis −0.8 to 0.6.
Leverage effect in 2012: Componentwise
Figure: Componentwise leverage effect in 2012 with 4 continuous daily factors LEV (stat) or 4 Fama-French-Carhart factors LEV (FFC). Series: LEV(total,total), LEV(syst,total) and LEV(idio,total) for (stat) and (FFC); horizontal axis 0 to 600, vertical axis −0.8 to 0.6.
Reversal Effect
Figure: Overnight reversal. Expected intraday return (vertical axis, −0.5 to 0.5) against expected return (horizontal axis, −0.5 to 0.5).
⇒ Expected overnight and intraday returns are strongly negatively correlated
Predicted Reversal Effect
Figure: Predicted intraday return against predicted overnight return (both axes −0.5 to 0.5) in three panels: PCA, Fama-French 3 and Market.
⇒ PCA and Fama-French factors predict the reversal relationship in expected returns
Assumptions
Assumption 1: Weak dependence of error terms
The largest eigenvalue of the residual quadratic covariation matrix is bounded in probability, i.e.
$$\lambda_1([e, e]) = O_p(1).$$
Define the instantaneous predictable quadratic covariation as
$$\frac{d\langle e_i, e_k\rangle_t}{dt} =: G_{i,k}(t).$$
Largest eigenvalue of the matrix $G(t)$ is almost surely bounded for all $t$:
$$\lambda_1(G(t)) < C \text{ a.s. for all } t \text{ for some constant } C.$$
⇒ Approximate factor structure: Residual risk can be diversified
Factor structure assumptions
Assumption 2: Weaker dependence of error terms
The row sum of the quadratic covariation of the residuals is bounded in probability:
$$\sum_{i=1}^{N} \|[e_k, e_i]\| = O_p(1) \quad \forall k = 1, \dots, N$$
⇒ Stronger than bounded eigenvalues in Assumption 1.
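One way to see why a row-sum bound is the stronger condition, sketched under the extra assumption that the bound holds uniformly in $k$ (the slide states it for each $k$): for a symmetric matrix the largest eigenvalue is dominated by the largest absolute row sum, so bounded row sums force a bounded top eigenvalue, while the converse need not hold.

$$\lambda_1([e,e]) \;\le\; \max_{1 \le k \le N} \sum_{i=1}^{N} \big|[e_k, e_i]\big|$$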
Consistency of loadings
Theorem 1: Consistency of estimators
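Define the rate $\delta = \min(N, M)$. Then for $\delta \to \infty$:
1 Consistency of loadings estimator: Under Assumption 1
$$\hat\Lambda_i - H^\top \Lambda_i = O_p\left(\frac{1}{\sqrt{\delta}}\right).$$
2 Consistency of factor estimator and common component ($C_{T,i} = F_T \Lambda_i$ and $\hat C_{T,i} = \hat F_T \hat\Lambda_i$): Under Assumptions 1 and 2
$$\hat F_T - H^{-1} F_T = O_p\left(\frac{1}{\sqrt{\delta}}\right), \qquad \hat C_{T,i} - C_{T,i} = O_p\left(\frac{1}{\sqrt{\delta}}\right).$$
⇒ Need both N and M to go to ∞
⇒ Curse of dimensionality turns into “blessing”.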
⇒ Full-rank matrix H known
Separating jumps from continuous movements
Separating continuous and jump part
Intuition: Large increments = jumps
Truncation estimator for continuous part:
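$$\hat X^C_{j,i} = X_{j,i}\, \mathbb{1}_{\{|X_{j,i}| \le a_{j,i}\Delta^{\omega}\}}$$
instead of $X_{j,i}$ with $a_{j,i} > 0$ and $\omega \in (0, \tfrac{1}{2})$
Truncation estimator for jump part:
$$\hat X^D_{j,i} = X_{j,i}\, \mathbb{1}_{\{|X_{j,i}| > a_{j,i}\Delta^{\omega}\}}$$
Set $a_{j,i}\Delta^{\omega} = a \cdot \sigma^C_{j,i} \cdot \Delta^{0.49}$ with $\sigma^C_{j,i}$ local window estimator of volatility and $a = 3$, $4$ and $4.5$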
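The truncation rule above can be sketched as follows. This is a hedged illustration, not the author's implementation: the function name is hypothetical, and when no volatility estimate is supplied the code falls back on a whole-sample bipower-variation proxy per asset, which is an assumption standing in for the local window estimator mentioned on the slide.

```python
import numpy as np

def truncate_increments(X, delta, a=3.0, omega=0.49, sigma=None):
    """Split increments X (M x N) into a continuous and a jump part.

    delta is the length of one observation interval; sigma is an optional
    per-asset volatility estimate of shape (N,).
    """
    X = np.asarray(X, dtype=float)
    if sigma is None:
        # Assumption: constant-volatility proxy from bipower variation
        bv = (np.pi / 2.0) * np.sum(np.abs(X[1:]) * np.abs(X[:-1]), axis=0)
        sigma = np.sqrt(bv / (delta * X.shape[0]))
    threshold = a * sigma * delta ** omega      # a * sigma * Delta^omega
    is_jump = np.abs(X) > threshold
    X_cont = np.where(is_jump, 0.0, X)          # increments kept as continuous
    X_jump = np.where(is_jump, X, 0.0)          # increments classified as jumps
    return X_cont, X_jump

# Hypothetical data: Brownian increments with one planted jump
rng = np.random.default_rng(0)
delta = 1.0 / 1000
X = rng.normal(scale=0.2 * np.sqrt(delta), size=(1000, 5))
X[10, 0] = 0.5
X_cont, X_jump = truncate_increments(X, delta, a=3.0)
print(np.count_nonzero(X_jump), "increments flagged as jumps")
```

By construction the two parts add back up to the original increments, so the split loses no information.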
Estimation of jump and continuous risk
Estimate jump and continuous factors
$$X(t) = \Lambda^C F^C(t) + \Lambda^D F^D(t) + e(t).$$
Truncation estimator
Use truncation estimator to construct estimators for
$[X,X]$ (total risk covariance), $[X,X]^C$ (continuous risk covariance), $[X,X]^D$ (jump risk covariance)
Apply PCA to $\frac{1}{N}\hat X^{C\top}\hat X^C$ ⇒ $\hat\Lambda^C$ and $\hat F^C$
Apply PCA to $\frac{1}{N}\hat X^{D\top}\hat X^D$ ⇒ $\hat\Lambda^D$ and $\hat F^D$
⇒ Estimators $\hat\Lambda^C$, $\hat\Lambda^D$, $\hat F^C$ and $\hat F^D$ are defined analogously to $\hat\Lambda$ and $\hat F$, but using $\hat X^C$ and $\hat X^D$ instead of $X$.
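A minimal sketch of the PCA step, applicable to the total, continuous or jump increments alike. The normalization used here (loadings equal to $\sqrt{N}$ times the leading eigenvectors, factors equal to $X\hat\Lambda/N$) is an assumption for illustration, and the function name is hypothetical.

```python
import numpy as np

def pca_factors(X, K):
    """Estimate K loadings and factor increments from increments X (M x N)."""
    X = np.asarray(X, dtype=float)
    M, N = X.shape
    eigval, eigvec = np.linalg.eigh(X.T @ X / N)   # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:K]           # indices of the K largest
    loadings = np.sqrt(N) * eigvec[:, order]       # N x K, loadings' loadings / N = I
    factors = X @ loadings / N                     # M x K factor increments
    return loadings, factors

# Hypothetical noise-free two-factor panel
rng = np.random.default_rng(0)
F = rng.normal(size=(50, 2))
Lam = rng.normal(size=(20, 2))
X = F @ Lam.T
loadings, factors = pca_factors(X, K=2)
print(np.allclose(factors @ loadings.T, X))
```

Loadings and factors are only identified up to an invertible rotation, so the estimates should be compared with the truth through the common component or through generalized correlations rather than entry by entry.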
Assumptions
Assumption 3: Truncation identification
$F$ and $e_i$ have only finite activity jumps
Factor jumps are not “hidden” by idiosyncratic jumps:
$$P\big(\Delta X_i(t) = 0 \text{ if } \Delta(\Lambda_i^\top F(t)) \neq 0 \text{ and } \Delta e_i(t) \neq 0\big) = 0.$$
(Always satisfied as soon as the Lévy measures of $F$ and $e$ have a density)
Continuous $[F^C, F^C]$ and jump $[F^D, F^D]$ covariation matrices and $\lim_{N\to\infty} \frac{\Lambda^{C\top}\Lambda^C}{N}$ and $\lim_{N\to\infty} \frac{\Lambda^{D\top}\Lambda^D}{N}$ are full rank.
(Systematic jump factors need to jump in $[0,T]$ and affect many assets.)
Separating jumps from continuous movements
Theorem 2: Separating continuous and jump factors
Assumptions 1 and 3 hold and $\delta \to \infty$:
1 The continuous and jump loadings can be estimated consistently:
$$\hat\Lambda^C_i = H^{C\top}\Lambda^C_i + o_p(1), \qquad \hat\Lambda^D_i = H^{D\top}\Lambda^D_i + o_p(1).$$
2 Additionally Assumption 2. The continuous and jump factors can only be estimated up to a finite variation bias term
$$\hat F^C_T = (H^C)^{-1} F^C_T + o_p(1) + \text{finite variation term}$$
$$\hat F^D_T = (H^D)^{-1} F^D_T + o_p(1) + \text{finite variation term}.$$
3 Additionally Assumption 2. Consistent estimation of the covariation with any Itô-semimartingale $Y(t)$ for $\frac{\sqrt{M}}{N} \to 0$:
$$\sum_{j=1}^{M} \hat F^C_j Y_j = (H^C)^{-1}[F^C, Y]_T + o_p(1)$$
$$\sum_{j=1}^{M} \hat F^D_j Y_j = (H^D)^{-1}[F^D, Y]_T + o_p(1)$$
Number of factors
Estimating number of factors
Intuition: Number of factors = number of large eigenvalues
Denote the ordered eigenvalues of $X^\top X$ by $\lambda_1 \ge \dots \ge \lambda_N$.
Under Assumption 1 and for $O\left(\frac{N}{M}\right) \le O(1)$:
systematic eigenvalues $\lambda_1, \dots, \lambda_K = O_p(N)$
non-systematic eigenvalues $\lambda_k = O_p(1)$ for $k = K+1, \dots, N$.
Possible estimation strategy: Look at eigenvalue ratios $\frac{\lambda_k}{\lambda_{k+1}}$
⇒ Problem: non-systematic eigenvalues not bounded from below
Popular estimators with good performance use arguments of random matrix theory to obtain
1 lower bounds on non-systematic eigenvalues
2 clustering of non-systematic eigenvalues
Random matrix theory assumptions too restrictive for our model
Number of factors
Theorem 3: Number of factors
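Assumption 1 and $O\left(\frac{N}{M}\right) \le O(1)$:
Choose sequence $g(N,M)$ s.t. $\frac{g(N,M)}{N} \to 0$ and $g(N,M) \to \infty$.
Define perturbed eigenvalues and eigenvalue ratio statistics:
$$\hat\lambda_k = \lambda_k + g(N,M), \qquad ER_k = \frac{\hat\lambda_k}{\hat\lambda_{k+1}} \quad \text{for } k = 1, \dots, N-1$$
⇒ $ER_k$ clusters around 1 for $k = K+1, \dots, N-1$ and explodes for $k = K$
The estimator for the number of factors is for any $\gamma > 0$:
$$\hat K(\gamma) = \max\{k \le N-1 : ER_k > 1 + \gamma\}$$
⇒ $\hat K(\gamma) \xrightarrow{p} K$ for any $\gamma > 0$.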
Additionally Assumption 3: Consistent estimation of KC and KD
Number of factors
Number of factors
Two tuning parameters
Perturbation: I recommend $g(N,M) = \sqrt{N} \cdot \mathrm{median}(\lambda_1, \dots, \lambda_N)$. (Results very robust to choice of perturbation)
Cut-off: $\gamma$ between 0.05 and 0.2
My estimator focuses on residuals spectrum:
⇒ Robust to strong and weak factors
Perturbation avoids random matrix theory
⇒ Weaker assumptions and estimation of continuous and jump factors possible
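The perturbed eigenvalue ratio estimator with the recommended perturbation can be sketched as below. The function name and the simulated panel are hypothetical; returning 0 when no ratio exceeds the cut-off is a convention chosen here, not something stated on the slides.

```python
import numpy as np

def estimate_num_factors(X, gamma=0.1):
    """Perturbed eigenvalue ratio estimate of the number of factors.

    X holds increments (M x N); gamma is the cut-off above 1.
    """
    X = np.asarray(X, dtype=float)
    M, N = X.shape
    eig = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]   # lambda_1 >= ... >= lambda_N
    g = np.sqrt(N) * np.median(eig)                    # recommended perturbation
    perturbed = eig + g
    er = perturbed[:-1] / perturbed[1:]                # ER_k for k = 1, ..., N-1
    above = np.nonzero(er > 1.0 + gamma)[0]
    k_hat = int(above[-1]) + 1 if above.size else 0    # largest k with ER_k > 1 + gamma
    return k_hat, er

# Hypothetical panel with three strong factors and i.i.d. noise
rng = np.random.default_rng(0)
M, N, K = 200, 100, 3
F = rng.normal(size=(M, K)) / np.sqrt(M)
Lam = rng.normal(size=(N, K))
e = rng.normal(size=(M, N)) / np.sqrt(M)
k_hat, er = estimate_num_factors(F @ Lam.T + e, gamma=0.2)
print("estimated number of factors:", k_hat)
```

The perturbation dampens the ratios among the small residual eigenvalues, which is why the statistic does not need a lower bound on those eigenvalues.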
Simulations
Number of factors
K = 3 factors
Dominant first factor: $\sigma_{dominant} = \sqrt{10}$
Large signal-to-noise ratio $\theta = 6$
Cross-sectional correlation: Toeplitz matrix $A$ with parameters $\{1, 0.5, 0.5, 0.5, 0.5^2\}$.
Toy model to make it comparable with other estimators that require stronger assumptions:
$$X(t) = \Lambda W_F(t) + \theta A W_e(t)$$
Simulations
Number of factors
ERP1: Perturbation $g(N,M) = \sqrt{N} \cdot \mathrm{median}\{\lambda_1, \dots, \lambda_N\}$, cutoff $\gamma = 0.2$
ERP2: Perturbation $g(N,M) = \log N \cdot \mathrm{median}\{\lambda_1, \dots, \lambda_N\}$, cutoff $\gamma = 0.2$
Onatski (2010): Clustering in eigenvalue-difference
Ahn and Horenstein (2013): Maximum in eigenvalue ratio
Bai and Ng (2002): Information criterion
Simulations: Number of Factors
Figure: RMSE (root-mean squared error) for the number of factors for different estimators with N = M. Panel title: Error in estimating the number of factors. Series: ER perturbed 1, ER perturbed 2, Onatski, Ahn, Bai; horizontal axis N, M (20 to 220), vertical axis RMSE (0 to 4).
Simulations: Number of Factors
          ERP1   ERP2   Onatski   Ahn    Bai
RMSE      0.32   0.18   0.49      4.00   3.74
Mean      2.79   2.88   2.76      1.00   1.09
Median    3      3      3         1      1
SD        0.52   0.41   0.66      0.00   0.28
Min       1      1      1         1      1
Max       3      4      5         1      2
Table: N = M = 125.
Simulations
Simulation parameters
Heston-type stochastic volatility model with jumps.
K factors are modeled as
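$$dF_k(t) = \left(\mu - \sigma^2_{F_k}(t)\right)dt + \rho_F\,\sigma_{F_k}(t)\,dW_{F_k}(t) + \sqrt{1-\rho_F^2}\,\sigma_{F_k}(t)\,d\tilde W_{F_k}(t) + J_{F_k}\,dN_{F_k}(t)$$
$$d\sigma^2_{F_k}(t) = \kappa_F\left(\alpha_F - \sigma^2_{F_k}(t)\right)dt + \gamma_F\,\sigma_{F_k}(t)\,dW_{F_k}(t)$$
N residual processes follow similar dynamics
Brownian motions $W_F$, $\tilde W_F$, $W_e$, $\tilde W_e$ independent.
Standard parameter values: $\kappa_F = \kappa_e = 5$, $\gamma_F = \gamma_e = 0.5$, $\rho_F = -0.8$, $\rho_e = -0.3$, $\mu = 0.05$, $\alpha_F = \alpha_e = 0.1$, $T = 1$.
Compound Poisson process with intensity $\nu_F = \nu_e = 6$ and normally distributed jumps with $J_{F_k} \sim N(-0.1, 0.5)$ and $J_{e_i} \sim N(0, 0.5)$.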
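A rough Euler discretization of one factor path under the dynamics above, using the listed standard parameter values as defaults. Several choices are assumptions made only for this sketch: the variance is driven by the Brownian motion that enters the price with weight $\rho_F$, the second parameter of the jump distribution is read as a variance, the variance starts at its long-run level $\alpha_F$, and negative variances are floored at zero inside the square root. The function name is hypothetical.

```python
import numpy as np

def simulate_factor(M=250, T=1.0, mu=0.05, kappa=5.0, alpha=0.1, gamma=0.5,
                    rho=-0.8, nu=6.0, jump_mean=-0.1, jump_var=0.5, seed=0):
    """Euler scheme for one Heston-type factor with compound Poisson jumps."""
    rng = np.random.default_rng(seed)
    dt = T / M
    var = np.empty(M + 1)
    var[0] = alpha                               # assumption: start at long-run level
    dF = np.empty(M)
    for j in range(M):
        v = max(var[j], 0.0)                     # floor keeps the square root defined
        dW = rng.normal(0.0, np.sqrt(dt))        # shared with the variance equation
        dW_tilde = rng.normal(0.0, np.sqrt(dt))  # independent price shock
        n_jumps = rng.poisson(nu * dt)
        jump = rng.normal(jump_mean, np.sqrt(jump_var), size=n_jumps).sum()
        dF[j] = ((mu - v) * dt
                 + np.sqrt(v) * (rho * dW + np.sqrt(1.0 - rho ** 2) * dW_tilde)
                 + jump)
        var[j + 1] = var[j] + kappa * (alpha - v) * dt + gamma * np.sqrt(v) * dW
    return dF, var

dF, var = simulate_factor()
print(dF.shape, var.shape)
```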
Simulations
Simulation parameters
Factors and residuals follow Heston-type stochastic volatility model with jumps
Standard parameters for dynamics
$$X(t) = \Lambda F(t) + \theta A e(t)$$
Toeplitz matrix $A$ captures cross-sectional dependence
$\theta$ measures signal-to-noise ratio
Strong and weak factors: First factor scaled up by $\sigma_{dominant}$
Consistency: Measure correlation of estimated factor with true factor for $K = 1$
Simulations
Simulation parameters
Asymptotic distribution:
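$$CLT_C = \left(\frac{1}{N} W_{T,i} + \frac{1}{M} V_{T,i}\right)^{-1/2}\left(\hat C_{T,i} - C_{T,i}\right)$$
$$CLT_F = \sqrt{N}\,\Theta_F^{-1/2}\left(\hat F_T - H^{-1}F_T\right)$$
$$CLT_\Lambda = \sqrt{M}\,\Theta_{\Lambda,i}^{-1/2}\left(\hat\Lambda_i - H^\top\Lambda_i\right)$$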
Theory predicts standard normal distribution N(0, 1)
Simulations: Consistency
N=200, M=250            Case 1                          Case 2   Case 3
                        Total   Continuous   Jump
Correlation Factors     0.994   0.944        0.972      0.997    0.997
SD Factors              0.012   0.065        0.130      0.001    0.000
Correlation Loadings    0.995   0.994        0.975      0.998    0.998
SD Loadings             0.010   0.008        0.127      0.001    0.000
Case 1: Stochastic volatility with jumps and residual cross-sectionalcorrelation
Case 2: Stochastic volatility and residual cross-sectional correlation
Case 3: Brownian motions only
Simulations: Asymptotic Distribution
Figure: N = 200 and M = 250. Histogram of standardized common components $CLT_C$, factors $CLT_F$ and loadings $CLT_\Lambda$ (three panels: Common components, Factors, Loadings; horizontal axis −5 to 5). The normal density function is superimposed on the histograms. Stochastic volatility with jumps and residual cross-sectional correlation.
Simulations: Number of Factors
Figure: Perturbed eigenvalue ratios (ERP1) for Heston-type stochastic volatility model with $K = 3$, $K^C = 3$, $K^D = 1$, $\sigma_{dominant} = 3$, $N = 200$ and $M = 250$ for 100 simulated paths. Three panels (Total number of factors, Total number of continuous factors, Total number of jump factors), each plotting "ER perturbed" against k from 2 to 14.
Asymptotic distribution
Key idea: Asymptotic expansion
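Loadings:
$$\sqrt{M}\left(\hat\Lambda_i - H^\top\Lambda_i\right) = V_{MN}^{-1}\left(\frac{\hat\Lambda^\top\Lambda}{N}\right)\sqrt{M}\,F^\top e_i + O_p\left(\frac{\sqrt{M}}{\delta}\right)$$
⇒ Apply asymptotic distribution results of quadratic covariation estimator to $\sqrt{M}\,F^\top e_i$
Factors:
$$\sqrt{N}\left(\hat F_T - H^{-1}F_T\right) = \frac{1}{\sqrt{N}}\,e_T\Lambda H + O_P\left(\frac{\sqrt{N}}{\sqrt{M}}\right) + O_p\left(\frac{\sqrt{N}}{\delta}\right)$$
⇒ Apply martingale central limit theorem to $\frac{1}{\sqrt{N}}\,e_T\Lambda H$
We obtain stable convergence in law (stronger than convergence in distribution) and a mixed-Gaussian limit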
Asymptotic distribution of loadings
Theorem 4: Asymptotic distribution of loadings
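Assumptions 1 and 2 hold. Then for $\delta = \min(N,M) \to \infty$:
$$\sqrt{M}\left(\hat\Lambda_i - H^\top\Lambda_i\right) = V_{MN}^{-1}\left(\frac{\hat\Lambda^\top\Lambda}{N}\right)\sqrt{M}\,F^\top e_i + O_p\left(\frac{\sqrt{M}}{\delta}\right)$$
If $\frac{\sqrt{M}}{N} \to 0$:
$$\sqrt{M}\left(\hat\Lambda_i - H^\top\Lambda_i\right) \xrightarrow{L\text{-}s} N\left(0,\; V^{-1}Q\,\Gamma_i\,Q^\top V^{-1}\right)$$
$$\Gamma_i = \int_0^T \sigma_F^2\,\sigma_{e_i}^2\,ds + \sum_{s \le T} \Delta F^2(s)\,\sigma_{e_i}^2(s) + \sum_{s' \le T} \Delta e_i^2(s')\,\sigma_F^2(s')$$
$V$ diagonal matrix of eigenvalues of $\Sigma_\Lambda^{1/2}\,\Sigma_F\,\Sigma_\Lambda^{1/2}$
$$\operatorname*{plim}_{N,M\to\infty} \frac{\hat\Lambda^\top\Lambda}{N} = Q$$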
Assumptions for asymptotic distribution of factors:
Assumption 4: Asymptotically negligible jumps of error terms
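Jumps of the martingale $\frac{1}{\sqrt{N}}\sum_{i=1}^{N} e_i(t)$ are asymptotically negligible:
$$\frac{\Lambda^\top [e,e]_t\,\Lambda}{N} \xrightarrow{p} \langle Z, Z\rangle_t, \qquad \frac{\Lambda^\top \langle e^D, e^D\rangle_t\,\Lambda}{N} \xrightarrow{p} 0 \quad \forall t > 0,$$
with $Z$ some continuous square integrable martingale with quadratic variation $\langle Z, Z\rangle_t$.
Assumption 5: Even weaker dependence of error terms
weak serial dependence of error terms
weaker cross-sectional dependence of error terms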
Asymptotic distribution of the factors
Theorem 5: Asymptotic distribution of the factors
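Assumptions 1-2 hold. Then for $\delta = \min(N,M) \to \infty$:
$$\sqrt{N}\left(\hat F_T - H^{-1}F_T\right) = \frac{1}{\sqrt{N}}\,e_T\Lambda H + O_P\left(\frac{\sqrt{N}}{\sqrt{M}}\right) + O_p\left(\frac{\sqrt{N}}{\delta}\right)$$
If Assumptions 4 and 5 hold and $\frac{\sqrt{N}}{M} \to 0$, or only Assumption 4 holds and $\frac{N}{M} \to 0$:
$$\sqrt{N}\left(\hat F_T - H^{-1}F_T\right) \xrightarrow{L\text{-}s} N\left(0,\; Q^{-1\top}\,\Phi_T\,Q^{-1}\right)$$
$\Phi_T = \operatorname*{plim}_{N\to\infty} \frac{\Lambda^\top [e,e]\,\Lambda}{N}$ and $Q^{-1} = \lim_{\delta\to\infty} H$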
Asymptotic distribution of the common components
Theorem 6: Asymptotic distribution of the common components
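Define $C_{T,i} = \Lambda_i^\top F_T$ and $\hat C_{T,i} = \hat\Lambda_i^\top \hat F_T$. Assumptions 1-4 hold.
1 If Assumption 5 holds, then for any sequence $N, M$
$$\frac{\sqrt{\delta}\left(\hat C_{T,i} - C_{T,i}\right)}{\sqrt{\frac{\delta}{N} W_{T,i} + \frac{\delta}{M} V_{T,i}}} \xrightarrow{D} N(0,1)$$
2 Assume $\frac{N}{M} \to 0$ (but we do not require Assumption 5)
$$\frac{\sqrt{N}\left(\hat C_{T,i} - C_{T,i}\right)}{\sqrt{W_{T,i}}} \xrightarrow{D} N(0,1)$$
with $W_{T,i} = \Lambda_i^\top \Sigma_\Lambda^{-1}\,\Phi_T\,\Sigma_\Lambda^{-1}\Lambda_i$ and $V_{T,i} = F_T^\top \Sigma_F^{-1}\,\Gamma_i\,\Sigma_F^{-1} F_T$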
Estimation of Covariance Matrices
Feasible estimators of asymptotic covariance matrices
Loading estimator
Feasible estimation possible under Assumptions 1 and 2
Estimation based on local volatility estimation
Factors
Dimensionality problem: Need estimator of $N \times N$ matrix $[e, e]$
Stronger assumptions necessary, e.g. cross-sectional independence, parametric dependency or sparsity
Identifying factors
Problem: Factors only identified up to invertible linear transformations
Need a measure for how close vector spaces are to each other
Test for comparing two sets of factors
Set of $K_G$ economic candidate factors $G$
Generalized correlations are the $\min(K, K_G)$ largest eigenvalues of the matrix $[F,F]^{-1}[F,G][G,G]^{-1}[G,F]$
$F$ and $G$ describe the same factor model if the generalized correlations are all equal to 1
Test $H_0$: Sum of squared generalized correlations equal to $\min(K, K_G)$
Example: Generalized correlations $\{1, 1, 1\}$ for $K = K_G = 3$ ⇒ same factor space
Identifying factors
Total generalized correlation
Total generalized correlation (sum of squared correlations)
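$$\rho = \operatorname{trace}\left([F,F]^{-1}[F,G][G,G]^{-1}[G,F]\right).$$
Estimator for the total generalized correlation
$$\hat\rho = \operatorname{trace}\left((\hat F^\top \hat F)^{-1}(\hat F^\top G)(G^\top G)^{-1}(G^\top \hat F)\right).$$
Trace operator is a differentiable function and the quadratic covariation estimator is asymptotically mixed-normally distributed ⇒ Delta-method argument.
Theorem 7: Asymptotic distribution of $\hat\rho$
Assumption 1 and $\frac{\sqrt{M}}{N} \to 0$. Weak assumptions on $G$. Then for $\delta \to \infty$:
$$\sqrt{M}\left(\hat\rho - \rho\right) \xrightarrow{L\text{-}s} N(0, \Xi) \quad\text{and}\quad \frac{\sqrt{M}}{\sqrt{\Xi}}\left(\hat\rho - \rho\right) \xrightarrow{D} N(0,1)$$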
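The generalized correlation matrix and its trace can be computed directly from factor increments. The sketch below is illustrative only: the function name is hypothetical, realized covariations are formed as plain cross-products of the increment matrices, and the eigenvalues are returned exactly as defined on the slides (they coincide with squared canonical correlations of the uncentered increments).

```python
import numpy as np

def generalized_correlations(F, G):
    """Eigenvalues and trace of (F'F)^{-1} (F'G) (G'G)^{-1} (G'F).

    F is M x K (estimated factor increments), G is M x K_G (candidate factors).
    """
    F = np.asarray(F, dtype=float)
    G = np.asarray(G, dtype=float)
    FG = F.T @ G
    mat = np.linalg.solve(F.T @ F, FG) @ np.linalg.solve(G.T @ G, FG.T)
    eig = np.sort(np.linalg.eigvals(mat).real)[::-1]   # largest first
    k = min(F.shape[1], G.shape[1])
    return eig[:k], float(np.trace(mat))

# Hypothetical check: G is an invertible rotation of F, so the spaces coincide
rng = np.random.default_rng(0)
F = rng.normal(size=(50, 3))
R = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 3.0]])
gc, total = generalized_correlations(F, F @ R)
print(np.round(gc, 6), round(total, 6))
```

Because the statistic depends only on the spaces spanned by F and G, it is unaffected by the unknown rotation H in the estimated factors.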
Microstructure noise
Assumption 6: Microstructure noise
We observe the true asset price with noise $\tilde\varepsilon_{j,i}$:
$$Y_i(j) = X_i(t_j) + \tilde\varepsilon_{j,i}$$
Assume that the noise $\tilde\varepsilon$ is i.i.d., independent of $X$ and has bounded 4th moments.
The noise creates serial correlation in increments
The noise does not vanish for $M \to \infty$
Solution: Sparse sampling (e.g. 5 min)
Denote increments of noise by $\varepsilon_{j,i} = \tilde\varepsilon_{j+1,i} - \tilde\varepsilon_{j,i}$
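A short worked calculation may help make the two effects above concrete. The symbols $\eta_j$ for the noise level, $d_j = \eta_{j+1} - \eta_j$ for its increment and $\sigma^2$ for the noise variance are introduced here only for illustration. With i.i.d. noise,

$$\operatorname{Var}(d_j) = 2\sigma^2, \qquad \operatorname{Cov}(d_j, d_{j+1}) = -\sigma^2, \qquad \operatorname{Corr}(d_j, d_{j+1}) = -\tfrac{1}{2},$$

so observed increments inherit negative first-order serial correlation. Moreover

$$E\left[\sum_{j=1}^{M} d_j^2\right] = 2M\sigma^2,$$

which grows with the number of observations $M$ instead of shrinking, which is why sampling more finely does not remove the noise.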
Microstructure noise
Theorem 8: Upper bound on impact of noise
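Assumption 1 and $\frac{N}{M} \to c < 1$. Then for $N, M \to \infty$:
Upper bound on the impact of noise on largest residual eigenvalue:
$$\lambda_1\left(\frac{(e+\varepsilon)^\top(e+\varepsilon)}{N}\right) - \lambda_1\left(\frac{e^\top e}{N}\right) \le 2\,\lambda_{median}\left(\frac{1+\sqrt{c}}{1-\sqrt{c}}\right)^2 + o_p(1)$$
Variance of microstructure noise is bounded by
$$\sigma_\varepsilon^2 \le \frac{c}{2(1-\sqrt{c})^2}\cdot\lambda_{median} + o_p(1),$$
where $\lambda_{median}$ denotes the median eigenvalue of $\left(\frac{Y^\top Y}{N}\right)$.
Empirical upper bound on noise impact is small ⇒ noise does not affect the spectrum with 5 min sampling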
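The two bounds in Theorem 8 are simple functions of the median eigenvalue and the ratio $c$, so they are easy to evaluate on data. The sketch below is an illustration under assumptions: the function name is hypothetical, $c$ is replaced by the finite-sample ratio $N/M$, `Y` is taken to be the $M \times N$ matrix of observed increments, and the $o_p(1)$ terms are ignored.

```python
import numpy as np

def noise_bounds(Y):
    """Plug-in versions of the two upper bounds in Theorem 8.

    Returns (bound on the shift of the largest residual eigenvalue,
             bound on the noise variance).
    """
    Y = np.asarray(Y, dtype=float)
    M, N = Y.shape
    c = N / M                                   # finite-sample stand-in for c
    if not c < 1:
        raise ValueError("the bounds require N / M < 1")
    lam_median = np.median(np.linalg.eigvalsh(Y.T @ Y / N))
    root_c = np.sqrt(c)
    eigenvalue_bound = 2.0 * lam_median * ((1.0 + root_c) / (1.0 - root_c)) ** 2
    variance_bound = c / (2.0 * (1.0 - root_c) ** 2) * lam_median
    return float(eigenvalue_bound), float(variance_bound)

# Hypothetical increments; in practice Y would hold 5-minute log-price increments
rng = np.random.default_rng(0)
Y = rng.normal(scale=0.01, size=(400, 100))
print(noise_bounds(Y))
```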