When do portfolios based on the first
principal component have short positions?
Abstract
The first principal component of stock returns is often identified with the market
factor. Empirical portfolios based on this principal component sometimes
contain short positions. We analyze how the stock return correlations affect the
positivity of the dominant principal component. If all the correlations are positive,
this portfolio has positive weights. In practice some of the correlations can be
negative, and in this case the weights on the first principal component may or may
not contain negative values. We determine the characteristics of the correlation matrix
that lead to negative weights in the first principal component.
I. Introduction
The intuition that security returns can be explained by a set of common factors
is well entrenched in the finance literature. Sharpe (1963) assumed that a single
factor could explain the systemic component of returns. Ross (1976) assumed that
returns were generated by a set of common factors. These factors are derived from
the covariance matrix where the most important factors are associated with the
largest eigenvalues. The single most influential factor corresponds to the largest or
dominant eigenvalue. The factor loadings are obtained from the associated eigenvector.
This factor represents the linear combination of securities that explains the
largest fraction of the total variance. Typically it is identified as the market factor.
Several¹ papers discuss the connection between the first principal component of
stock returns and the market portfolio.
The market portfolio corresponds to the value weighted portfolio where the
weight of each stock is proportional to its market capitalization. In this case all
the portfolio weights are positive. To avoid confusion we will refer to the portfolio
constructed from the first principal component as the market factor portfolio. It is
well documented² that the returns on these two portfolios are highly correlated,
although in general the weights in these two portfolios will differ. It is now common
to compute the principal components of the correlation matrix rather than the
covariance matrix and we follow this convention.
¹ These papers include Trzcinka (1986), Connor and Korajczyk (1993), Geweke and
Zhou (1996), Jones (2001), and Merville and Xu (2002). See also Avellaneda and Lee
(2010), Laloux et al. (1999), Gopikrishnan et al. (2001), Allez and Bouchaud (2011),
and Pelger (2015).
² See for example Avellaneda and Lee (2010). They show that the returns on the
market factor portfolio are similar to those of the capitalization weighted portfolio.
The market factor portfolio has practical applications in investment and risk
management. It can be used to implement investment strategies. For example,
Avellaneda and Lee (2010) discuss its role and the role of the other factor portfolios
in statistical arbitrage. Alexander and Dimitriu (2004) find that investing in the
market factor portfolio outperforms both equally weighted and value weighted
strategies. Boyle (2014) finds that the market factor portfolio dominates the value
weighted portfolio in terms of Sharpe ratio and that its performance is comparable
with the equally weighted portfolio. Principal components have also been used to
measure systemic risk. Billio, Getmansky, Lo and Pelizzon (2010) and Kritzman,
Li, Page, and Rigobon (2011) describe how the principal components can be used
to monitor systemic risk.
Because the principal components are mutually orthogonal, at most one of
them can have strictly positive components: two strictly positive vectors have a
positive inner product and so cannot be orthogonal. Typically this is the first principal
component corresponding to the dominant eigenvector. However in practice the first
principal component does not always have positive components. Avellaneda and
Lee (2010) observe that empirically based market factor portfolios can sometimes
have a few negative weights.
There are three reasons why market factor portfolios with all positive weights
are of interest. First, many classes of investor are restricted to long only portfolios.
Long only portfolios are simpler to construct in practice since taking short positions
involves institutional constraints. Second, if we want the market factor portfolio
to be used as a proxy³ for the CAPM market portfolio it should certainly have
positive weights. Finally, by knowing why the portfolio does not have positive
weights we may better understand the relevant characteristics of the correlation
matrix. This paper explores conditions under which the market factor portfolio
has positive weights. We establish the connection between the properties of the
correlation matrix and the existence of a market factor portfolio with positive
weights.
There is one situation when the relation is clear cut. If all the correlation
coefficients are positive then all the weights in the market factor portfolio will be
positive. This follows from the classic Perron-Frobenius⁴ result, which states that
if a matrix has all its elements positive, the dominant eigenvector will also have all
its components positive. Hence if the dominant eigenvector has any negative
components then some of the elements in the correlation matrix must be negative. If
the correlation matrix has negative elements we find, based on our empirical work,
that there are two possible outcomes. In some cases the dominant eigenvectors of
such matrices are strictly positive. In other cases, the dominant eigenvectors do
not have this property.
We can distinguish three different classes of correlation matrices.
• Class I. The elements of the correlation matrix are all positive. In this
case all the components of the dominant eigenvector are positive.
• Class II. Some of the elements in the correlation matrix are negative and
all the components of the dominant eigenvector are positive.
• Class III. Some of the elements in the correlation matrix are negative and
the dominant eigenvector has at least one negative component.
³ Boyle (2014) proposes such a model where the market factor portfolio is used as a
candidate for the CAPM market portfolio.
⁴ See Perron (1907) and Frobenius (1912).
We now illustrate these three cases with three correlation matrices. First
consider C1 which has all its correlations positive. Let λ1 denote the vector of
eigenvalues and let v1 be the eigenvector associated with the largest eigenvalue of
λ1. We have
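\[
C_1 = \begin{pmatrix} 1.00 & 0.50 & 0.45 \\ 0.50 & 1.00 & 0.40 \\ 0.45 & 0.40 & 1.00 \end{pmatrix}, \quad
\lambda_1 = \begin{pmatrix} 0.49 \\ 0.61 \\ 1.90 \end{pmatrix}, \quad
v_1 = \begin{pmatrix} 0.60 \\ 0.58 \\ 0.56 \end{pmatrix}
\]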
Since all the eigenvalues are positive, C1 is positive definite and is thus a valid
correlation matrix. All the components of v1 are positive. This is consistent with
the Perron-Frobenius theorem. Hence C1 belongs to Class I.
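Next consider a correlation matrix, C2, with two negative elements.
\[
C_2 = \begin{pmatrix} 1.00 & 0.50 & 0.45 \\ 0.50 & 1.00 & -0.40 \\ 0.45 & -0.40 & 1.00 \end{pmatrix}, \quad
\lambda_2 = \begin{pmatrix} 0.10 \\ 1.39 \\ 1.51 \end{pmatrix}, \quad
v_2 = \begin{pmatrix} 0.78 \\ 0.59 \\ 0.23 \end{pmatrix}
\]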
All the components of the dominant eigenvector of C2 are positive. Hence
C2 generates a market factor portfolio with all its weights positive. This matrix
belongs to Class II.
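The next correlation matrix, C3, looks similar to C2. It also has two negative
elements.
\[
C_3 = \begin{pmatrix} 1.00 & 0.50 & -0.45 \\ 0.50 & 1.00 & 0.40 \\ -0.45 & 0.40 & 1.00 \end{pmatrix}, \quad
\lambda_3 = \begin{pmatrix} 0.10 \\ 1.39 \\ 1.51 \end{pmatrix}, \quad
v_3 = \begin{pmatrix} 0.78 \\ 0.59 \\ -0.23 \end{pmatrix}
\]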
However the dominant eigenvector of C3 has a negative component. In our context
C3 would generate a market factor portfolio with a negative weight on the third
asset. Hence C3 belongs to Class III.
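The three-way classification can be reproduced with a few lines of code. The sketch below is illustrative only: it uses plain Python and a simple power iteration in place of a library eigen-solver, and the function names `dominant_eigenvector` and `classify` are hypothetical, not taken from the paper. Power iteration is adequate here because a correlation matrix is positive semi-definite and the three examples have a unique largest eigenvalue.

```python
import math


def dominant_eigenvector(matrix, iterations=5000):
    """Approximate the dominant eigenvector of a symmetric positive
    semi-definite matrix by power iteration (a stand-in for a library
    eigen-solver)."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The sign of an eigenvector is arbitrary; orient it so that the
    # components sum to a positive number.
    if sum(v) < 0:
        v = [-x for x in v]
    return v


def classify(matrix, tol=1e-9):
    """Return "I", "II" or "III" following the three classes in the text."""
    n = len(matrix)
    has_negative = any(
        matrix[i][j] < 0 for i in range(n) for j in range(n) if i != j
    )
    if not has_negative:
        return "I"  # assumption: a zero correlation is not counted as negative
    v = dominant_eigenvector(matrix)
    return "II" if all(x > tol for x in v) else "III"


C1 = [[1.00, 0.50, 0.45], [0.50, 1.00, 0.40], [0.45, 0.40, 1.00]]
C2 = [[1.00, 0.50, 0.45], [0.50, 1.00, -0.40], [0.45, -0.40, 1.00]]
C3 = [[1.00, 0.50, -0.45], [0.50, 1.00, 0.40], [-0.45, 0.40, 1.00]]

for name, c in (("C1", C1), ("C2", C2), ("C3", C3)):
    print(name, "-> Class", classify(c))
```

Note that C3 is obtained from C2 by flipping the sign of the third asset, which is why the two matrices share the same eigenvalues and their dominant eigenvectors differ only in the sign of the third component.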
In this paper we use empirical data to explore the connection between the
properties of the correlation matrix and the positivity of the dominant eigenvector.
We compute return correlations using stocks from the S&P 1500 between
1990 and 2013. The period is divided into five subperiods. We analyze the frequency,
magnitude and patterns of negative correlations in this data over different
subperiods. We find that the frequency of negative correlations has declined over
time. For each period we construct batches of empirical correlation matrices and
analyze the signs of the dominant eigenvectors.
We find examples of each of the three Classes in the empirical data. The data
contains matrices where all the correlations are positive corresponding to Class
I. It also includes correlation matrices with some negative elements that belong
to Class II. In other words the matrix has some negative correlations and all
the components of the dominant eigenvector are positive. The data also contains
correlation matrices with some negative elements that belong to Class III. For these
matrices there is at least one negative component in the dominant eigenvector. We
find that in the early part of the period Class II and III dominate but that Class I
matrices become more common toward the end of the period. Furthermore if
we just examine matrices with some negative correlations, we find that Class II
matrices become relatively more important than Class III matrices over the 24 year
period. We also document that negative correlations typically line up together in
rows reflecting the fact that certain stocks are negatively correlated with a group
of other stocks.
The main finding is that the frequency, size and pattern of the negative elements
determine whether or not the dominant eigenvector has any negative components.
Broadly speaking the more pervasive the negative correlations the more
likely it is that there will be short positions. We also examined the relationship
between the correlation matrix and the existence of a strictly positive dominant
eigenvector using analytical methods. These results help us understand the intuition
behind some of our empirical findings. For example, if a stock is negatively
correlated with all the other stocks we can prove that it will have a negative weight
in the market factor portfolio.
The rest of the paper is organized as follows. The next section contains our
empirical analyses of stock return correlations. Section III explores the connection
between the properties of the correlation matrix and the signs of the weights in the
dominant eigenvector. Section IV contains our analytical results while the final
section summarizes the paper.
II. Empirical Analysis: Correlations
In this section we describe our empirical analysis of stock return data. We examine
the correlation matrix of stock returns over a recent 24 year period. We provide
details of the distribution of negative correlations in these matrices in terms of
their frequency and also of their patterns. We computed daily returns for the
components of the S&P 1500 index between January 1, 1990 and December 31, 2013.
This index is comprised of 500 US large cap stocks, 400 US mid cap stocks and
600 US small cap stocks.
We divided the time period into five non-overlapping subperiods. Table 1 gives
details of the subperiods and the number of stocks used in each subperiod.
Period   Duration of period         Number of stocks
1        01/01/1990 - 31/12/1993    890
2        01/01/1994 - 31/12/1998    1030
3        01/01/1999 - 31/12/2003    1188
4        01/01/2004 - 31/12/2008    1297
5        01/01/2009 - 31/12/2013    1450
Table 1: Summary of time periods and sample sizes for stock return data
We first examine how the frequency of negative correlations has varied over
time across the five periods. We computed the pairwise correlations of all the
stocks in our sample for each period. For the first subperiod, 1990-1993, we find
that 10.75% of all the correlations are negative. This percentage is computed
as follows. There are 890 stocks in this subperiod, which gives 890 × 890 =
792,100 correlations. Of these, 85,182 were negative, so the percentage of negative
correlations for the 1990-1993 subperiod is 10.75. The results are given in Table 2.
Subperiod   Dates        Percent of negative correlations
                         in the sample for this subperiod
1           1990-1993    10.75
2           1994-1998    4.17
3           1999-2003    3.12
4           2004-2008    0.39
5           2009-2013    0.12
Table 2: Summary of frequency of negative correlations across the five time periods
It is clear that the frequency of negative correlations declined significantly
over the sample period. During 1990-1993 over ten percent of the correlations
were negative whereas during the last subperiod, 2009-2013, the percentage of
negative correlations had shrunk to 0.12.
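The percentage quoted for the first subperiod can be checked directly. The sketch below follows the counting convention stated in the text, in which the denominator is the full set of 890 × 890 entries (so each pair is counted twice and the diagonal is included). The helper `percent_negative` is a hypothetical name introduced for illustration.

```python
def percent_negative(corr):
    """Percentage of negative entries in a square correlation matrix,
    counted over all n x n entries as in the text."""
    n = len(corr)
    negatives = sum(1 for row in corr for x in row if x < 0)
    return 100.0 * negatives / (n * n)


# Figures quoted in the text for the 1990-1993 subperiod.
n_stocks = 890
n_negative = 85_182
total = n_stocks * n_stocks
share = 100.0 * n_negative / total
print(total, round(share, 2))  # 792100 10.75

# The same helper applied to a small symmetric example with one negative pair.
toy = [[1.00, 0.50, 0.45], [0.50, 1.00, -0.40], [0.45, -0.40, 1.00]]
print(round(percent_negative(toy), 2))  # 22.22
```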
We next examine the location of these negative correlations to discern if there is
any structure. For our purposes it is very convenient to use correlation heat maps
to summarize the pattern of a large number of correlation coefficients. In these
heat maps we use different colours to represent the magnitude of the correlations.
Light green represents a correlation of 0.8. As the magnitude of the correlation
decreases the colour becomes progressively darker. If the correlation is negative
it is represented by red. Figures 1 through 5 contain the heat maps for the five
subperiods.
There are two striking features of these heat maps. First the negative correlations
tend to appear in rows (and of course in columns because of symmetry). This
is most evident in Figure 1 where we observe pronounced horizontal and vertical
red lines but the same pattern is observed in all the other heat maps. The second
feature is that the heat maps become progressively lighter over time which means
that correlations have steadily increased over time during this period.
Figure 1: Correlation heatmap of the stock universe for 1990-1993. [Heat map omitted; panel title "Daily Return Correlation sp1500_90_93", colour scale 0.0 to 0.8.]
Figure 2: Correlation heatmap of the stock universe for 1994-1998. [Heat map omitted; panel title "Daily Return Correlation sp1500_94_98", colour scale 0.0 to 0.8.]
Figure 3: Correlation heatmap of the stock universe for 1999-2003. [Heat map omitted; panel title "Daily Return Correlation sp1500_99_03", colour scale 0.0 to 0.8.]
Figure 4: Correlation heatmap of the stock universe for 2004-2008. [Heat map omitted; panel title "Daily Return Correlation sp1500_04_08", colour scale 0.0 to 0.8.]
Figure 5: Correlation heatmap of the stock universe for 2009-2013. [Heat map omitted; panel title "Daily Return Correlation sp1500_09_13", colour scale 0.0 to 0.8.]
The tendency for negative correlations to appear in rows reflects the fact that
certain stocks (for example gold stocks) tend to have negative correlations with a
lot of other stocks. We shall show in the next Section that if a correlation matrix
contains several negative correlations in any of its rows, this generally increases the
likelihood that the matrix will belong to Class III. In other words rows of negative
correlations typically lead to short positions in the market factor portfolio.
III. Characteristics of the correlation matrix
causing short positions
In this section we use our empirical correlation data to compute market factor portfolios.
We investigate the extent to which the market factor portfolios have strictly
positive weights and analyze the circumstances which lead to short positions in the
market factor portfolios. We show that empirical correlation matrices with some
negative elements can sometimes produce market factor portfolios with all long positions.
However empirical correlation matrices with some negative elements can
also produce market factor portfolios with some short positions. We investigate
which features of the correlation matrix lead to this difference. We show that the
extent to which negative correlations occur in rows has an important impact on
the signs of the dominant eigenvector.
In our analysis the market factor portfolio weights are obtained from the eigenvector
associated with the principal eigenvalue of the correlation matrix. If C is the
correlation matrix and λ is the principal eigenvalue with corresponding eigenvector v,
we have
\[ C v = \lambda v. \]
The weights on the market factor portfolio, w, are obtained by normalizing the
vector v so that its components sum to one:
\[ w = \frac{v}{v'e}. \]
Here e denotes the vector of ones.
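As a small worked example of this normalization, the sketch below divides an eigenvector by the sum of its components, where e is taken to be a vector of ones so that v'e is simply that sum. The function name `factor_weights` is hypothetical, and the input vectors are the rounded dominant eigenvectors reported for C2 and C3 in the Introduction.

```python
def factor_weights(v):
    """Rescale an eigenvector so that its components sum to one (w = v / v'e)."""
    total = sum(v)
    if abs(total) < 1e-12:
        raise ValueError("components sum to zero, so the weights are undefined")
    return [x / total for x in v]


v2 = [0.78, 0.59, 0.23]    # rounded dominant eigenvector of C2
v3 = [0.78, 0.59, -0.23]   # rounded dominant eigenvector of C3

w2 = factor_weights(v2)
w3 = factor_weights(v3)
print([round(x, 4) for x in w2])
print([round(x, 4) for x in w3])
```

For C2 all three weights are positive, whereas for C3 the third weight stays negative after normalization, which corresponds to a short position in the third asset.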
To explore the connection between the properties of the input correlation matrix
and the sign of the dominant eigenvector we used the following approach.
For each subperiod, we generated 10,000 random samples of 50 stocks each and
obtained the sample correlation matrix based on these 50 stocks. There are three
reasons for using this approach. First it ensures that all our correlation matrices
are positive⁵ definite. Second this approach enables us to have a much richer spectrum
of input correlation matrices to analyze. Third, the correlations are realistic
in that they come from empirical data.
For each of the 10,000 sample matrices in each subperiod we determine⁶ whether
it belongs to Class I, Class II or Class III. Recall that Class I matrices contain
only positive correlations and so they automatically have a strictly positive dominant
eigenvector. Class II matrices contain some negative elements but also have
a strictly positive dominant eigenvector. Class III matrices contain some negative
elements and do not have a strictly positive dominant eigenvector. The breakdown
of the matrices among these three Classes for our different subperiods is shown in
Figure 6.
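The sampling experiment can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes the full-universe correlation matrix is available and that the correlation matrix of a sample of stocks is the corresponding sub-matrix (which holds when all pairwise correlations are computed over the same observations). The names `submatrix`, `tally_classes` and `has_negative_label` are hypothetical, the 4-stock `toy` universe is made up, and the classifier passed in here only separates Class I from the rest; a classifier based on the dominant eigenvector would be needed to split Class II from Class III.

```python
import random
from collections import Counter


def submatrix(corr, indices):
    """Rows and columns of corr restricted to the chosen stocks."""
    return [[corr[i][j] for j in indices] for i in indices]


def tally_classes(corr, classify, sample_size=50, n_samples=10_000, seed=0):
    """Draw random subsets of stocks and count the class label of each
    resulting correlation sub-matrix."""
    rng = random.Random(seed)
    counts = Counter()
    universe = range(len(corr))
    for _ in range(n_samples):
        picked = sorted(rng.sample(universe, sample_size))
        counts[classify(submatrix(corr, picked))] += 1
    return counts


def has_negative_label(matrix):
    """Stand-in classifier: Class I if no off-diagonal entry is negative."""
    n = len(matrix)
    negative = any(
        matrix[i][j] < 0 for i in range(n) for j in range(n) if i != j
    )
    return "II or III" if negative else "I"


# Made-up universe of four stocks; the last one is negatively correlated
# with all the others.
toy = [
    [1.0, 0.3, 0.2, -0.2],
    [0.3, 1.0, 0.3, -0.1],
    [0.2, 0.3, 1.0, -0.3],
    [-0.2, -0.1, -0.3, 1.0],
]
counts = tally_classes(toy, has_negative_label, sample_size=3, n_samples=200)
print(dict(counts))
```

Passing the classifier as an argument keeps the sampling logic separate from the eigenvector computation, so either part can be swapped out independently.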
Note that the number of Class I matrices is negligible for the first three subperiods.
However they dominate in the last two subperiods. This is what we would
expect from the distribution of negative elements in Table 2. During the first
three periods the majority of the 10,000 matrices have some negative elements.
During the first subperiod these matrices are split almost evenly between Class II
and Class III but as time passes Class II matrices become relatively more
frequent than Class III matrices. Class I and II matrices give rise to market factor
portfolios with positive weights whereas Class III matrices do not. We are interested
in the relevant properties of a correlation matrix with negative elements that
determine whether it is in Class II or Class III.
⁵ This would not be the case if we used the entire population.
⁶ This is an easy operation in matlab using the eig function.
Figure 6: Breakdown of matrices by Class
We compare the distribution of negative correlations in Class II and Class III
matrices. We compute the average number of negative correlations in these two
Classes for our five subperiods. The results are given in Table 3 which shows
that Class III matrices have consistently more negative correlations than Class II
matrices. In the later periods the percentage of negative elements in both Class II
and Class III matrices declines but the relative decline is much greater for Class II
matrices. By comparing Table 3 with Table 2, we see that the average percentage
of negative elements in the Class II matrices is very similar to the frequency of
negative elements in the underlying population. The Class II matrices in Table 3
are average in terms of their share of negative elements while Class III matrices
are exceptional in that they contain more than their share of negative elements.
We also compare the average magnitudes of the negative correlations for
these two Classes. The results are given in Table 4. From this Table we see that
the absolute value of the average of the negative correlations is greater for Class
III matrices than for Class II matrices.
Subperiod   Average percentage of    Average percentage of
            negative correlations    negative correlations
            in Class II matrices     in Class III matrices
1           9.32                     11.75
2           3.73                     5.86
3           2.57                     5.21
4           0.42                     3.13
5           0.22                     2.95
Table 3: Average percentages of negative correlations in Class II and Class III matrices for different subperiods.
Subperiod   Average value of         Average value of
            negative correlations    negative correlations
            in Class II matrices     in Class III matrices
1           -0.019                   -0.021
2           -0.016                   -0.018
3           -0.018                   -0.027
4           -0.018                   -0.044
5           -0.013                   -0.031
Table 4: Average values of negative correlations in Class II and Class III matrices for different subperiods.
In summary Class III correlation matrices tend to have more negative elements
than Class II matrices and also the average absolute size of the negative elements
is larger for Class III matrices. The distributions of the negative elements for the
Class II and Class III matrices can be compared from the plots provided in
Appendix A.
There is another aspect of the negative correlations that is different between
Class II and Class III matrices. This has to do with the location of the negative
correlations in the empirical data. We have seen from the heat maps that negative
correlations are not randomly distributed but that they tend to occur in rows. In
the case of some stocks a high proportion of their correlations with other stocks
are negative. There is a useful theoretical result⁷ which holds in the extreme case
when all the correlations in a given row are negative. If a correlation matrix has
a complete row of negative elements then it must belong to Class III. This last
result predicts that correlation matrices which have a row with a relatively large
number of negative elements will have a greater chance of belonging to Class III.
We now describe our analysis which supports this claim.
⁷ This result is easy to prove and we give the proof in Section IV.
For each subperiod we assign a rank to each stock based on the number of
negative correlations it has with other stocks. For example, during the first subperiod
there are 890 stocks. We counted the number of negative correlations for
each stock and then ranked them from the highest to the lowest. It will be useful
to distinguish stocks with high numbers of negative correlations from other stocks.
We say that a stock has a Majority of Negative Correlations if it is negatively
correlated with more than 50% of the stocks in the sample. For convenience these
stocks are labelled as MNC stocks.
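The ranking and the MNC flag can be sketched in a few lines. The code below is illustrative: the function name `rank_by_negative_count`, the stock labels and the 4-stock matrix are all hypothetical, and it assumes that "more than 50% of the stocks in the sample" is measured against the total number of stocks n (the text does not say whether the stock itself is excluded from the denominator).

```python
def rank_by_negative_count(corr, names, threshold=0.5):
    """Rank stocks by their number of negative correlations (highest first)
    and flag those exceeding the MNC threshold."""
    n = len(corr)
    rows = []
    for i, name in enumerate(names):
        negatives = sum(1 for j in range(n) if j != i and corr[i][j] < 0)
        rows.append((name, negatives))
    rows.sort(key=lambda r: -r[1])  # stable sort: ties keep their input order
    return [
        {"rank": k + 1, "name": name, "negatives": neg, "mnc": neg > threshold * n}
        for k, (name, neg) in enumerate(rows)
    ]


# Made-up example: stock "D" is negatively correlated with every other stock.
names = ["A", "B", "C", "D"]
toy = [
    [1.0, 0.3, 0.2, -0.2],
    [0.3, 1.0, 0.3, -0.1],
    [0.2, 0.3, 1.0, -0.3],
    [-0.2, -0.1, -0.3, 1.0],
]
ranked = rank_by_negative_count(toy, names)
for row in ranked:
    print(row)
```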
In presenting our results it is convenient to consider the first three subperiods
and the last two subperiods separately. Figure 6 shows that the fraction of
matrices with negative elements is very high during the first three subperiods and
drops a lot during the final two subperiods. Over 99.99% of the 10,000 matrices
in the first three subperiods have some negative correlations while during the last
two subperiods the fraction of the 10,000 matrices with negative elements declined
sharply to 40% in the fourth subperiod and 10% in the final subperiod. We will
explain later why this difference has implications for the relative composition of
Class II and Class III matrices.
We will present the results for the first subperiod in detail and summarize the
results for the second and third subperiods in Appendix B since they are quite
similar. In the same way we just discuss the detailed results for the final subperiod
and provide the summary for the fourth subperiod in Appendix B.
Table 5 gives the top ten ranked stocks for the 1990-1993 subperiod. We
also indicate if a stock has the MNC property. This table shows that the top
ten ranked stocks have a substantial number of negative correlations. During the
1990-1993 period there are ten stocks with the MNC property. We now examine the
distribution of the most common stocks in the Class II and Class III matrices.
We will show that the MNC stocks are very significantly overrepresented among
the Class III matrices.
Subperiod 1: 1990-1993
Company name                                 Number of negative correlations   Rank
ROYAL GOLD INC (MNC)                         613                               1
NEWMONT MINING CORP (MNC)                    577                               2
FIGGIE INTERNATIONAL INC DEL (MNC)           500                               3
STAR CLASSICS INC (MNC)                      497                               4
CEDAR INCOME FUND LTD (MNC)                  492                               5
SANMARK STARDUST INC (MNC)                   482                               6
GOLDEN CORRAL RLTY CORP (MNC)                457                               7
INTERNATIONAL POWER MACHS CORP (MNC)         451                               8
ALTA ENERGY CORP (MNC)                       449                               9
BANCORP OF MISSISSIPPI INC (MNC)             447                               10
Table 5: Top ten ranked stocks from the stock universe for the 1990-1993 period
Table 6 shows the most commonly occurring stocks among Class II and Class
III matrices for 1990-1993. We discuss the Class III matrices first. There are 5,216
Class III matrices. The most commonly occurring stocks among the 5,216 Class III
matrices are the MNC stocks, which are the stocks with the most negative correlations.
Recall that over 50% of their correlations with other stocks are negative.
Nine of the ten most common stocks in the Class III category are MNC stocks. The
tenth stock, Telecom Corp, is not an MNC stock. However it has 49.76% of its
correlations negative so it just misses the MNC cutoff point. There is a very significant
direct relation between the most common stocks in the Class III category
and the MNC property.
For this subperiod there are 4,784 Class II matrices out of the total 10,000
matrices. The stock which appeared most often in these Class II matrices was
Lance Inc and it appeared 317 times. The next most common stock was Cullen
Frost Bankers Inc which appeared 316 times. Lance Inc has a rank of 412 in
terms of the number of stocks it is negatively correlated with. It has a negative
correlation with 62 out of the 890 stocks. Similarly Cullen Frost Bankers, the second
most common stock in the Class II matrices, has a rank of 337. It has a negative
correlation with 73 out of the 890 stocks. The point we want to emphasize is that
the most commonly occurring stocks in the Class II matrices are very average in
terms of their ranking. The average of the rankings in column three of the first
panel is 477. This stems from the fact that negative correlations are relatively
common in this subperiod.
Class II; 1990-1993 (total count: 4784)
Company name                    Number of appearances   Rank in the universe
LANCE INC                       317                     412
CULLEN FROST BANKERS INC        316                     337
BEARINGS INC                    316                     344
AVNET INC                       314                     860
CINTAS CORP                     313                     712
CINCINNATI BELL INC             313                     624
PIER 1 IMPORTS INC DE           313                     456
TIMKEN COMPANY                  312                     659
BLESSINGS CORP                  311                     338
ATLAS CONSOLIDATED              310                     31

Class III; 1990-1993 (total count: 5216)
Company name                    Number of appearances   Rank in the universe
FIGGIE INTERNATIONAL INC        574                     3 (MNC)
SANMARK STARDUST INC            574                     6 (MNC)
ROYAL GOLD INC                  574                     1 (MNC)
NEWMONT MINING CORP             545                     2 (MNC)
BANCORP OF MISSISSIPPI INC      543                     10 (MNC)
STAR CLASSICS INC               535                     4 (MNC)
CEDAR INCOME FUND LTD           490                     5 (MNC)
INTERNATIONAL POWER MACHS       487                     8 (MNC)
GOLDEN CORRAL RLTY CORP         474                     7 (MNC)
TELECOM CORP                    443                     12
Table 6: Top ten most frequently occurring stocks among Class II and Class III matrices for the 1990-1993 period
Subperiod 5: 2009-2013
Company name                                 Number of negative correlations   Rank
FLANIGANS ENTERPRISES INC (MNC)              1083                              1
SUNLINK HEALTH SYSTEMS INC                   137                               2
FORT DEARBORN INCOME SECS INC                17                                3
CANTERBURY PARK HOLDING CORP                 11                                4
HOMEOWNERS CHOICE INC                        9                                 5
SUPREME INDUSTRIES INC                       7                                 6
CORE MOLDING TECHNOLOGIES INC                4                                 7
MAGELLAN PETROLEUM CORP                      4                                 8
PEOPLES UNITED FINANCIAL INC                 3                                 9
INVACARE CORP                                3                                 10
Table 7: Top ten ranked stocks from the stock universe for the 2009-2013 period
We now conduct a similar analysis for the last period. Table 7 shows the
top ranked stocks in the period 2009-2013 in terms of their negative correlations.
There is just one MNC stock: Flanigan’s Enterprises Inc⁸, which has by far the
most negative correlations with other stocks. It has a negative correlation with
about 75% of the 1450 stocks during this time period. We confirmed that it has a
negative beta.
We next examine the composition of the 360 Class III and the 755 Class II
matrices for this period (2009-2013). Consider first the Class III matrices. We see
that Flanigan’s Enterprises belongs to every one of the Class III matrices. This is
the defining characteristic of these Class III matrices. Note that the other stocks occur
much less frequently, since it is the presence of Flanigan’s Enterprises with its row of
negative correlations that is responsible for the Class III result. The ranking of
the other stocks does not matter if the matrix contains Flanigan’s Enterprises.
⁸ Flanigan’s Enterprises operates a chain of restaurants and liquor stores in South
Florida.
The composition of the Class II matrices for 2009-2013 differs sharply from
what we found for the 1990-1993 period. For the earlier period we noted that the
most common stocks in the Class II matrices were average stocks. However the
top panel of Table 8 shows that now the most common stocks are mostly the highest
ranked stocks, with the exception of Flanigan’s Enterprises. We can explain this
difference as follows. In the later period there is a scarcity of negative correlations
and so if a matrix contains a higher ranked stock it is more likely to have negative
elements, which places it in Class II or III. However for this period a high ranked
stock on its own is not enough to tip the matrix into Class III unless that stock
is Flanigan’s Enterprises. That is why we now see more higher ranked stocks in
Class II matrices.
Earlier we showed that Class III matrices tended to have a higher number
of negative correlations. The last result shows that the pattern of the negative
elements also matters. We just showed that individual stocks with a large number
of negative correlations with other stocks tended to occur more frequently in Class
III matrices than other stocks. Hence it is not only the number and magnitude
of negative correlations but also their pattern that influences the positivity of the
dominant eigenvector. The pattern of the negative correlations is a critical factor
in our context and we discuss it further in the next section.
Class II; 2009-2013 (counts: 755)
Name                             Number of appearances   Rank in the universe
SUNLINK HEALTH SYSTEMS INC       339                     2
FORT DEARBORN INCOME SECS INC    158                     3
CANTERBURY PARK HOLDING CORP     119                     4
SUPREME INDUSTRIES INC           94                      6
HOMEOWNERS CHOICE INC            91                      5
CORE MOLDING TECHNOLOGIES INC    52                      7
COLUMBIA LABORATORIES INC        50                      186
DIAMOND FOODS INC                45                      13
PEOPLES UNITED FINANCIAL INC     44                      9
CAREER EDUCATION CORP            42                      12

Class III; 2009-2013 (counts: 360)
Name                             Number of appearances   Rank in the universe
FLANIGANS ENTERPRISES INC        360                     1 (MNC)
BLOCK H & R INC                  25                      406
MEDNAX INC                       25                      91
HILL ROM HOLDINGS INC            24                      50
D T E ENERGY CO                  22                      1146
SUPERVALU INC                    22                      377
FORT DEARBORN INCOME SECS INC    22                      3
M S C INDUSTRIAL DIRECT INC      22                      792
WORLD FUEL SERVICES CORP         21                      1270
TYSON FOODS INC                  21                      652
Table 8: Top ten most frequently occurring stocks among Class II and Class III matrices for the 2009-2013 period
IV. Analytical Results
There are two parts in this section. First we review some general results that are
germane to our analysis. Second we derive some specific analytical results that
are better tailored to our applications. The general results provide conditions for
a symmetric matrix to have a positive dominant eigenvector. These conditions
are formulated in terms of expressions that involve functions of the matrix. Our
specific results are valid for correlation matrices and are expressed in terms of the
individual correlations.
We start by recalling the basic Perron-Frobenius theorem which states that if
a square matrix has all positive entries, its dominant eigenvector will also have
all positive entries. Tarazaga et al. (2001) obtained an alternative condition for
a symmetric matrix to have a positive dominant eigenvector. Their condition is
satisfied for some matrices that have negative elements. Assume C is an n × n
symmetric matrix. They show that if
\[ e'Ce \ge \sqrt{(n-1)^2 + 1}\,\sqrt{\mathrm{Trace}(C'C)} \qquad (1) \]
then C has a nonnegative dominant eigenvector.
This is just a sufficient condition. There are matrices which do not satisfy this
condition and yet have a strictly positive dominant eigenvector. We checked this
condition for all the 4,784 Class II matrices in our first subperiod. The results
were striking. None of these matrices satisfied condition (1) even
though every one of them has a strictly positive dominant eigenvector. We
are forced to conclude that this test has no diagnostic value in finance applications
like ours.
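Condition (1) is straightforward to evaluate, since e'Ce is the sum of all entries of C and the square root of Trace(C'C) is the Frobenius norm of C. The sketch below, with the hypothetical function name `tarazaga_condition`, applies it to the small matrices C1 and C2 from the Introduction. It illustrates on a 3 × 3 example how a matrix with a strictly positive dominant eigenvector (C2) can still fail the test.

```python
import math


def tarazaga_condition(c):
    """Evaluate condition (1): e'Ce >= sqrt((n-1)^2 + 1) * sqrt(Trace(C'C)).

    Returns (holds, lhs, rhs)."""
    n = len(c)
    lhs = sum(sum(row) for row in c)                             # e'Ce
    frobenius = math.sqrt(sum(x * x for row in c for x in row))  # sqrt(Trace(C'C))
    rhs = math.sqrt((n - 1) ** 2 + 1) * frobenius
    return lhs >= rhs, lhs, rhs


C1 = [[1.00, 0.50, 0.45], [0.50, 1.00, 0.40], [0.45, 0.40, 1.00]]
C2 = [[1.00, 0.50, 0.45], [0.50, 1.00, -0.40], [0.45, -0.40, 1.00]]

for name, c in (("C1", C1), ("C2", C2)):
    holds, lhs, rhs = tarazaga_condition(c)
    print(name, holds, round(lhs, 3), round(rhs, 3))
```

For C1 the left hand side is 5.7 and the right hand side is about 4.596, so the condition holds. For C2 the left hand side drops to 4.1 while the right hand side is unchanged, so the condition fails even though C2 belongs to Class II.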
Noutsos (2006) provides a more complete characterization. He showed that a
given symmetric matrix C has a strictly positive dominant eigenvector if and only
if there exists a positive integer k_0 such that C^k has all its entries positive for all
k ≥ k_0. To see this, recall that C and C^k have the same eigenvectors and apply
the Perron-Frobenius result to C^k.
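A rough numerical version of this test is to form successive powers of C and stop at the first one whose entries are all positive. The sketch below does this in plain Python. The function names are hypothetical, the search is truncated at an arbitrary `max_power`, and it only finds the first all-positive power rather than verifying positivity for every larger k, so it should be read as a heuristic illustration of the criterion and not as a proof procedure.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def first_positive_power(c, max_power=100):
    """Smallest k <= max_power such that every entry of C^k is positive,
    or None if no such power is found."""
    power = [row[:] for row in c]
    for k in range(1, max_power + 1):
        if all(x > 0 for row in power for x in row):
            return k
        power = matmul(power, c)
        # Rescale to avoid overflow; this does not change any signs.
        scale = max(abs(x) for row in power for x in row)
        power = [[x / scale for x in row] for row in power]
    return None


C1 = [[1.00, 0.50, 0.45], [0.50, 1.00, 0.40], [0.45, 0.40, 1.00]]
C2 = [[1.00, 0.50, 0.45], [0.50, 1.00, -0.40], [0.45, -0.40, 1.00]]
C3 = [[1.00, 0.50, -0.45], [0.50, 1.00, 0.40], [-0.45, 0.40, 1.00]]

for name, c in (("C1", C1), ("C2", C2), ("C3", C3)):
    print(name, first_positive_power(c))
```

On the three example matrices from the Introduction, C1 is already positive at k = 1, C2 becomes positive only after several multiplications, and no positive power is found for C3 within the search limit, in line with their Class I, II and III labels.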
In stark contrast to the previous case, our empirical results are perfectly con-
sistent with this test. Each of the 4,784 Class II matrices from the first subperiod
contains some negative elements and each matrix has a strictly positive dominant
eigenvector. In each case there is an integer k such that C^k contains only positive
elements. The values of k_0 ranged from 2 to 12 with a mode of 4. The Noutsos test
discriminates perfectly between Class II and Class III matrices. However, since
the test is framed in terms of the matrix powers it is not easy to relate this test
to the frequency and severity of negative correlations in the underlying matrix.
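The matrix-power test can be sketched in a few lines of Python. The function below searches for the first power k at which every entry of C^k is positive. It is only an illustration under stated assumptions: the name first_positive_power and the cap max_k are hypothetical, and finding one such k is weaker than the criterion above, which requires positivity for all k ≥ k_0. The matrices A and B are those of the numerical example later in this section.

```python
import numpy as np


def first_positive_power(C, max_k=200):
    """Smallest k <= max_k such that every entry of C^k is positive, else None.

    C is divided by its spectral radius before taking powers. This keeps the
    entries well scaled and does not change the sign pattern of C^k.
    Assumes C is symmetric and not the zero matrix.
    """
    C = np.asarray(C, dtype=float)
    radius = np.max(np.abs(np.linalg.eigvalsh(C)))
    M = C / radius
    P = np.eye(C.shape[0])
    for k in range(1, max_k + 1):
        P = P @ M  # P now equals (C / radius)^k
        if np.all(P > 0):
            return k
    return None


A = np.array([
    [1.00, -0.04, -0.23, -0.41],
    [-0.04, 1.00, 0.80, 0.38],
    [-0.23, 0.80, 1.00, 0.23],
    [-0.41, 0.38, 0.23, 1.00],
])
B = np.array([
    [1.00, 0.38, -0.23, -0.41],
    [0.38, 1.00, 0.80, -0.04],
    [-0.23, 0.80, 1.00, 0.23],
    [-0.41, -0.04, 0.23, 1.00],
])

print("A:", first_positive_power(A))  # None means no positive power was found
print("B:", first_positive_power(B))
```

The rescaling by the spectral radius is a design choice made here for numerical convenience; it is not part of the original test.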
Our next two propositions take some modest steps in this direction. The first
deals with the case when there is a complete row of negative correlations and the
second deals with the special case when all the correlations are equal.
Proposition One. If a correlation matrix has a complete row of negative
correlations then its dominant eigenvector must have at least one negative
component.
We prove this result by contradiction. Assume C has a complete row, say row
i, of negative off-diagonal elements and its dominant eigenvector v is strictly positive. We have
$$Cv = \lambda v$$
where λ is the largest eigenvalue. For row i this implies
$$v_i + \sum_{j \neq i} \rho_{ij} v_j = \lambda v_i$$
This last equation can be written as
$$(\lambda - 1) v_i = \sum_{j \neq i} \rho_{ij} v_j$$
The eigenvalues of an n × n correlation matrix sum to n and C is not the identity
matrix, so λ > 1. Since v_i is positive, the left hand side is positive. However, since all the
components of v are assumed positive and all the ρ_ij in row i are negative, the right hand
side is negative. This contradiction proves the result.
We illustrate this Proposition with a numerical example. Consider the two
correlation matrices A and B where
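$$A = \begin{pmatrix}
1 & -0.04 & -0.23 & -0.41 \\
-0.04 & 1 & 0.80 & 0.38 \\
-0.23 & 0.80 & 1 & 0.23 \\
-0.41 & 0.38 & 0.23 & 1
\end{pmatrix},$$
$$B = \begin{pmatrix}
1.00 & 0.38 & -0.23 & -0.41 \\
0.38 & 1.00 & 0.80 & -0.04 \\
-0.23 & 0.80 & 1.00 & 0.23 \\
-0.41 & -0.04 & 0.23 & 1.00
\end{pmatrix}.$$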
They both have the same correlation elements and differ only in the position
of these elements. The dominant eigenvector of A,
[−0.31, 0.60, 0.60, 0.45]′
contains one negative component.
On the other hand, the dominant eigenvector of B,
[0.06, 0.70, 0.70, 0.13]′,
is strictly positive.
Hence matrix A belongs to Class III while matrix B, with exactly the same
correlation elements, belongs to Class II. The key difference is that matrix A has a
complete row of negative correlations, whereas matrix B does not.
We now turn to Proposition Two.
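The numerical example can be reproduced with a short Python sketch. The helper names dominant_eigenvector and has_complete_negative_row are illustrative and not part of the paper. Because an eigenvector is only defined up to sign, the sketch assumes the convention that the components sum to a nonnegative number, which matches the sign used for the vectors quoted above.

```python
import numpy as np


def dominant_eigenvector(C):
    """Unit-norm eigenvector for the largest eigenvalue of a symmetric matrix.

    The sign is chosen so that the components sum to a nonnegative number.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(np.asarray(C, dtype=float))
    v = eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order
    return v if v.sum() >= 0 else -v


def has_complete_negative_row(C):
    """True if some row has all of its off-diagonal entries negative."""
    C = np.asarray(C, dtype=float)
    return any(np.all(np.delete(C[i], i) < 0) for i in range(C.shape[0]))


A = np.array([
    [1.00, -0.04, -0.23, -0.41],
    [-0.04, 1.00, 0.80, 0.38],
    [-0.23, 0.80, 1.00, 0.23],
    [-0.41, 0.38, 0.23, 1.00],
])
B = np.array([
    [1.00, 0.38, -0.23, -0.41],
    [0.38, 1.00, 0.80, -0.04],
    [-0.23, 0.80, 1.00, 0.23],
    [-0.41, -0.04, 0.23, 1.00],
])

for name, C in (("A", A), ("B", B)):
    v = dominant_eigenvector(C)
    print(name, np.round(v, 2),
          "| complete negative row:", has_complete_negative_row(C),
          "| strictly positive:", bool(np.all(v > 0)))
```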
Proposition Two. If all the correlations are equal to ρ, say, then the matrix
has a strictly positive dominant eigenvector if and only if ρ > 0.
This result follows directly from Proposition One and the Perron-Frobenius Theorem.
Equi-correlation matrices play an important role in the shrinkage estimate
of Ledoit and Wolf (2004). These authors compute the shrinkage matrix C_shrink
as
$$C_{shrink} = \delta\, C_{historical} + (1 - \delta)\, C_{equicorr}$$
where
• C_historical is the correlation matrix estimated from the data
• C_equicorr is the corresponding equi-correlation matrix, whose common correlation ρ is
the average of the correlations in C_historical. For stock returns ρ > 0.
• δ is the weighting parameter which can be optimally estimated.
Boyle (2014) shows that C_shrink typically has a strictly positive dominant eigenvector even
when C_historical does not. Hence using a shrinkage estimate for the correlation
matrix provides a practical method of obtaining a strictly positive dominant
eigenvector and hence a positively weighted market factor portfolio.
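The shrinkage formula and Proposition Two can be illustrated with a small Python sketch. It is a hedged example only: the function names are hypothetical, δ is simply swept over a grid rather than optimally estimated, and the matrix A from Proposition One's numerical example stands in for a historical correlation matrix.

```python
import numpy as np


def equicorrelation(n, rho):
    """n x n correlation matrix with every off-diagonal entry equal to rho."""
    return (1 - rho) * np.eye(n) + rho * np.ones((n, n))


def shrink_to_equicorrelation(C_hist, delta):
    """delta * C_hist + (1 - delta) * C_equicorr, where the common correlation
    of C_equicorr is the average off-diagonal entry of C_hist."""
    C_hist = np.asarray(C_hist, dtype=float)
    n = C_hist.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    rho_bar = C_hist[off_diagonal].mean()
    return delta * C_hist + (1 - delta) * equicorrelation(n, rho_bar)


def has_positive_dominant_eigenvector(C):
    """True if the eigenvector of the largest eigenvalue has one strict sign."""
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    v = eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order
    return bool(np.all(v > 0) or np.all(v < 0))


# Matrix A from the numerical example (one complete row of negative entries)
A = np.array([
    [1.00, -0.04, -0.23, -0.41],
    [-0.04, 1.00, 0.80, 0.38],
    [-0.23, 0.80, 1.00, 0.23],
    [-0.41, 0.38, 0.23, 1.00],
])

# Proposition Two: positive versus negative common correlation
print(has_positive_dominant_eigenvector(equicorrelation(4, 0.3)))
print(has_positive_dominant_eigenvector(equicorrelation(4, -0.2)))

# Sweep the weight on the historical matrix from 0 (pure target) to 1 (no shrinkage)
for delta in np.linspace(0.0, 1.0, 11):
    C_shrink = shrink_to_equicorrelation(A, delta)
    print(round(float(delta), 1), has_positive_dominant_eigenvector(C_shrink))
```

With δ = 0 the shrunk matrix is the equi-correlation target, whose dominant eigenvector is proportional to the vector of ones when ρ > 0; with δ = 1 it is A itself. The sweep shows where, between these two ends, positivity is lost for this particular matrix.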
V. Summary
This paper has examined the conditions under which the first principal component
(or market factor portfolio) of stock returns has positive weights. The first princi-
pal component corresponds to the dominant eigenvector of the correlation matrix.
We used empirical returns over the period 1990-2013 to study this question. If all
the input correlations are positive, then the positivity of the dominant eigenvector
is guaranteed by the Perron-Frobenius theorem. However, correlation matrices of stock
returns generally contain some negative entries. If the negative correlations are relatively
insignificant in terms of their number and magnitude it is likely that the matrix will have
a positive dominant eigenvector. Roughly speaking, the dominant eigenvector can
still have all its components positive as long as the negative elements are not too
pervasive.
Our results show that short positions in the market factor portfolio become
more likely as the frequency and the absolute magnitude of negative correlations
increase. The location of the negative correlations is important as well. We saw
that negative correlations tend to occur in rows. Certain stocks, such as mining
stocks, tend to be negatively correlated with several other stocks. Proposition One
shows that in the limit when a stock has all its correlations negative, the dominant
eigenvector will have a negative component. Our empirical findings are consistent
with this result. In summary, whether or not the first principal component of stock
returns has any short positions depends on the number, magnitude and position
of the negative elements in the correlation matrix.
This paper has presented results based on daily returns. In results not re-
ported here we investigated the effects of changing the return sampling frequency
from daily to weekly. We find that there is a decrease in the number of negative
correlations in the data. The heat maps for correlations based on weekly returns
show that negative correlations still tend to cluster in rows and that the main
conclusions of the paper still hold.
References
[1] Alexander C., and Dimitriu A., “Sources of Outperformance in Equity Markets”, The Journal of Portfolio Management, 30, 4 (2004), 170-185.
[2] Allez R., and Bouchaud J.P., “Individual and Collective Stock Dynamics: Intra-day Seasonalities”, New J. Phys., 13 (2011), 025010.
[3] Avellaneda M., and Lee J.H., “Statistical Arbitrage in the US Equities Market”, Quantitative Finance, 10 (2010), 1-22.
[4] Billio M., Getmansky M., Lo A. W., and Pelizzon L., “Econometric Measures of Systemic Risk in the Finance and Insurance Sectors”, National Bureau of Economic Research, (2010), Working Paper 16223.
[5] Boyle P., “Positive Weights on the Efficient Frontier”, North American Actuarial Journal, 18, 4 (2014), 462-477.
[6] Connor G., and Korajczyk R.A., “A Test for the Number of Factors in an Approximate Factor Model”, Journal of Finance, 48, 4 (1993), 1263-1291.
[7] Frobenius G. F., “Über Matrizen aus nicht negativen Elementen”, S.-B. Preuss. Akad. Wiss. (Berlin), (1912), 456-477.
[8] Geweke J., and Zhou G., “Measuring the Pricing Error of the Arbitrage Pricing Theory”, Review of Financial Studies, 9 (1996), 557-587.
[9] Gopikrishnan P., Rosenow B., Plerou V., and Stanley H.E., “Quantifying and Interpreting Collective Behavior in Financial Markets”, Physical Review E, 64 (2001), 035106.
[10] Jones C.S., “Extracting Factors from Heteroskedastic Asset Returns”, Journal of Financial Economics, 62 (2001), 293-325.
[11] Kritzman M., Li Y., Page S., and Rigobon R., “Principal Components as a Measure of Systemic Risk”, The Journal of Portfolio Management, 37, 4 (2011), 112-126.
[12] Laloux L., Cizeau P., and Bouchaud J.-P., “Noise Dressing of Financial Correlation Matrices”, Physical Review Letters, 83, 7 (1999), 1467-1469.
[13] Merville L. J., and Xu Y., “The Changing Factor Structure of Equity Returns”, The Journal of Portfolio Management, Summer (2001), 51-61.
[14] Noutsos D., “On Perron-Frobenius Properties of Matrices Having Some Negative Entries”, Linear Algebra and Its Applications, 412 (2006), 132-153.
[15] Pelger M., “Large-Dimensional Factor Modeling Based on High-Frequency Observations”, (2015), Working Paper, University of California, Berkeley.
[16] Perron O., “Zur Theorie der Matrices”, Mathematische Annalen, 64 (1907), 248-263.
[17] Ross S.A., “The Arbitrage Theory of Capital Asset Pricing”, Journal of Economic Theory, 13, 3 (1976), 341-360.
[18] Sharpe W. F., “A Simplified Model for Portfolio Analysis”, Management Science, 9, 2 (1963), 277-293.
[19] Tarazaga P., Raydan M., and Hurman A., “Perron-Frobenius Theorem for Matrices With Some Negative Entries”, Linear Algebra and Its Applications, 328 (2001), 57-68.
[20] Trzcinka C., “On the Number of Factors in the Arbitrage Pricing Model”, Journal of Finance, 41, 2 (1986), 347-368.
A.
This Appendix compares the empirical distribution of the negative elements of
Class II and Class III matrices for the five subperiods.
Figure 7: Empirical distribution of negative correlations among Class II and Class III matrices for the 1990-1993 period
[Histogram not reproduced. Horizontal axis: size, from -0.16 to 0. Vertical axis: frequency, from 0 to 0.2. Series: Class III and Class II.]
Figure 8: Empirical distribution of negative correlations among Class II and Class III matrices for the 1994-1998 period
[Histogram not reproduced. Horizontal axis: size, from -0.16 to 0. Vertical axis: frequency, from 0 to 0.25. Series: Class III and Class II.]
Figure 9: Empirical distribution of negative correlations among Class II and Class III matrices for the 1999-2003 period
[Histogram not reproduced. Horizontal axis: size, from -0.16 to 0. Vertical axis: frequency, from 0 to 0.25. Series: Class III and Class II.]
Figure 10: Empirical distribution of negative correlations among Class II and Class III matrices for the 2004-2008 period
[Histogram not reproduced. Horizontal axis: size, from -0.16 to 0. Vertical axis: frequency, from 0 to 0.25. Series: Class III and Class II.]
Figure 11: Empirical distribution of negative correlations among Class II and Class III matrices for the 2009-2013 period
[Histogram not reproduced. Horizontal axis: size, from -0.16 to 0. Vertical axis: frequency, from 0 to 0.3. Series: Class III and Class II.]
B.
This Appendix extends the analysis of the pattern of negative elements in
Class II and Class III correlation matrices to the other subperiods.
In Section III we analyzed the first (1990-1993) and last (2009-2013) subperiods.
The results for 1994-1998 and 1999-2003 are very similar to those we obtained in
Section III for 1990-1993. This can be confirmed by comparing Table 5 in Section
III to Table 12 and Table 13 below. The results for 2004-2008 are very similar to
those we obtained in Section III for 2009-2013. This can be verified by comparing
Table 8 in Section III to Table 14 below.
Subperiod 2: 1994-1998
Company name                        Number of negative correlations   Rank
COLONIAL COMMERCIAL CORP (MNC)      609                               1
HELMSTAR GROUP INC (MNC)            541                               2
LUXTEC CORP (MNC)                   527                               3
CEDAR INCOME FUND LTD               514                               4
BIO REFERENCE LABORATORIES INC      513                               5
POWELL INDUSTRIES INC               467                               6
HEALTHCARE IMAGING SERVICES INC     443                               7
SAHARA GAMING CORP                  398                               8
HEIST C H CORP                      393                               9
CARETENDERS HEALTH CORP             379                               10
Table 9: Top ten ranked stocks from the stock universe for the 1994-1998 period
Subperiod 3: 1999-2003
Company name                        Number of negative correlations   Rank
ROYAL GOLD INC (MNC)                1019                              1
NEWMONT MINING CORP (MNC)           914                               2
MORGANS FOODS INC (MNC)             645                               3
DARLING INTERNATIONAL INC (MNC)     610                               4
MILESTONE SCIENTIFIC INC (MNC)      599                               5
FORT DEARBORN INCOME SECS INC       543                               6
SENECA FOODS CORP NEW               506                               7
DELPHI INFORMATION SYSTEMS INC      489                               8
INTEVAC INC                         481                               9
AMERICAN SCIENCE & ENGR INC         465                               10
Table 10: Top ten ranked stocks from the stock universe for the 1999-2003 period
Subperiod 4: 2004-2008
Company name                        Number of negative correlations   Rank
FLANIGANS ENTERPRISES INC (MNC)     1031                              1
MOVIE STAR INC N Y (MNC)            910                               2
CONVERA CORP                        321                               3
CANTERBURY PARK HOLDING CORP        192                               4
STEPHAN CO                          191                               5
SUNLINK HEALTH SYSTEMS INC          176                               6
WINLAND ELECTRONICS INC             129                               7
CORE MOLDING TECHNOLOGIES INC       66                                8
HEALTHSTREAM INC                    59                                9
SENECA FOODS CORP NEW               54                                10
Table 11: Top ten ranked stocks from the stock universe for the 2004-2008 period
Class II; 1994-1998 (counts: 8440)
Name                                Number of appearances   Rank in the universe
F M C CORP                          475                     903
ON ASSIGNMENT INC                   470                     222
COORS ADOLPH CO                     467                     103
COMMERCE BANCSHARES INC             459                     570
CLEAN HARBORS INC                   458                     34
AVON PRODUCTS INC                   458                     738
TYSON FOODS INC                     458                     381
NATIONAL HEALTH LABORATORIES INC    456                     229
VARIAN ASSOCIATES INC               456                     941
ALLTRISTA CORP                      456                     350

Class III; 1994-1998 (counts: 1560)
Name                                Number of appearances   Rank in the universe
COLONIAL COMMERCIAL CORP            442                     1 (MNC)
HELMSTAR GROUP INC                  342                     2 (MNC)
BIO REFERENCE LABORATORIES INC      316                     5
LUXTEC CORP                         315                     3 (MNC)
POWELL INDUSTRIES INC               186                     6
CEDAR INCOME FUND LTD               169                     4
DELTA AIR LINES INC                 100                     1014
MICRON TECHNOLOGY INC               100                     408
QUIKSILVER INC                      100                     881
BERKLEY W R CORP                    99                      373
Table 12: Top ten most frequently occurring stocks among Class II and Class III matrices for the 1994-1998 period
Class II; 1999-2003 (counts: 8117)
Name                                Number of appearances   Rank in the universe
SUPERIOR INDUSTRIES INTL INC        404                     1086
MERCURY COMPUTER SYSTEMS            400                     185
BIO RAD LABORATORIES INC            399                     803
P E C O ENERGY CO                   398                     85
SILGAN HOLDINGS INC                 398                     179
S V I HOLDINGS INC                  397                     35
SCANSOURCE INC                      396                     1062
CABLEVISION SYSTEMS CORP            395                     1096
PARKWAY PROPERTIES INC              394                     726
INTERFACE INC                       393                     921

Class III; 1999-2003 (counts: 1783)
Name                                Number of appearances   Rank in the universe
NEWMONT MINING CORP                 445                     2 (MNC)
ROYAL GOLD INC                      422                     1 (MNC)
MORGANS FOODS INC                   358                     3 (MNC)
DARLING INTERNATIONAL INC           304                     4 (MNC)
MILESTONE SCIENTIFIC INC            259                     5 (MNC)
FORT DEARBORN INCOME SECS INC       160                     6
AMERICAN SCIENCE & ENGR INC         148                     10
DELPHI INFORMATION SYSTEMS INC      116                     8
SENECA FOODS CORP NEW               112                     7
INTEVAC INC                         111                     9
Table 13: Top ten most frequently occurring stocks among Class II and Class III matrices for the 1999-2003 period
Class II; 2004-2008 (counts: 3368)
Name                                Number of appearances   Rank in the universe
CANTERBURY PARK HOLDING CORP        371                     4
CONVERA CORP                        355                     3
WINLAND ELECTRONICS INC             340                     7
HEALTHSTREAM INC                    337                     9
MAXXAM INC                          328                     11
STEPHAN CO                          324                     5
SUNLINK HEALTH SYSTEMS INC          322                     6
CORE MOLDING TECHNOLOGIES INC       319                     8
SENECA FOODS CORP NEW               316                     10
CORTEX PHARMACEUTICALS INC          315                     13

Class III; 2004-2008 (counts: 784)
Name                                Number of appearances   Rank in the universe
MOVIE STAR INC N Y                  417                     2 (MNC)
FLANIGANS ENTERPRISES INC           383                     1 (MNC)
MOLINA HEALTHCARE INC               50                      49
AUTONATION INC DEL                  46                      853
P C TEL INC                         45                      1012
CAL MAINE FOODS INC                 44                      299
BLOCK H & R INC                     44                      367
REX STORES CORP                     44                      68
JARDEN CORP                         44                      894
CLARCOR INC                         43                      678
Table 14: Top ten most frequently occurring stocks among Class II and Class III matrices for the 2004-2008 period