
When do portfolios based on the first principal component have short positions?

Abstract

The first principal component of stock returns is often identified with the mar-

ket factor. Empirical portfolios based on this principal component sometimes

contain short positions. We analyze how the stock return correlations affect the

positivity of the dominant principal component. If all the correlations are positive

this portfolio has positive weights. In practice some of the correlations can be

negative and in this case the weights on the first principal component may or may

not contain negative values. We determine the characteristics of the correlation matrix

that lead to negative weights in the first principal component.


I. Introduction

The intuition that security returns can be explained by a set of common factors

is well entrenched in the finance literature. Sharpe (1963) assumed that a single

factor could explain the systemic component of returns. Ross (1976) assumed that

returns were generated by a set of common factors. These factors are derived from

the covariance matrix where the most important factors are associated with the

largest eigenvalues. The single most influential factor corresponds to the largest or

dominant eigenvalue. The factor loadings are obtained from the associated eigen-

vector. This factor represents the linear combination of securities that explains the

largest fraction of the total variance. Typically it is identified as the market factor.

Several¹ papers discuss the connection between the first principal component of

stock returns and the market portfolio.

The market portfolio corresponds to the value weighted portfolio where the

weight of each stock is proportional to its market capitalization. In this case all

the portfolio weights are positive. To avoid confusion we will refer to the portfolio

constructed from the first principal component as the market factor portfolio. It is

well documented² that the returns on these two portfolios are highly correlated although in general the weights in these two portfolios will differ. It is now common to compute the principal components of the correlation matrix rather than the covariance matrix and we follow this convention.

¹ These papers include Trzcinka (1986), Connor and Korajczyk (1993), Geweke and Zhou (1996), Jones (2001), and Merville and Xu (2002). See also Avellaneda and Lee (2010), Laloux et al. (1999), Gopikrishnan et al. (2001), Allez and Bouchaud (2011), and Pelger (2015).

² See for example Avellaneda and Lee (2010). They show that the returns on the market factor portfolio are similar to those of the capitalization weighted portfolio.

The market factor portfolio has practical applications in investment and risk

management. It can be used to implement investment strategies. For example,

Avellaneda and Lee (2010) discuss its role and the role of the other factor portfolios

in statistical arbitrage. Alexander and Dimitriu (2004) find that investing in the

market factor portfolio outperforms both equally weighted and value weighted

strategies. Boyle (2014) finds that the market factor portfolio dominates the value

weighted portfolio in terms of Sharpe ratio and that its performance is comparable

with the equally weighted portfolio. Principal components have also been used to

measure systemic risk. Billio, Getmansky, Lo and Pelizzon (2010) and Kritzman,

Li, Page, and Rigobon (2011) describe how the principal components can be used

to monitor systemic risk.

Because the principal components are orthogonal there can be at most one of

them with strictly positive components. Typically this is the first principal com-

ponent corresponding to the dominant eigenvector. However in practice the first

principal component does not always have positive components. Avellaneda and

Lee (2010) observe that empirically based market factor portfolios can sometimes

have a few negative weights.

There are three reasons why market factor portfolios with all positive weights

are of interest. First, many classes of investor are restricted to long only portfolios.

Long only portfolios are simpler to construct in practice since taking short positions


involves institutional constraints. Second, if we want the market factor portfolio

to be used as a proxy³ for the CAPM market portfolio it should certainly have

positive weights. Finally, by knowing why the portfolio does not have positive

weights we may better understand the relevant characteristics of the correlation

matrix. This paper explores conditions under which the market factor portfolio

has positive weights. We establish the connection between the properties of the

correlation matrix and the existence of a market factor portfolio with positive

weights.

There is one situation when the relation is clear cut. If all the correlation

coefficients are positive then all the weights in the market factor portfolio will be

positive. This follows from the classic Perron-Frobenius⁴ result, which states that

if a matrix has all its elements positive, the dominant eigenvector will also have all

its components positive. Hence if the dominant eigenvector has any negative com-

ponents then some of the elements in the correlation matrix must be negative. If

the correlation matrix has negative elements we find, based on our empirical work,

that there are two possible outcomes. In some cases the dominant eigenvectors of

such matrices are strictly positive. In other cases, the dominant eigenvectors do

not have this property.

We can distinguish three different classes of correlation matrices.

• Class I. The elements of the correlation matrix are all positive. In this case all the components of the dominant eigenvector are positive.

• Class II. Some of the elements in the correlation matrix are negative and all the components of the dominant eigenvector are positive.

• Class III. Some of the elements in the correlation matrix are negative and the dominant eigenvector has at least one negative component.

³ Boyle (2014) proposes such a model where the market factor portfolio is used as a candidate for the CAPM market portfolio.

⁴ See Perron (1907) and Frobenius (1912).

We now illustrate these three cases with three correlation matrices. First

consider C1 which has all its correlations positive. Let λ1 denote the vector of

eigenvalues and let v1 be the eigenvector associated with the largest eigenvalue of

λ1. We have

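\[
C_1 = \begin{pmatrix} 1.00 & 0.50 & 0.45 \\ 0.50 & 1.00 & 0.40 \\ 0.45 & 0.40 & 1.00 \end{pmatrix}, \quad
\lambda_1 = \begin{pmatrix} 0.49 \\ 0.61 \\ 1.90 \end{pmatrix}, \quad
v_1 = \begin{pmatrix} 0.60 \\ 0.58 \\ 0.56 \end{pmatrix}.
\]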

Since all the eigenvalues are positive, C1 is positive definite and is thus a valid

correlation matrix. All the components of v1 are positive. This is consistent with

the Perron-Frobenius theorem. Hence C1 belongs to Class I.

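Next consider a correlation matrix, C2, with two negative elements.

\[
C_2 = \begin{pmatrix} 1.00 & 0.50 & 0.45 \\ 0.50 & 1.00 & -0.40 \\ 0.45 & -0.40 & 1.00 \end{pmatrix}, \quad
\lambda_2 = \begin{pmatrix} 0.10 \\ 1.39 \\ 1.51 \end{pmatrix}, \quad
v_2 = \begin{pmatrix} 0.78 \\ 0.59 \\ 0.23 \end{pmatrix}.
\]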

All the components of the dominant eigenvector of C2 are positive. Hence

C2 generates a market factor portfolio with all its weights positive. This matrix

belongs to Class II.

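The next correlation matrix, C3, looks similar to C2. It also has two negative elements.

\[
C_3 = \begin{pmatrix} 1.00 & 0.50 & -0.45 \\ 0.50 & 1.00 & 0.40 \\ -0.45 & 0.40 & 1.00 \end{pmatrix}, \quad
\lambda_3 = \begin{pmatrix} 0.10 \\ 1.39 \\ 1.51 \end{pmatrix}, \quad
v_3 = \begin{pmatrix} 0.78 \\ 0.59 \\ -0.23 \end{pmatrix}.
\]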

However the dominant eigenvector of C3 has a negative component. In our context

C3 would generate a market factor portfolio with a negative weight on the third

asset. Hence C3 belongs to Class III.
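The three examples can be checked numerically. The following is a minimal sketch, not the authors' code: it uses plain power iteration (an assumption made here for self-containment; the paper itself mentions MATLAB's eig function) and the hypothetical helper names dominant_eigenvector and classify.

```python
import math

def dominant_eigenvector(matrix, iterations=5000):
    """Approximate the dominant eigenvector by power iteration.

    Assumes a symmetric positive definite matrix, so the largest
    eigenvalue is also the largest in absolute value.
    """
    n = len(matrix)
    v = [1.0] * n  # starting vector (assumed not orthogonal to the answer)
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    if sum(v) < 0:  # eigenvectors are defined up to sign; make v'e positive
        v = [-x for x in v]
    return v

def classify(matrix):
    """Return "I", "II" or "III" following the three classes in the text."""
    n = len(matrix)
    has_negative = any(matrix[i][j] < 0
                       for i in range(n) for j in range(n) if i != j)
    positive_vector = all(x > 0 for x in dominant_eigenvector(matrix))
    if not positive_vector:
        return "III"
    return "II" if has_negative else "I"

C1 = [[1.00, 0.50, 0.45], [0.50, 1.00, 0.40], [0.45, 0.40, 1.00]]
C2 = [[1.00, 0.50, 0.45], [0.50, 1.00, -0.40], [0.45, -0.40, 1.00]]
C3 = [[1.00, 0.50, -0.45], [0.50, 1.00, 0.40], [-0.45, 0.40, 1.00]]

for name, C in (("C1", C1), ("C2", C2), ("C3", C3)):
    v = dominant_eigenvector(C)
    print(name, "Class", classify(C), [round(x, 2) for x in v])
```

The sign of the eigenvector is fixed so that its components sum to a positive number, which matches the later normalization of the portfolio weights to sum to one.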

In this paper we use empirical data to explore the connection between the

properties of the correlation matrix and the positivity of the dominant eigenvec-

tor. We compute return correlations using stocks from the S&P 1500 between

1990 and 2013. The period is divided into five subperiods. We analyze the fre-

quency, magnitude and patterns of negative correlations in this data over different

subperiods. We find that the frequency of negative correlations has declined over

time. For each period we construct batches of empirical correlation matrices and

analyze the signs of the dominant eigenvectors.

We find examples of each of the three Classes in the empirical data. The data

contains matrices where all the correlations are positive corresponding to Class

I. It also includes correlation matrices with some negative elements that belong


to Class II. In other words the matrix has some negative correlations and all

the components of the dominant eigenvector are positive. The data also contains

correlation matrices with some negative elements that belong to Class III. For these

matrices there is at least one negative component in the dominant eigenvector. We

find that in the early part of the period Classes II and III dominate but that Class I

matrices become more common toward the end of the period. Furthermore if

we just examine matrices with some negative correlations, we find that Class II

matrices become relatively more important than Class III matrices over the 24 year

period. We also document that negative correlations typically line up together in

rows reflecting the fact that certain stocks are negatively correlated with a group

of other stocks.

The main finding is that the frequency, size and pattern of the negative elements

determine whether or not the dominant eigenvector has any negative components.

Broadly speaking the more pervasive the negative correlations the more

likely it is that there will be short positions. We also examined the relationship

between the correlation matrix and the existence of a strictly positive dominant

eigenvector using analytical methods. These results help us understand the intu-

ition behind some of our empirical findings. For example, if a stock is negatively

correlated with all the other stocks we can prove that it will have a negative weight

in the market factor portfolio.

The rest of the paper is organized as follows. The next section contains our

empirical analyses of stock return correlations. Section III explores the connection

between the properties of the correlation matrix and the signs of the weights in the


dominant eigenvector. Section IV contains our analytical results while the final

section summarizes the paper.


II. Empirical Analysis: Correlations

In this section we describe our empirical analysis of stock return data. We examine the correlation matrix of stock returns over a recent 24-year period. We provide details of the distribution of negative correlations in these matrices in terms of their frequency and also of their patterns. We computed daily returns for the components of the S&P 1500 index from January 1, 1990 to December 31, 2013. This index comprises 500 US large cap stocks, 400 US mid cap stocks and 600 US small cap stocks.

We divided the time period into five non-overlapping subperiods. Table 1 gives

details of the subperiods and the number of stocks used in each subperiod.

Period   Duration of period          Number of stocks
1        01/01/1990 - 31/12/1993     890
2        01/01/1994 - 31/12/1998     1030
3        01/01/1999 - 31/12/2003     1188
4        01/01/2004 - 31/12/2008     1297
5        01/01/2009 - 31/12/2013     1450

Table 1: Summary of time periods and sample sizes for stock return data

We first examine how the frequency of negative correlations has varied over

time across the five periods. We computed the pairwise correlations of all the

stocks in our sample for each period. For the first subperiod, 1990-1993, we find that 10.75% of all the correlations are negative. This percentage is computed as follows. There are 890 stocks in this subperiod, which gives 890 × 890 = 792,100 correlations. Of these, 85,182 were negative, so the percentage of negative correlations for the 1990-1993 subperiod is 10.75. The results are given in Table 2.

Subperiod   Dates         Percent of negative correlations
                          in the sample for this subperiod
1           1990-1993     10.75
2           1994-1998     4.17
3           1999-2003     3.12
4           2004-2008     0.39
5           2009-2013     0.12

Table 2: Summary of frequency of negative correlations across the five time periods
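The percentage calculation described above can be sketched in a few lines. This is an illustration under stated assumptions: the function name negative_share is hypothetical, and the count is taken over all n × n entries (diagonal included), as the 890 × 890 denominator in the text implies.

```python
def negative_share(corr):
    """Percentage of negative entries among all n x n entries of corr."""
    n = len(corr)
    negatives = sum(1 for row in corr for rho in row if rho < 0)
    return 100.0 * negatives / (n * n)

# Arithmetic quoted in the text for the 1990-1993 subperiod.
share_1990_1993 = 100.0 * 85182 / (890 * 890)
print(round(share_1990_1993, 2))  # 10.75

# Toy 3 x 3 matrix with one negative pair (two negative entries out of nine).
toy = [[1.00, 0.50, -0.45], [0.50, 1.00, 0.40], [-0.45, 0.40, 1.00]]
print(negative_share(toy))
```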

It is clear that the frequency of negative correlations declined significantly over the sample period. During 1990-1993 over ten percent of the correlations were negative, whereas during the last subperiod, 2009-2013, the percentage of negative correlations had shrunk to 0.12.

We next examine the location of these negative correlations to discern if there is

any structure. For our purposes it is very convenient to use correlation heat maps

to summarize the pattern of a large number of correlation coefficients. In these

heat maps we use different colours to represent the magnitude of the correlations.

Light green represents a correlation of 0.8. As the magnitude of the correlation

decreases the colour becomes progressively darker. If the correlation is negative

it is represented by red. Figures 1 through 5 contain the heat maps for the five

subperiods.

There are two striking features of these heat maps. First the negative correlations

tend to appear in rows (and of course in columns because of symmetry). This

is most evident in Figure 1 where we observe pronounced horizontal and vertical


red lines but the same pattern is observed in all the other heat maps. The second

feature is that the heat maps become progressively lighter over time which means

that correlations have steadily increased over time during this period.

Figure 1: Correlation heatmap of the stock universe for 1990-1993. (Panel title: Daily Return Correlation sp1500_90_93; colour scale from 0.0 to 0.8.)

Figures 2 through 5: Correlation heatmaps of the stock universe for the remaining four subperiods. (Colour scale from 0.0 to 0.8.)

The tendency for negative correlations to appear in rows reflects the fact that

certain stocks (for example gold stocks) tend to have negative correlations with a

lot of other stocks. We shall show in the next Section that if a correlation matrix

contains several negative correlations in any of its rows, this generally increases the

likelihood that the matrix will belong to Class III. In other words rows of negative

correlations typically lead to short positions in the market factor portfolio.


III. Characteristics of the correlation matrix causing short positions

In this section we use our empirical correlation data to compute market factor port-

folios. We investigate the extent to which the market factor portfolios have strictly

positive weights and analyze the circumstances which lead to short positions in the

market factor portfolios. We show that empirical correlation matrices with some

negative elements can sometimes produce market factor portfolios with all long po-

sitions. However empirical correlation matrices with some negative elements can

also produce market factor portfolios with some short positions. We investigate

which features of the correlation matrix lead to this difference. We show that the

extent to which negative correlations occur in rows has an important impact on

the signs of the dominant eigenvector.

In our analysis the market factor portfolio weights are obtained from the dominant eigenvector of the correlation matrix. If C is the correlation matrix and λ is the principal eigenvalue with corresponding eigenvector v, we have

Cv = λv

The weights on the market factor portfolio, w, are obtained by normalizing the vector v so that its components sum to one

\[
w = \frac{v}{v'e}.
\]

Here e denotes the vector of ones.
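As a small worked illustration of this normalization, take the dominant eigenvector reported for C3 in the Introduction. The figures below use the two-decimal rounded components given there, so the resulting weights are approximate.

```latex
\[
v_3 = \begin{pmatrix} 0.78 \\ 0.59 \\ -0.23 \end{pmatrix}, \qquad
v_3'e = 0.78 + 0.59 - 0.23 = 1.14, \qquad
w = \frac{v_3}{v_3'e} \approx \begin{pmatrix} 0.684 \\ 0.518 \\ -0.202 \end{pmatrix}.
\]
```

The weights sum to one, and the third asset carries a short position of roughly 20 percent.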

To explore the connection between the properties of the input correlation ma-

trix and the sign of the dominant eigenvector we used the following approach.

For each subperiod, we generated 10,000 random samples of 50 stocks each and

obtained the sample correlation matrix based on these 50 stocks. There are three

reasons for using this approach. First it ensures that all our correlation matrices

are positive⁵ definite. Second this approach enables us to have a much richer spectrum

of input correlation matrices to analyze. Third, the correlations are realistic

in that they come from empirical data.
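The sampling step might be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: it takes the sample correlation matrix of the selected stocks to be the corresponding block of the universe correlation matrix (which holds when every correlation is computed from the same return observations), and the function name, the toy six-stock universe and the seed are all hypothetical.

```python
import random

def sample_correlation_block(corr, k, rng):
    """Pick k distinct stocks at random and return their k x k correlation block."""
    picked = sorted(rng.sample(range(len(corr)), k))
    block = [[corr[i][j] for j in picked] for i in picked]
    return picked, block

# Hypothetical toy universe: stock 0 is negatively correlated with all others.
n_universe = 6
universe = [[1.0 if i == j else (-0.1 if 0 in (i, j) else 0.3)
             for j in range(n_universe)] for i in range(n_universe)]

rng = random.Random(0)  # fixed seed so the draws are reproducible
samples = [sample_correlation_block(universe, 3, rng) for _ in range(5)]
for picked, block in samples:
    print(picked, block)
```

In the paper's setting k would be 50, the universe would hold the 890 to 1,450 stocks of a subperiod, and 10,000 draws would be made per subperiod.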

For each of the 10,000 sample matrices in each subperiod we determine⁶ whether

it belongs to Class I, Class II or Class III. Recall that Class I matrices contain

only positive correlations and so they automatically have a strictly positive domi-

nant eigenvector. Class II matrices contain some negative elements but also have

a strictly positive dominant eigenvector. Class III matrices contain some negative

elements and do not have a strictly positive dominant eigenvector. The breakdown

of the matrices among these three Classes for our different subperiods is shown in

Figure 6.

Note that the number of Class I matrices is negligible for the first three subperiods. However they dominate in the last two subperiods. This is what we would expect from the distribution of negative elements in Table 2. During the first three periods the majority of the 10,000 matrices have some negative elements. During the first subperiod these matrices are split almost evenly between Class II and Class III but as time passes Class II matrices become relatively more frequent than Class III matrices. Class I and II matrices give rise to market factor portfolios with positive weights whereas Class III matrices do not. We are interested in the relevant properties of a correlation matrix with negative elements that determine whether it is in Class II or Class III.

⁵ This would not be the case if we used the entire population.

⁶ This is an easy operation in MATLAB using the eig function.

Figure 6: Breakdown of matrices by Class


We compare the distribution of negative correlations in Class II and Class III

matrices. We compute the average number of negative correlations in these two

Classes for our five subperiods. The results are given in Table 3 which shows

that Class III matrices have consistently more negative correlations than Class II

matrices. In the later periods the percentage of negative elements in both Class II

and Class III matrices declines but the relative decline is much greater for Class II

matrices. By comparing Table 3 with Table 2, we see that the average percentage

of negative elements in the Class II matrices is very similar to the frequency of

negative elements in the underlying population. The Class II matrices in Table 3

are average in terms of their share of negative elements while Class III matrices

are exceptional in that they contain more than their share of negative elements.

We also compare the average magnitudes of the negative correlations for

these two Classes. The results are given in Table 4. From this Table we see that

the absolute value of the average of the negative correlations is greater for Class

III matrices than for Class II matrices.

Subperiod   Average percentage of       Average percentage of
            negative correlations       negative correlations
            in Class II matrices        in Class III matrices
1           9.32                        11.75
2           3.73                        5.86
3           2.57                        5.21
4           0.42                        3.13
5           0.22                        2.95

Table 3: Average percentages of negative correlations in Class II and Class III matrices for different subperiods.

Subperiod   Average value of            Average value of
            negative correlations       negative correlations
            in Class II matrices        in Class III matrices
1           -0.019                      -0.021
2           -0.016                      -0.018
3           -0.018                      -0.027
4           -0.018                      -0.044
5           -0.013                      -0.031

Table 4: Average values of negative correlations in Class II and Class III matrices for different subperiods.

In summary Class III correlation matrices tend to have more negative elements than Class II matrices and also the average absolute size of the negative elements is larger for Class III matrices. The distributions of the negative elements for the Class II and Class III matrices can be compared from the plots provided in Appendix A.

There is another aspect of the negative correlations that is different between

Class II and Class III matrices. This has to do with the location of the negative

correlations in the empirical data. We have seen from the heat maps that negative

correlations are not randomly distributed but that they tend to occur in rows. In

the case of some stocks a high proportion of their correlations with other stocks

are negative. There is a useful theoretical result⁷ which holds in the extreme case

when all the correlations in a given row are negative. If a correlation matrix has

a complete row of negative elements then it must belong to Class III. This last

result predicts that correlation matrices which have a row with a relatively large

number of negative elements will have a greater chance of belonging to Class III.

We now describe our analysis which supports this claim.

⁷ This result is easy to prove and we give the proof in Section IV.

For each subperiod we assign a rank to each stock based on the number of

negative correlations it has with other stocks. For example, during the first sub-

period there are 890 stocks. We counted the number of negative correlations for

each stock and then ranked them from the highest to the lowest. It will be useful

to distinguish stocks with high numbers of negative correlations from other stocks.

We say that a stock has a Majority of Negative Correlations if it is negatively

correlated with more than 50% of the stocks in the sample. For convenience these

stocks are labelled as MNC stocks.
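The ranking and the MNC flag can be sketched as below. The function names are hypothetical, and the 50% threshold is taken relative to the number of stocks in the sample, which is one reading of the definition above. The toy matrix reuses the values of matrix A from Section IV, with placeholder stock names.

```python
def negative_counts(corr):
    """Number of negative correlations each stock has with the other stocks."""
    n = len(corr)
    return [sum(1 for j in range(n) if j != i and corr[i][j] < 0)
            for i in range(n)]

def rank_and_flag(corr, names):
    """Rank stocks by their number of negative correlations and flag MNC stocks."""
    n = len(corr)
    counts = negative_counts(corr)
    order = sorted(range(n), key=lambda i: counts[i], reverse=True)
    table = []
    for rank, i in enumerate(order, start=1):
        is_mnc = counts[i] > 0.5 * n  # assumption: threshold is 50% of n
        table.append((names[i], counts[i], rank, is_mnc))
    return table

toy = [[1.00, -0.04, -0.23, -0.41],
       [-0.04, 1.00, 0.80, 0.38],
       [-0.23, 0.80, 1.00, 0.23],
       [-0.41, 0.38, 0.23, 1.00]]
ranking = rank_and_flag(toy, ["S1", "S2", "S3", "S4"])
for row in ranking:
    print(row)
```

In this toy case only the first stock, which is negatively correlated with all three others, is flagged as MNC.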

In presenting our results it is convenient to consider the first three subperiods

and the last two subperiods separately. Figure 6 shows that the fraction of

matrices with negative elements is very high during the first three subperiods and

drops markedly during the final two subperiods. Over 99.99% of the 10,000 matrices

in the first three subperiods have some negative correlations while during the last

two subperiods the fraction of the 10,000 matrices with negative elements declined

sharply to 40% in the fourth subperiod and 10% in the final subperiod. We will

explain later why this difference has implications for the relative composition of

Class II and Class III matrices.

We will present the results for the first subperiod in detail and summarize the results for the second and third subperiods in Appendix B since they are quite similar. In the same way we discuss only the detailed results for the final subperiod and provide the summary for the fourth subperiod in Appendix B.

Table 5 gives the top ten ranked stocks for the 1990-1993 subperiod. We

also indicate if a stock has the MNC property. This table shows that the top

ten ranked stocks have a substantial number of negative correlations. During the 1990-93 period there are ten stocks with the MNC property. We now examine the distribution of the most common stocks in the Class II and Class III matrices. We will show that the MNC stocks are very significantly overrepresented among the Class III matrices.

Subperiod 1: 1990-1993

Company name                                 Number of negative correlations   Rank
ROYAL GOLD INC (MNC)                         613                               1
NEWMONT MINING CORP (MNC)                    577                               2
FIGGIE INTERNATIONAL INC DEL (MNC)           500                               3
STAR CLASSICS INC (MNC)                      497                               4
CEDAR INCOME FUND LTD (MNC)                  492                               5
SANMARK STARDUST INC (MNC)                   482                               6
GOLDEN CORRAL RLTY CORP (MNC)                457                               7
INTERNATIONAL POWER MACHS CORP (MNC)         451                               8
ALTA ENERGY CORP (MNC)                       449                               9
BANCORP OF MISSISSIPPI INC (MNC)             447                               10

Table 5: Top ten ranked stocks from the stock universe for the 1990-1993 period

Table 6 shows the most commonly occurring stocks among Class II and Class

III matrices for 1990-1993. We discuss the Class III matrices first. There are 5216

Class III matrices. The most commonly occurring stocks among the 5216 Class III

matrices are the MNC stocks which are the stocks with the most negative corre-

lations. Recall that over 50% of their correlations with other stocks are negative.

Nine of the most common stocks in the Class III category are MNC stocks. The

tenth stock, Telecom Corp, is not an MNC stock. However it has 49.76% of its correlations negative so it just misses the MNC cutoff point. There is a very significant direct relation between the most common stocks in the Class III category and the MNC property.

For this subperiod there are 4,784 Class II matrices out of the total 10,000

matrices. The stock which appeared most often in these Class II matrices was

Lance Inc and it appeared 317 times. The next most common stock was Cullen

Frost Bankers Inc which appeared 316 times. Lance Inc has a rank of 412 in

terms of the number of stocks it is negatively correlated with. It has a negative

correlation with 62 out of the 890 stocks. Similarly Cullen Frost Bankers the second

most common stock in the Class II matrices has a rank of 337. It has a negative

correlation with 73 out of the 890 stocks. The point we want to emphasize is that

the most commonly occurring stocks in the Class II matrices are very average in

terms of their ranking. The average of the rankings in column three of the first

panel is 477. This stems from the fact that negative correlations are relatively

common in this subperiod.

Class II; 1990-1993 (total count: 4784)

Company Name                     Number of appearances   Rank in the universe
LANCE INC                        317                     412
CULLEN FROST BANKERS INC         316                     337
BEARINGS INC                     316                     344
AVNET INC                        314                     860
CINTAS CORP                      313                     712
CINCINNATI BELL INC              313                     624
PIER 1 IMPORTS INC DE            313                     456
TIMKEN COMPANY                   312                     659
BLESSINGS CORP                   311                     338
ATLAS CONSOLIDATED               310                     31

Class III; 1990-1993 (total count: 5216)

Company Name                     Number of appearances   Rank in the universe
FIGGIE INTERNATIONAL INC         574                     3 (MNC)
SANMARK STARDUST INC             574                     6 (MNC)
ROYAL GOLD INC                   574                     1 (MNC)
NEWMONT MINING CORP              545                     2 (MNC)
BANCORP OF MISSISSIPPI INC       543                     10 (MNC)
STAR CLASSICS INC                535                     4 (MNC)
CEDAR INCOME FUND LTD            490                     5 (MNC)
INTERNATIONAL POWER MACHS        487                     8 (MNC)
GOLDEN CORRAL RLTY CORP          474                     7 (MNC)
TELECOM CORP                     443                     12

Table 6: Top ten most frequently occurring stocks among Class II and Class III matrices for the 1990-1993 period

Subperiod 5: 2009-2013

Company name                                 Number of negative correlations   Rank
FLANIGANS ENTERPRISES INC (MNC)              1083                              1
SUNLINK HEALTH SYSTEMS INC                   137                               2
FORT DEARBORN INCOME SECS INC                17                                3
CANTERBURY PARK HOLDING CORP                 11                                4
HOMEOWNERS CHOICE INC                        9                                 5
SUPREME INDUSTRIES INC                       7                                 6
CORE MOLDING TECHNOLOGIES INC                4                                 7
MAGELLAN PETROLEUM CORP                      4                                 8
PEOPLES UNITED FINANCIAL INC                 3                                 9
INVACARE CORP                                3                                 10

Table 7: Top ten ranked stocks from the stock universe for the 2009-2013 period

We now conduct a similar analysis for the last period. Table 7 shows the top ranked stocks in the period 2009-2013 in terms of their negative correlations. There is just one MNC stock: Flanigan’s Enterprises Inc⁸ which has by far the most negative correlations with other stocks. It has a negative correlation with 75% of the other 1450 stocks during this time period. We confirmed that it has a negative beta.

We next examine the composition of the 360 Class III and the 755 Class II matrices for this period (2009-2013). Consider first the Class III matrices. We see that Flanigan’s Enterprises belongs to every one of the Class III matrices. This is the defining characteristic of these Class III matrices. Note that the other stocks occur much less frequently, since it is the presence of Flanigan’s Enterprises, with its row of negative correlations, that is responsible for the Class III result. The ranking of the other stocks does not matter if the matrix contains Flanigan’s Enterprises.

⁸ Flanigan’s Enterprises operates a chain of restaurants and liquor stores in South Florida.

The composition of the Class II matrices for 2009-2013 differs sharply from what we found for the 1990-1993 period. For the earlier period we noted that the most common stocks in the Class II matrices were average stocks. However the top panel of the table for 2009-2013 shows that now the most common stocks are nearly all among the highest ranked stocks, with the exception of Flanigan’s Enterprises. We can explain this difference as follows. In the later period there is a scarcity of negative correlations and so if a matrix contains a higher ranked stock it is more likely to have negative elements, which places it in Class II or III. However for this period a high ranked stock on its own is not enough to tip the matrix into Class III unless that stock is Flanigan’s Enterprises. That is why we now see more higher ranked stocks in Class II matrices.

Earlier we showed that Class III matrices tended to have a higher number

of negative correlations. The last result shows that the pattern of the negative

elements also matters. We just showed that individual stocks with a large number

of negative correlations with other stocks tended to occur more frequently in Class

III matrices than other stocks. Hence it is not only the number and magnitude

of negative correlations but also their pattern that influences the positivity of the

dominant eigenvector. The pattern of the negative correlations is a critical factor

in our context and we discuss it further in the next section.

Class II; 2009-2013 (counts: 755)

Name                               Number of appearance   Rank in the universe
SUNLINK HEALTH SYSTEMS INC         339                    2
FORT DEARBORN INCOME SECS INC      158                    3
CANTERBURY PARK HOLDING CORP       119                    4
SUPREME INDUSTRIES INC             94                     6
HOMEOWNERS CHOICE INC              91                     5
CORE MOLDING TECHNOLOGIES INC      52                     7
COLUMBIA LABORATORIES INC          50                     186
DIAMOND FOODS INC                  45                     13
PEOPLES UNITED FINANCIAL INC       44                     9
CAREER EDUCATION CORP              42                     12

Class III; 2009-2013 (counts: 360)

Name                               Number of appearance   Rank in the universe
FLANIGANS ENTERPRISES INC          360                    1 (MNC)
BLOCK H & R INC                    25                     406
MEDNAX INC                         25                     91
HILL ROM HOLDINGS INC              24                     50
D T E ENERGY CO                    22                     1146
SUPERVALU INC                      22                     377
FORT DEARBORN INCOME SECS INC      22                     3
M S C INDUSTRIAL DIRECT INC        22                     792
WORLD FUEL SERVICES CORP           21                     1270
TYSON FOODS INC                    21                     652

IV. Analytical Results

There are two parts in this section. First we review some general results that are

germane to our analysis. Second we derive some specific analytical results that

are better tailored to our applications. The general results provide conditions for

a symmetric matrix to have a positive dominant eigenvector. These conditions

are formulated in terms of expressions that involve functions of the matrix. Our

specific results are valid for correlation matrices and are expressed in terms of the

individual correlations.

We start by recalling the basic Perron-Frobenius theorem which states that if

a square matrix has all positive entries, its dominant eigenvector will also have

all positive entries. Tarazaga et al. (2001) obtained an alternative condition for

a symmetric matrix to have a positive dominant eigenvector. Their condition is

satisfied for some matrices that have negative elements. Assume C is an n × n

symmetric matrix. They show that if

\[
e'Ce \geq \sqrt{(n-1)^2 + 1}\,\sqrt{\mathrm{Trace}(C'C)} \tag{1}
\]

then C has a nonnegative dominant eigenvector.
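Condition (1) is straightforward to evaluate. The sketch below uses a hypothetical function name and applies the test to the matrices C1 and C2 from the Introduction; it is an illustration only, not the authors' code.

```python
import math

def satisfies_condition_1(C):
    """Check e'Ce >= sqrt((n-1)^2 + 1) * sqrt(Trace(C'C)) for a square matrix C."""
    n = len(C)
    total = sum(sum(row) for row in C)                           # e'Ce
    frobenius = math.sqrt(sum(x * x for row in C for x in row))  # sqrt(Trace(C'C))
    return total >= math.sqrt((n - 1) ** 2 + 1) * frobenius

C1 = [[1.00, 0.50, 0.45], [0.50, 1.00, 0.40], [0.45, 0.40, 1.00]]
C2 = [[1.00, 0.50, 0.45], [0.50, 1.00, -0.40], [0.45, -0.40, 1.00]]

print(satisfies_condition_1(C1))  # True
print(satisfies_condition_1(C2))  # False
```

For C1 the left hand side is 5.7 against roughly 4.60 on the right, so the condition holds. For C2 the left hand side falls to 4.1 while the right hand side is unchanged, so the condition fails even though the dominant eigenvector of C2 is strictly positive.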

This is just a sufficient condition. There are matrices which do not satisfy this

condition and yet have a strictly positive dominant eigenvector. We checked this

condition for all the 4,784 Class II matrices in our first subperiod. The results

were striking. Not a single one of these matrices satisfied condition (1), even though every one of them has a strictly positive dominant eigenvector. We are forced to conclude that this test has no diagnostic value in finance applications

like ours.

Noutsos (2006) provides a more complete characterization. He showed that a

given symmetric matrix C has a strictly positive dominant eigenvector if and only

if there exists a positive integer k_0 such that C^k has all its entries positive for all

k ≥ k_0. To see this, recall that C and C^k have the same eigenvectors and apply

the Perron-Frobenius result to C^k.
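A rough numerical version of this test is sketched below. It is a simplification made for illustration: the function returns the first power k at which every entry of C^k is positive, searching up to an arbitrary cap, whereas the theorem asks for positivity at all powers from some k_0 onward. Function names and the cap are hypothetical, and the matrices are C1, C2 and C3 from the Introduction.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def first_positive_power(C, max_power=200):
    """Smallest k <= max_power with all entries of C^k positive, else None."""
    P = [row[:] for row in C]  # P holds a positive multiple of C^k
    for k in range(1, max_power + 1):
        if all(x > 0 for row in P for x in row):
            return k
        P = matmul(P, C)
        scale = max(abs(x) for row in P for x in row)
        P = [[x / scale for x in row] for row in P]  # rescaling keeps signs
    return None

C1 = [[1.00, 0.50, 0.45], [0.50, 1.00, 0.40], [0.45, 0.40, 1.00]]
C2 = [[1.00, 0.50, 0.45], [0.50, 1.00, -0.40], [0.45, -0.40, 1.00]]
C3 = [[1.00, 0.50, -0.45], [0.50, 1.00, 0.40], [-0.45, 0.40, 1.00]]

for name, C in (("C1", C1), ("C2", C2), ("C3", C3)):
    print(name, first_positive_power(C))
```

C1 is already positive at k = 1. C2 needs a higher power before its negative entries disappear, and no power of C3 within the cap is entrywise positive, in line with its dominant eigenvector having a negative component.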

In stark contrast to the previous case, our empirical results are perfectly consistent with this test. Each of the 4,784 Class II matrices from the first subperiod contains some negative elements and each matrix has a strictly positive dominant eigenvector. In each case there is an integer k such that $C^k$ contains only positive elements. The values of $k_0$ ranged from 2 to 12 with a mode of 4. The Noutsos test discriminates perfectly between Class II and Class III matrices. However, since the test is framed in terms of the matrix powers, it is not easy to relate this test to the frequency and severity of negative correlations in the underlying matrix.

Our next two propositions take some modest steps in this direction. The first

deals with the case when there is a complete row of negative correlations and the

second deals with the special case when all the correlations are equal.

Proposition One. If a correlation matrix has a complete row of negative

correlations then its dominant eigenvector must have at least one negative

component.

We prove this result by contradiction. Assume C has a complete row, say row i, of negative elements and its dominant eigenvector v is strictly positive. We have

$$Cv = \lambda v$$

where λ is the largest eigenvalue. For row i this implies

$$v_i + \sum_{j \neq i} \rho_{ij} v_j = \lambda v_i$$

This last equation can be written as

$$(\lambda - 1)\, v_i = \sum_{j \neq i} \rho_{ij} v_j$$

Since λ > 1 (the eigenvalues of a correlation matrix sum to n, so the largest exceeds 1 unless C is the identity) and $v_i$ is positive, the left hand side is positive. However, since all the components of v are assumed positive and all the $\rho_{ij}$ are negative, the right hand side is negative. This contradiction proves the result.

We illustrate this Proposition with a numerical example. Consider the two

correlation matrices A and B where

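$$A = \begin{pmatrix} 1 & -0.04 & -0.23 & -0.41 \\ -0.04 & 1 & 0.80 & 0.38 \\ -0.23 & 0.80 & 1 & 0.23 \\ -0.41 & 0.38 & 0.23 & 1 \end{pmatrix},$$

$$B = \begin{pmatrix} 1 & 0.38 & -0.23 & -0.41 \\ 0.38 & 1 & 0.80 & -0.04 \\ -0.23 & 0.80 & 1 & 0.23 \\ -0.41 & -0.04 & 0.23 & 1 \end{pmatrix}.$$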

They both have the same correlation elements and differ only in the position

of these elements. The dominant eigenvector of A,

[−0.31, 0.60, 0.60, 0.45]′

contains one negative component.

On the other hand, the dominant eigenvector of B,

[0.06, 0.70, 0.70, 0.13]′,

is strictly positive.
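The eigenvectors quoted for A and B can be reproduced with a short NumPy sketch such as the following. The helper names are illustrative assumptions. The sign of an eigenvector is arbitrary, so it is normalized here so that the components sum to a nonnegative number.

```python
import numpy as np

A = np.array([[1.00, -0.04, -0.23, -0.41],
              [-0.04, 1.00, 0.80, 0.38],
              [-0.23, 0.80, 1.00, 0.23],
              [-0.41, 0.38, 0.23, 1.00]])

B = np.array([[1.00, 0.38, -0.23, -0.41],
              [0.38, 1.00, 0.80, -0.04],
              [-0.23, 0.80, 1.00, 0.23],
              [-0.41, -0.04, 0.23, 1.00]])

def dominant_eigenvector(C):
    """Eigenvector of the largest eigenvalue, signed so that its components sum to >= 0."""
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    v = eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order
    return v if v.sum() >= 0 else -v

def has_complete_negative_row(C):
    """True if some row has all of its off-diagonal entries negative."""
    on_diagonal = np.eye(C.shape[0], dtype=bool)
    return bool(np.any(np.all((C < 0) | on_diagonal, axis=1)))

for name, C in (("A", A), ("B", B)):
    v = dominant_eigenvector(C)
    print(name, has_complete_negative_row(C), np.round(v, 2))
```

Only the first row of A satisfies the complete row property, and the first component of its dominant eigenvector is the negative one.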

Hence matrix A belongs to Class III while matrix B with exactly the same

correlation elements belongs to Class II. The key difference is that matrix A has a

complete row of negative correlations. Matrix B does not have the complete row

property. We now turn to Proposition Two.

Proposition Two. If all the correlations are equal to ρ, say, then the matrix has a strictly positive dominant eigenvector if and only if ρ > 0.

This result follows directly from Proposition One and the Perron-Frobenius Theorem. Equi-correlation matrices play an important role in the shrinkage estimate of Ledoit and Wolf (2004). These authors compute the shrinkage matrix $C_{\text{shrink}}$ as

$$C_{\text{shrink}} = \delta\, C_{\text{historical}} + (1 - \delta)\, C_{\text{equicorr}}$$

where

• $C_{\text{historical}}$ is the correlation matrix estimated from the data

• $C_{\text{equicorr}}$ is the corresponding equi-correlation matrix, whose common correlation ρ is the average of the correlations in $C_{\text{historical}}$. For stock returns ρ > 0.

• δ is the weighting parameter, which can be optimally estimated.

Boyle (2014) shows that $C_{\text{shrink}}$ typically has a strictly positive dominant eigenvector even when $C_{\text{historical}}$ does not. Hence using a shrinkage estimate for the correlation matrix provides a practical method of obtaining a strictly positive dominant eigenvector and hence a positively weighted market factor portfolio.
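The effect of shrinkage can be illustrated with a small sketch applied to matrix A from the numerical example above. The function name `shrink_to_equicorrelation` is hypothetical, and δ = 0.2 is an arbitrary illustrative weight, not the optimally estimated value. The code follows the convention used here, with δ multiplying the historical matrix.

```python
import numpy as np

def dominant_eigenvector(C):
    """Eigenvector of the largest eigenvalue, signed so that its components sum to >= 0."""
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    v = eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order
    return v if v.sum() >= 0 else -v

def shrink_to_equicorrelation(C_hist, delta):
    """delta * C_hist + (1 - delta) * C_equicorr, with rho the mean off-diagonal correlation."""
    C_hist = np.asarray(C_hist, dtype=float)
    n = C_hist.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    rho = C_hist[off_diagonal].mean()
    C_equicorr = (1 - rho) * np.eye(n) + rho * np.ones((n, n))
    return delta * C_hist + (1 - delta) * C_equicorr

# Matrix A from the numerical example: its first row is entirely negative.
A = np.array([[1.00, -0.04, -0.23, -0.41],
              [-0.04, 1.00, 0.80, 0.38],
              [-0.23, 0.80, 1.00, 0.23],
              [-0.41, 0.38, 0.23, 1.00]])

for delta in (1.0, 0.2):  # delta = 1.0 leaves A unchanged; 0.2 is an arbitrary choice
    C_shrink = shrink_to_equicorrelation(A, delta)
    v = dominant_eigenvector(C_shrink)
    print(delta, bool(np.all(C_shrink > 0)), bool(np.all(v > 0)))
```

The average correlation in A is about 0.12, so with δ = 0.2 even the most negative entry becomes 0.2 × (−0.41) + 0.8 × 0.12 > 0. Every entry of the shrunk matrix is then positive and the Perron-Frobenius theorem guarantees a strictly positive dominant eigenvector.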

V. Summary

This paper has examined the conditions under which the first principal component (or market factor portfolio) of stock returns has positive weights. The first principal component corresponds to the dominant eigenvector of the correlation matrix. We used empirical returns over the period 1990-2013 to study this question. If all the input correlations are positive, then the positivity of the dominant eigenvector is guaranteed by the Perron-Frobenius theorem. However, correlation matrices of stock returns generally contain some negative correlations. If the negative correlations are relatively insignificant in terms of their number and magnitude, it is likely that the matrix will have a positive dominant eigenvector. Roughly speaking, the dominant eigenvector can still have all its components positive as long as the negative elements are not too pervasive.

Our results show that short positions in the market factor portfolio become more likely as the frequency and the absolute magnitude of negative correlations increase. The location of the negative correlations is important as well. We saw that negative correlations tend to occur in rows. Certain stocks, such as mining stocks, tend to be negatively correlated with several other stocks. Proposition One shows that in the limit, when a stock has all its correlations negative, the dominant eigenvector will have a negative component. Our empirical findings are consistent with this result. In summary, whether or not the first principal component of stock returns has any short positions depends on the number, magnitude and position of the negative elements in the correlation matrix.

This paper has presented results based on daily returns. In results not reported here, we investigated the effects of changing the return sampling frequency from daily to weekly. We find that there is a decrease in the number of negative correlations in the data. The heat maps for correlations based on weekly returns show that negative correlations still tend to cluster in rows and that the main conclusions of the paper still hold.

References

[1] Alexander C., and Dimitriu A., “Sources of Outperformance in Equity Markets”, The Journal of Portfolio Management, 30, 4 (2004), 170-185.

[2] Allez R., and Bouchaud J.-P., “Individual and Collective Stock Dynamics: Intra-day Seasonalities”, New J. Phys., 13 (2011), 025010.

[3] Avellaneda M., and Lee J.H., “Statistical Arbitrage in the US Equities Market”, Quantitative Finance, 10 (2010), 1-22.

[4] Billio M., Getmansky M., Lo A.W., and Pelizzon L., “Econometric Measures of Systemic Risk in the Finance and Insurance Sectors”, National Bureau of Economic Research, (2010), Working Paper 16223.

[5] Boyle P., “Positive Weights on the Efficient Frontier”, North American Actuarial Journal, 18, 4 (2014), 462-477.

[6] Connor G., and Korajczyk R.A., “A Test for the Number of Factors in an Approximate Factor Model”, Journal of Finance, 48, 4 (1993), 1263-1291.

[7] Frobenius G.F., “Über Matrizen aus nicht negativen Elementen”, S.-B. Preuss Acad. Wiss. (Berlin), (1912), 456-477.

[8] Geweke J., and Zhou G., “Measuring the Pricing Error of the Arbitrage Pricing Theory”, Review of Financial Studies, 9 (1996), 557-587.

[9] Gopikrishnan P., Rosenow B., Plerou V., and Stanley H.E., “Quantifying and Interpreting Collective Behavior in Financial Markets”, Physical Review E, 64 (2001), 035106.

[10] Jones C.S., “Extracting Factors from Heteroskedastic Asset Returns”, Journal of Financial Economics, 62 (2001), 293-325.

[11] Kritzman M., Li Y., Page S., and Rigobon R., “Principal Components as a Measure of Systemic Risk”, The Journal of Portfolio Management, 37, 4 (2011), 112-126.

[12] Laloux L., Cizeau P., and Bouchaud J.-P., “Noise Dressing of Financial Correlation Matrices”, Physical Review Letters, 83, 7 (1999), 1467-1469.

[13] Merville L.J., and Xu Y., “The Changing Factor Structure of Equity Returns”, The Journal of Portfolio Management, Summer (2001), 51-61.

[14] Noutsos D., “On Perron-Frobenius Properties of Matrices Having Some Negative Entries”, Linear Algebra and its Applications, 412 (2006), 132-153.

[15] Pelger M., “Large-Dimensional Factor Modeling Based on High-Frequency Observations”, (2015), Working Paper, University of California, Berkeley.

[16] Perron O., “Zur Theorie der Matrizen”, Mathematische Annalen, 64 (1907), 248-263.

[17] Ross S.A., “The Arbitrage Theory of Capital Asset Pricing”, Journal of Economic Theory, 13, 3 (1976), 341-360.

[18] Sharpe W.F., “A Simplified Model for Portfolio Analysis”, Management Science, 9, 2 (1963), 277-293.

[19] Tarazaga P., Raydan M., and Hurman A., “Perron-Frobenius Theorem for Matrices With Some Negative Entries”, Linear Algebra and its Applications, 328 (2001), 57-68.

[20] Trzcinka C., “On the Number of Factors in the Arbitrage Pricing Model”, Journal of Finance, 41, 2 (1986), 347-368.

A.

This Appendix compares the empirical distribution of the negative elements of

Class II and Class III matrices for the five subperiods.

Figure 7: Empirical distribution of negative correlations among Class II and Class III matrices for the 1990-1993 period. [Histogram titled "Empirical distribution of negative correlations (1990-1993)"; horizontal axis: size, from -0.16 to 0; vertical axis: frequency, from 0 to 0.2; series: Class III and Class II.]

[Four further histograms with the same axes (size, frequency) and the same two series (Class III, Class II), presumably one for each of the remaining subperiods.]


B.

This Appendix extends the analysis of the pattern of the negative elements in the

correlation matrices in Class II and Class III matrices to the other subperiods.

In Section III we analyzed the first (1990-1993) and last (2009-2013) subperiods.

The results for 1994-1998 and 1999-2003 are very similar to those we obtained in

Section III for 1990-1993. This can be confirmed by comparing Table 5 in Section

III to Table 12 and Table 13 below. The results for 2004-2008 are very similar to

those we obtained in Section III for 2009-2013. This can be verified by comparing

Table 8 in Section III to Table 14 below.

Subperiod 2: 1994-1998

Company name                                 Rank
COLONIAL COMMERCIAL CORP (MNC)        609       1
HELMSTAR GROUP INC (MNC)              541       2
LUXTEC CORP (MNC)                     527       3
CEDAR INCOME FUND LTD                 514       4
BIO REFERENCE LABORATORIES INC        513       5
POWELL INDUSTRIES INC                 467       6
HEALTHCARE IMAGING SERVICES INC       443       7
SAHARA GAMING CORP                    398       8
HEIST C H CORP                        393       9
CARETENDERS HEALTH CORP               379      10


Subperiod 3: 1999-2003

Company name                                 Rank
ROYAL GOLD INC (MNC)                 1019       1
NEWMONT MINING CORP (MNC)             914       2
MORGANS FOODS INC (MNC)               645       3
DARLING INTERNATIONAL INC (MNC)       610       4
MILESTONE SCIENTIFIC INC (MNC)        599       5
FORT DEARBORN INCOME SECS INC         543       6
SENECA FOODS CORP NEW                 506       7
DELPHI INFORMATION SYSTEMS INC        489       8
INTEVAC INC                           481       9
AMERICAN SCIENCE & ENGR INC           465      10


Subperiod 4: 2004-2008

Company name                                 Rank
FLANIGANS ENTERPRISES INC (MNC)      1031       1
MOVIE STAR INC N Y (MNC)              910       2
CONVERA CORP                          321       3
CANTERBURY PARK HOLDING CORP          192       4
STEPHAN CO                            191       5
SUNLINK HEALTH SYSTEMS INC            176       6
WINLAND ELECTRONICS INC               129       7
CORE MOLDING TECHNOLOGIES INC          66       8
HEALTHSTREAM INC                       59       9
SENECA FOODS CORP NEW                  54      10


Class II; 1994-1998 (counts: 8440)

Company name                                 Rank in the universe
F M C CORP                            475     903
ON ASSIGNMENT INC                     470     222
COORS ADOLPH CO                       467     103
COMMERCE BANCSHARES INC               459     570
CLEAN HARBORS INC                     458      34
AVON PRODUCTS INC                     458     738
TYSON FOODS INC                       458     381
NATIONAL HEALTH LABORATORIES INC      456     229
VARIAN ASSOCIATES INC                 456     941
ALLTRISTA CORP                        456     350

Company name                                 Rank in the universe
COLONIAL COMMERCIAL CORP              442       1 (MNC)
HELMSTAR GROUP INC                    342       2 (MNC)
BIO REFERENCE LABORATORIES INC        316       5
LUXTEC CORP                           315       3 (MNC)
POWELL INDUSTRIES INC                 186       6
CEDAR INCOME FUND LTD                 169       4
DELTA AIR LINES INC                   100    1014
MICRON TECHNOLOGY INC                 100     408
QUIKSILVER INC                        100     881
BERKLEY W R CORP                       99     373


Class II; 1999-2003 (counts: 8117)

Company name                                 Rank in the universe
SUPERIOR INDUSTRIES INTL INC          404    1086
MERCURY COMPUTER SYSTEMS              400     185
BIO RAD LABORATORIES INC              399     803
P E C O ENERGY CO                     398      85
SILGAN HOLDINGS INC                   398     179
S V I HOLDINGS INC                    397      35
SCANSOURCE INC                        396    1062
CABLEVISION SYSTEMS CORP              395    1096
PARKWAY PROPERTIES INC                394     726
INTERFACE INC                         393     921

Company name                                 Rank in the universe
NEWMONT MINING CORP                   445       2 (MNC)
ROYAL GOLD INC                        422       1 (MNC)
MORGANS FOODS INC                     358       3 (MNC)
DARLING INTERNATIONAL INC             304       4 (MNC)
MILESTONE SCIENTIFIC INC              259       5 (MNC)
FORT DEARBORN INCOME SECS INC         160       6
AMERICAN SCIENCE & ENGR INC           148      10
DELPHI INFORMATION SYSTEMS INC        116       8
SENECA FOODS CORP NEW                 112       7
INTEVAC INC                           111       9


Class II; 2004-2008 (counts: 3368)

Company name                                 Rank in the universe
CANTERBURY PARK HOLDING CORP          371       4
CONVERA CORP                          355       3
WINLAND ELECTRONICS INC               340       7
HEALTHSTREAM INC                      337       9
MAXXAM INC                            328      11
STEPHAN CO                            324       5
SUNLINK HEALTH SYSTEMS INC            322       6
CORE MOLDING TECHNOLOGIES INC         319       8
SENECA FOODS CORP NEW                 316      10
CORTEX PHARMACEUTICALS INC            315      13

Company name                                 Rank in the universe
MOVIE STAR INC N Y                    417       2 (MNC)
FLANIGANS ENTERPRISES INC             383       1 (MNC)
MOLINA HEALTHCARE INC                  50      49
AUTONATION INC DEL                     46     853
P C TEL INC                            45    1012
CAL MAINE FOODS INC                    44     299
BLOCK H & R INC                        44     367
REX STORES CORP                        44      68
JARDEN CORP                            44     894
CLARCOR INC                            43     678
