DEPARTMENT OF ECONOMICS AND FINANCE COLLEGE OF BUSINESS AND ECONOMICS UNIVERSITY OF CANTERBURY CHRISTCHURCH, NEW ZEALAND An Application of Correlation Clustering to Portfolio Diversification Hannah Cheng Juan Zhan 1 , William Rea 2 , Alethea Rea 3 WORKING PAPER No.11/2014 ISSN 1179-3228 Department of Economics and Finance College of Business and Economics University of Canterbury Private Bag 4800, Christchurch New Zealand

An Application of Correlation Clustering to Portfolio Diversification



5/May/2014

Abstract: This paper presents a novel application of software developed for constructing a phylogenetic network to the correlation matrix for 126 stocks listed on the Shanghai A Stock Market. We show that by visualizing the correlation matrix using a Neighbor-Net network and using the circular ordering produced during the construction of the network we can reduce the risk of a diversified portfolio compared with random or industry group based selection methods in times of market increase.

Keywords: Visualization, Neighbour-Nets, Correlation Matrix, Diversification

JEL Classifications: G11

Acknowledgements:

1. Department of Economics and Finance, University of Canterbury, Christchurch, New Zealand

2. Department of Economics and Finance, University of Canterbury, Christchurch, New Zealand

3. Data Analysis Australia, Perth

*Corresponding Author: [email protected],


1 Introduction

Portfolio diversification is critical for risk management because it aims to reduce the variance in returns compared with a portfolio of a single stock or a similarly undiversified portfolio. The academic literature on diversification is vast, stretching back at least as far as Lowenfeld (1909). The modern science of diversification is usually traced to Markowitz (1952), which is expanded upon in great detail in Markowitz (1991).

The literature covers a wide range of approaches to portfolio diversification, such as: the number of stocks required to form a well diversified portfolio, which has increased from eight stocks in the late 1960s (Evans and Archer, 1968) to over 100 stocks in the late 2000s (Domian et al., 2007); what types of risks should be considered (Cont, 2001; Goyal and Santa-Clara, 2003; Bali et al., 2005); factors intrinsic to each stock (Fama and French, 1992; French and Fama, 1993); the age of the investor (Benzoni et al., 2007); and whether international diversification is beneficial (Jorion, 1985; Bai and Green, 2010), among others.

Despite the recommendation of authorities like Domian et al. (2007), Barber and Odean (2008) reported that in a large sample of American private investors the average portfolio size of individual stocks was only 4.3. While comparable data do not appear to be available for private Chinese investors, it seems unlikely that they hold substantially larger portfolios.

The mean returns and variances of the individual contributing stocks are insufficient for making an informed decision on selecting a suite of stocks, because selecting a portfolio requires an understanding of the correlations between each of the stocks available for consideration for inclusion in the portfolio. The number of correlations between stocks rises in proportion to the square of the number of stocks, meaning that for all but the smallest of stock markets the very large number of correlations is beyond the human ability to comprehend them. Rea and Rea (2014) presented a method to visualise the correlation matrix, yielding insights into the relationships between the stocks.

Traditional investing wisdom has suggested that investors should select investment opportunities from a range of industries because returns within an industry would be more highly correlated than those between industries. While that may hold true, there are some instances (such as companies with operations in several industries) in which a stock exchange industry classification alone is insufficient. Furthermore, with some authors (including Domian et al. (2007)) recommending over 100 investments, the number of investments may exceed the number of industries, meaning there is a need to select a diverse range of stocks even within industries.

Another key aspect of stock correlation is the potential change in the correlations with a significant change in market conditions (say comparing times of general market increase with recession and post-recession periods).

In this paper we explore investment opportunities in China using data from the Shanghai Stock Exchange. We compare the correlation structure reported in four periods (a period of market calm 2005/2006, a boom period of 2006/2007, market decline (2008), and a post crash period 2009/2010).

Our primary motivation is to investigate four portfolio selection strategies. The four strategies are:

1. picking stocks at random,

2. forming portfolios by picking stocks from different industry groups,

3. forming portfolios by picking stocks from different correlation clusters,

4. forming portfolios by picking stocks from industry groups within correlation clusters.

Our results show that knowledge of correlation clusters can reduce the portfolio risk.

The outline of this paper is as follows: Section (2) discusses the data, Section (3) the methods used in this paper, Section (4) discusses identifying the correlation clusters, Section (5) discusses the movement of stocks in the Neighbor-Net splits graphs between study periods, Section (6) applies the results of the previous two sections to the problem of forming a diversified portfolio of stocks, and Section (7) contains the discussion and our conclusions.

2 Data

The data used in this study was downloaded from Datastream. We obtained daily closing prices and dividend data for 126 stocks from the Shanghai A Index. The data listed the stock name, a six digit identification number, and assigned the stock to one of five industry groups. These groups were (1) Energy (12 stocks), (2) Finance (17 stocks), (3) Health Care (18 stocks), (4) Industrial (33 stocks), and (5) Materials (36 stocks). To make the identification of the stocks and their exchange-assigned industry groups simpler we generated four letter stock codes and to each code appended a single letter indicating its industry group. A list of these can be found in Table (5) in Appendix A. To estimate stock return correlations we calculated weekly returns from the daily price and dividend data. To obtain the period returns we calculated the total return for each period and treated the dividends as being reinvested into the stock that issued them.
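The return calculation described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and we assume that a dividend is reinvested at the closing price on the day it is paid and that a week is five trading days, neither of which is stated in the text.

```python
def total_return_index(prices, dividends):
    """Value of a holding that starts with one share, dividends reinvested.

    Assumption: a dividend paid on a given day buys extra shares at that
    day's closing price.
    """
    shares = 1.0
    index = []
    for price, dividend in zip(prices, dividends):
        # Cash received on the current holding is converted into more shares.
        shares += shares * dividend / price
        index.append(shares * price)
    return index


def block_returns(index, step):
    """Simple returns over non-overlapping blocks of `step` observations."""
    sampled = index[::step]
    return [later / earlier - 1.0 for earlier, later in zip(sampled, sampled[1:])]


# Hypothetical data: flat price, one dividend of 1.0 on the second day.
idx = total_return_index([10.0, 10.0], [0.0, 1.0])
one_day_return = block_returns(idx, 1)[0]
```

Calling `block_returns(index, 5)` on a daily index would give weekly returns under the five-day assumption, and the ratio of the last to the first index value gives the total return for a study period.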

A graph of the index and the boundaries of our study periods can be found in Figure (1). We defined the study periods so that they represented market conditions that were as different as we could make them, though it could be argued that our study periods one and four are similar.

Study period one was 13 May 2005 until 13 June 2006 and was a period in which the market underwent a slow rise. Study period two was 13 June 2006 until 16 October 2007 and is considered a boom or market bubble period. Study period three was 16 October 2007 until 28 October 2008, representing a sharp decline or crash. The final study period, from 29 October 2008 until 19 October 2010, was a time of initial market recovery and then largely flat returns.

With four study periods, for the portfolio selection methods which require a model building, or estimation, period we can form models in periods one through three and use periods two through four for out-of-sample testing. Such extremely different market conditions represent a very severe test of portfolio diversification strategies, especially forming portfolios based on period two and testing them against period three data.

3

Page 6: An Application of Correlation Clustering to Portfolio Diversification

2005 2006 2007 2008 2009 2010 2011

010

0020

0030

0040

0050

0060

00

Shanghai A Index

Date

Inde

x V

alue

1 1 2 2 3 3 4 4

Figure 1: A plot of the Shanghai Stock Exchange A Index with the boundariesof the four study periods marked. The dates are 13-May-2005, 13-June-2006,16-Oct-2007, 28-Oct-2008, and 19-Oct-2010 respectively.

3 Methods

3.1 Neighbor-Net Splits Graphs

A typical stock market correlation matrix for n stocks is of full rank, which means that it can only be represented fully in an (n − 1)-dimensional space. Some basic statistics on the correlations are presented in Table (1). In visualization, the high dimensional data space is collapsed to a much lower dimensional space so that the data can be represented on a 2-dimensional surface such as a page or computer screen.

We need to convert the numerical values in the correlation matrix to a measure which can be construed to be a distance. In the literature the most common way to do the conversion is by using the so-called ultra-metric,

$$d_{ij} = \sqrt{2(1 - \rho_{ij})} \qquad (1)$$

where $d_{ij}$ is the estimated distance and $\rho_{ij}$ is the estimated correlation between stocks i and j, see Mantegna (1999) for details.

Using the conversion in Equation (1) we formatted the converted correlation matrix and augmented it with the appropriate stock codes for reading into the Neighbor-Net software, SplitsTree, available from http://www.splitstree.org. Using the SplitsTree software we generated the Neighbor-Nets splits graphs. Because the splits graphs are intended to be used for visualization we defer the discussion of the identification of correlation clusters and their uses to Sections (4) and (5) below.
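As an illustration of the conversion in Equation (1), the following minimal sketch applies it to every entry of a correlation matrix held as a list of lists. The function name is hypothetical, and the step of writing the result to a file that SplitsTree can read is not shown.

```python
import math


def correlation_to_distance(corr):
    """Apply d_ij = sqrt(2 * (1 - rho_ij)) to each entry of a square matrix."""
    n = len(corr)
    return [
        # max(0.0, ...) guards against a tiny negative argument caused by
        # rounding when an estimated correlation is marginally above 1.
        [math.sqrt(max(0.0, 2.0 * (1.0 - corr[i][j]))) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3-stock correlation matrix.
example = [
    [1.0, 0.5, -1.0],
    [0.5, 1.0, 0.0],
    [-1.0, 0.0, 1.0],
]
distances = correlation_to_distance(example)
```

Under this conversion perfectly correlated stocks are at distance 0, uncorrelated stocks at distance √2, and perfectly negatively correlated stocks at distance 2, so the distance always lies between 0 and 2.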

3.2 Simulated Portfolios

Recently Lee (2011) discussed so-called risk-based asset allocation. In contrast to strategies which require both expected risk and expected returns for each investment opportunity as inputs to the portfolio selection process, risk-based allocation considers only expected risk. The four methods of portfolio selection we present below can be considered to be risk-based allocation methods. This probably reflects private investor behaviour in that often they have nothing more than broker buy, hold, or sell recommendations to assess likely returns.

The four portfolio methods were compared using simulations. For each of 1,000 iterations a portfolio was sampled based on the rules governing the portfolio type. We recorded the mean and standard deviation of the returns for the 1,000 portfolios.
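The simulation loop can be sketched for the simplest case of uniformly random selection. The sketch assumes equally weighted portfolios, so that a portfolio's return is the average of its stocks' period returns; the weighting scheme is an assumption made here for illustration, and the function name, stock codes and returns are hypothetical.

```python
import random
import statistics


def simulate_random_portfolios(stock_returns, size, iterations=1000, seed=0):
    """Mean and standard deviation of returns over randomly drawn portfolios.

    stock_returns maps a stock code to its return over the test period.
    Assumption: every portfolio is equally weighted.
    """
    rng = random.Random(seed)
    codes = sorted(stock_returns)
    results = []
    for _ in range(iterations):
        # Uniform draw without replacement: no stock appears twice.
        picks = rng.sample(codes, size)
        results.append(sum(stock_returns[c] for c in picks) / size)
    return statistics.mean(results), statistics.stdev(results)


# Hypothetical period returns for six stocks.
returns = {"AAAA": 0.10, "BBBB": 0.25, "CCCC": -0.05,
           "DDDD": 0.40, "EEEE": 0.15, "FFFF": 0.05}
mean_return, sd_return = simulate_random_portfolios(returns, size=2)
```

The other three selection methods differ only in how `picks` is drawn, so the same loop can be reused with a different sampling rule.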

As mentioned in the introduction the primary motivation is to investigate four portfolio strategies. These are:

1. Selecting stocks at random;

2. Selecting stocks based on industry groupings;

3. Selecting stocks based on correlation clusters; and

4. Selecting stocks based on industry groups within correlation clusters.

We describe each of these in turn.


Random Selection: The stocks were selected at random using a uniform distribution without replacement. In other words each stock was given an equal chance of being selected, but with no stock being selected twice within a single portfolio.

By Industry Groups: There were five industry groups. If the portfolio size was five or less, the industries were chosen at random using a uniform distribution without replacement. From each of the selected industry groups one stock was selected. If the desired portfolio size was more than five then each group had at least s stocks selected, where s is the quotient of the portfolio size divided by five. Some industry groups (as many as the remainder of the portfolio size divided by five) had s + 1 stocks selected, and the industry groups this applied to were chosen using a uniform distribution without replacement. Within each industry group stocks were selected using a uniform distribution, again without replacement.

By Correlation Clusters: The correlation clusters were determined by examining the Neighbor-Net network for the relevant periods (periods one, two and three). Each stock was assigned to exactly one cluster and each cluster can be defined by a single split (or bipartition) of the circular ordering of the Neighbor-Net of the relevant period. The clusters determined in periods one, two and three were used to generate the portfolios for out-of-sample testing in periods two, three and four respectively. Because the goal of portfolio building is to reduce risk each cluster was paired with another cluster which was considered most distant from it. This method is discussed in detail below.

As with the industry groups, if the desired portfolio size was no larger than the number of clusters, cluster pairs were selected at random and a stock selected from within each correlation cluster pair. If the desired portfolio size was larger than the number of correlation clusters then we applied the method described above for the industry groups.

As indicated above each cluster was paired with the one most distant from it. Because we identified an even number of clusters in period one, cluster one was paired with cluster five, two with six and so on. In periods with an odd number of clusters the pairing may not be so straightforward. For example, in period two (see Figure 15) we identified five clusters and cluster one was paired with four, both clusters two and three were paired with five, four was paired with one and five with two.

By Industry Group within Correlation Clusters: The final method was selecting stocks from industry groups within correlation clusters. Each stock within each cluster has an associated industry group. Therefore each correlation cluster can be subdivided into up to five sub-clusters based on industry.

As indicated above each cluster was paired with the one most distant from it. Once a cluster was selected for inclusion, so was the paired cluster; however, this time we did not allow any of the paired stocks to be from the same industry. This was the method used for determining the set of stocks for the fourth portfolio strategy.
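Two of the rules described above lend themselves to a short sketch: the quotient and remainder allocation of stocks across groups, and the pairing of each cluster with the one opposite it when the number of clusters is even. The function names and the stock codes are hypothetical. The pairing for an odd number of clusters was chosen by inspection of the splits graph, so the sketch deliberately does not attempt it.

```python
import random


def stocks_per_group(portfolio_size, n_groups, rng):
    """How many stocks to draw from each group (list of length n_groups)."""
    base, extra = divmod(portfolio_size, n_groups)
    counts = [base] * n_groups
    # `extra` groups, chosen uniformly without replacement, get one more stock.
    # When portfolio_size < n_groups this picks which groups are used at all.
    for g in rng.sample(range(n_groups), extra):
        counts[g] += 1
    return counts


def sample_by_group(groups, portfolio_size, rng):
    """Draw a portfolio spread across groups; groups maps name -> stock codes."""
    names = sorted(groups)
    counts = stocks_per_group(portfolio_size, len(names), rng)
    picks = []
    for name, k in zip(names, counts):
        picks.extend(rng.sample(groups[name], k))  # without replacement
    return picks


def opposite_cluster(cluster, n_clusters):
    """Partner of a 1-based cluster number when n_clusters is even."""
    if n_clusters % 2:
        raise ValueError("odd cluster counts were paired by inspection")
    return (cluster - 1 + n_clusters // 2) % n_clusters + 1


rng = random.Random(1)
print(stocks_per_group(16, 5, rng))   # four groups with 3 stocks, one with 4
print(opposite_cluster(1, 8), opposite_cluster(2, 8))  # 5 6
```

A single `divmod` covers both cases in the text: for a portfolio of five or fewer the quotient is zero (or one with no remainder), so each chosen group contributes exactly one stock.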

4 Identifying Correlation Clusters

As Bryant and Moulton (2004) point out “the splits graphs generated by Neighbor-Net are always planar, an important advantage over other network methods when it comes to visualization” (emphasis original). Thus one method of identifying a group of stocks clustered by correlation is to examine the splits graph for the stocks (see, for example, Figure 2) and look for natural breaks in the structure of the network. Because this is a visual approach the results are subjective and result from the researcher or financial analyst balancing whatever competing requirements they may have. It is possible for two researchers to interpret the splits graph in different ways and identify different clusters. Here we know that in the simulations to follow the sizes of the portfolios we will generate will be two, four, eight or 16 stocks. Consequently we do not need large numbers of clusters, and we would like each cluster to contain enough stocks that, when selecting stocks at random from within it, there are enough combinations available to make the simulations meaningful. These requirements are, as indicated above, subjective. For period one we chose eight clusters, which was the maximum number of clusters in any period. The smallest cluster had nine stocks, giving $\binom{9}{2} = 36$ distinct ways of choosing two stocks from this cluster in the 16 stock portfolio simulation.

Figure (2) shows the clusters we identified for period one. The stocks in each cluster are listed in Appendix B, Section (B.1). Cluster one is at the bottom in black and the clusters are sequentially numbered moving counter-clockwise around the splits graph. Cluster one can be recognised by the small, but clear, gaps in the network structure between it and clusters two and eight. Similar small gaps can be seen between the other clusters.

This grouping of eight clusters is not the only division of the stocks into clusters which could have been made. If the researcher or financial analyst had other requirements some of the clusters could be further subdivided or combined. For example if small clusters were acceptable then Cluster 2 could be further split into two clusters, as could Cluster 8. In both cases there is a clear gap in the network structure where the split could be made. Conversely, if the number of clusters desired was reduced then there are some reasonably clear combinations which could be made. For example, if only two clusters were required, then, perhaps, Clusters 1, 2, 7, and 8 could be combined to form one cluster while Clusters 3, 4, 5, and 6 would form the other.
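Because each cluster is a contiguous arc of the circular ordering, a clustering is fully described by the positions at which the ordering is cut. The following minimal sketch turns a circular ordering and a set of cut positions into clusters. The function name, the ordering and the cut positions are hypothetical; in this paper the cuts were chosen by eye from the splits graph.

```python
def clusters_from_cuts(ordering, cuts):
    """Split a circular ordering into contiguous clusters.

    ordering: stock codes in the circular order produced by Neighbor-Net.
    cuts: sorted positions in `ordering` at which a new cluster starts.
    The last cluster wraps around the end of the list back to the first cut.
    """
    n = len(ordering)
    ends = cuts[1:] + [cuts[0] + n]
    return [
        [ordering[i % n] for i in range(start, end)]
        for start, end in zip(cuts, ends)
    ]


# Hypothetical ordering of six stocks, cut before positions 1 and 4.
ordering = ["AAAA", "BBBB", "CCCC", "DDDD", "EEEE", "FFFF"]
clusters = clusters_from_cuts(ordering, [1, 4])
# clusters == [["BBBB", "CCCC", "DDDD"], ["EEEE", "FFFF", "AAAA"]]
```

Merging two adjacent clusters corresponds to removing one cut, and subdividing a cluster corresponds to adding one, which mirrors the alternative groupings discussed above.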

5 Movements of Stocks in the Splits Graphs between Periods

In Figures (3) through (8) we show the movement of industry groups both within a cluster and generally between study periods. We compare this with the movements of the materials industry group in the splits graph.

In Figure (3) we have selected Cluster 1 in study period 1 and assigned a colour to each industry group within the cluster. While all five industry groups are represented in the cluster it is clear that the materials group of stocks represents the largest such group within this correlation cluster. Figures (4) through (6) show the locations in the splits graph of the stocks from Cluster 1 of Period 1 in Periods 2 through 4. As can be seen the stocks in this initial cluster do not remain clustered together in subsequent periods.

However, the materials group has remained together as a block not only in study period two but also in study periods three and four. During period two (Figure 4) the materials group from Cluster 1 is now in what we identified as Cluster 3. In study period three (Figure 5) they have split into two groups and are in what we identified as Clusters 1 and 6, which are adjacent clusters in that study period. Finally in study period four they are in what we identified as Clusters 1 and 2; again, these are adjacent clusters in that study period.

In diversification one seeks groups of stocks which will tend to move together in the future but relatively independently of other so-identified groups of stocks. An investor then spreads their investments across these groups. This is the basis for previous studies which have grouped stocks by industry, assuming that stocks in the same industry will tend to have price movements more similar than stocks in different industries, see Section (5.1) below. Thus the evidence presented here is that the stocks from the materials group within Cluster 1 of Period 1 form a financially useful grouping when considering forming a diversified portfolio.

Because of this we would not expect portfolios selected from stocks within correlation clusters alone to be significantly less risky than those chosen from industry groups. However, considering both a stock’s industry group and its correlation cluster has potential to result in greater risk reduction than either method on its own.

5.1 Clustering by Industry Group

In previous studies of forming diversified stock portfolios a number of authors have included at least one method in which they divided the stocks into industry groups and then selected portfolios by spreading the investments across the groups, see Domian et al. (2007) for example. Neighbor-Nets splits graphs give us a direct method of assessing the likely success of such a strategy. To illustrate this we have selected the energy and materials groups because they had the smallest and largest number of stocks, 12 and 36 respectively. Figures (7) through (10) show the locations of the materials stocks, while Figures (11) through (14) show the locations of the energy stocks.

Clustering of the materials stocks is clearly visible in each of the four study periods. This gives a direct visual confirmation of previous studies which have reported that selecting stocks by spreading them across industry groups gives a greater reduction in portfolio risk than randomly selecting stocks. In the smaller energy sector stocks this clustering is even clearer.

6 Example

This example uses 126 stocks from the Shanghai exchange, for which we calculated the weekly returns from price and dividend data, and we divided the data into four periods based on market behaviour as discussed in Section (2) above. Some basic statistics on the correlations are presented in Table (1). As can be seen the highest average correlation occurred in period 3, a time of a sharp market decline or crash.

For all the periods, as the portfolio size was increased the standard deviation of the returns decreased across all four portfolio selection methods. Early empirical studies of portfolio diversification focused on the number of stocks in a portfolio, see Evans and Archer (1968). A larger portfolio was reported to be less risky, with the lower risk being a result of the lower level of variation in the returns. However, the benefit of reduced risk rapidly diminished with increasing portfolio size.

An ANOVA test was used to compare the means. Because the variances were within a small range the ANOVA test remains valid even though the Levene test detects statistically significant differences. The Levene test was applied using the lawstat package in R (Gastwirth et al., 2013).
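To make the relationship between the two tests concrete, the following minimal sketch computes the one-way ANOVA F statistic and then obtains the Levene statistic as the same F statistic applied to absolute deviations from each group's centre. It is an illustration and not the lawstat implementation: the function names are hypothetical, the use of the group median as the centre is an assumption, and the p-values (which require the F distribution) are not computed.

```python
import statistics


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of samples (lists of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.fmean(g) for g in groups]
    # Variation of the group means about the grand mean.
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations about their own group mean.
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (between / (k - 1)) / (within / (n_total - k))


def levene_w(groups):
    """Levene statistic: ANOVA F of absolute deviations from the group median."""
    deviations = [[abs(x - statistics.median(g)) for x in g] for g in groups]
    return one_way_f(deviations)


# Hypothetical returns for two selection methods with equal means
# but different spread.
method_a = [1.0, 2.0, 3.0]
method_b = [0.0, 2.0, 4.0]
f_stat = one_way_f([method_a, method_b])    # 0.0: the means are equal
w_stat = levene_w([method_a, method_b])     # 0.8: the spreads differ
```

In the tables that follow, the first p-value on each line corresponds to the test on the means and the bracketed p-value to the test on the spread.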

Period   Mean    Std. Dev.   Min      Max     Negative
1        0.266   0.170       -0.642   0.864   438/7875
2        0.328   0.196       -0.413   0.855   480/7875
3        0.441   0.191       -0.168   0.908   132/7875
4        0.437   0.192       -0.158   0.906   143/7875

Table 1: Basic statistics on the correlations. There are n(n − 1)/2 = (126 × 125)/2 = 7875 correlations between the 126 stocks. The final column gives the count of the number of correlations which were estimated to be negative. The highest proportion of negative correlations occurred in period 2 when approximately 6% of estimated correlations were negative.
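The counts in the caption can be checked with a few lines of arithmetic. The function name is hypothetical; the figures 126 and 480 are taken from the text and Table (1).

```python
def correlation_count(n_stocks):
    """Number of distinct pairwise correlations among n_stocks."""
    return n_stocks * (n_stocks - 1) // 2


pairs = correlation_count(126)
negative_share_period_2 = 480 / pairs
print(pairs, round(100 * negative_share_period_2, 1))  # prints: 7875 6.1
```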

Number of Stocks   Random      Industry   Correlation   Industry and           ANOVA (Levene)
in Portfolios      Selection   Grouping   Clusters      Correlation Clusters   Test p-value
2                  464         449        467           457                    0.0783
                   (234)       (227)      (220)         (2.8)                  (0.281)
4                  468         459        463           4.71                   0.248
                   (169)       (161)      (154)         (158)                  (0.041)
8                  466         459        454           4.64                   0.484
                   (119)       (115)      (102)         (105)                  (<0.001)
16                 466         462        463           466                    0.023
                   (78)        (78)       (68)          (50)                   (<0.001)

Table 2: Returns in percent under the four different portfolio selection methods for period two using period one data for the estimation of the correlations. Underneath each set of returns, in brackets, is the standard deviation of the returns. The final column reports the p-value of the ANOVA analysis which tests for differences in the means or the Levene test which tests whether the standard deviations of all four methods are equal, as appropriate for each line.

Period two was a period of general market increase and the returns were good during this period. Table (2) presents the mean and standard deviations of returns together with some statistical testing of the results. The returns were statistically significantly different for portfolios of size 16 and weakly significant for portfolios of size 2. For the smallest portfolios the correlation cluster method performed best and for portfolios of size 4 and 16 the industry and correlation clusters method performed best.

For all the portfolios the variation in the returns decreased as the portfolio size increased. The Levene test showed that there were statistically significant differences in the standard deviations for portfolios of size 4, 8 and 16. For portfolios of size 4 and 8 the correlation cluster method produced the lowest variation in the returns. For portfolios of size 16 it was the industry and correlation cluster method that produced the lowest variation, by a substantial margin.

Number of Stocks   Random      Industry   Correlation   Industry and           ANOVA (Levene)
in Portfolios      Selection   Grouping   Clusters      Correlation Clusters   Test p-value
2                  -57         -54        -52           -53                    0.007
                   (25)        (27)       (29)          (27)                   (0.265)
4                  -58         -56        -53           -54                    0.001
                   (0.16)      (17)       (19)          (18)                   (0.001)
8                  -57         -55        -53           -54                    <0.001
                   (0.11)      (12)       (14)          (13)                   (0.004)
16                 -57         -54        -55           -54                    <0.001
                   (8)         (8)        (8)           (7)                    (<0.001)

Table 3: Returns in percent under the four different portfolio selection methods for period three using period two data for the estimation of the correlations. Underneath each set of returns, in brackets, is the standard deviation of the returns. The final column reports the p-value of the ANOVA analysis which tests for differences in the means or the Levene test which tests whether the standard deviations of all four methods are equal, as appropriate.

Table (3) presents the mean and standard deviations of returns together with some statistical testing of the results for period three. This was a period of general market decline. In these circumstances a widely used risk/return measure such as the Sharpe ratio is negative. In such circumstances a private investor would regard a portfolio which minimised the losses as being the most desirable. While we should not over-interpret the results, the correlation clusters have slightly better returns for portfolios of sizes 2, 4 and 8. The industry and correlation clusters and industry based groupings have slightly better returns for portfolios of size 16.

As with period two out of sample testing, the variation decreased as the portfolio size was increased, regardless of the method used to select the portfolio. The Levene test showed that there were statistically significant differences in the standard deviations for portfolios of size 4, 8, and 16. Typically the correlation cluster method showed the largest standard deviations and the random selection method the lowest standard deviations. For portfolios of size 16 the industry and correlation clusters method reported the smallest variation.

Number of Stocks   Random      Industry   Correlation   Industry and           ANOVA (Levene)
in Portfolios      Selection   Grouping   Clusters      Correlation Clusters   Test p-value
2                  2.2         211        241           237                    <0.001
                   (154)       (164)      (173)         (166)                  (0.227)
4                  229         200        235           233                    <0.001
                   (118)       (105)      (113)         (118)                  (<0.001)
8                  218         210        233           234                    <0.001
                   (75)        (74)       (82)          (84)                   (0.003)
16                 219         207        232           234                    <0.001
                   (53)        (50)       (53)          (41)                   (<0.001)

Table 4: Returns in percent under the four different portfolio selection methods for period four using period three data for the estimation of the correlations. Underneath each set of returns, in brackets, is the standard deviation of the returns. The final column reports the p-value of the ANOVA analysis which tests for differences in the means or the Levene test which tests whether the standard deviations of all four methods are equal, as appropriate.

Table (4) presents the mean and standard deviations of returns together with some statistical testing of the results for period four. This period showed modest returns. While, again, we should not over-interpret the results, the returns were lower for the random and industry grouping selection methods for all four portfolio sizes tested. The highest returns were for the correlation clusters portfolio selection method for the two smaller portfolios, and for portfolios of size 8 and 16 the industry and correlation clusters method reports slightly higher returns.

As with period two and three out of sample testing, the variation decreased as the portfolio size was increased, regardless of the method used to select the portfolio. The Levene test showed that there were statistically significant differences in the variances for the portfolios of sizes 4, 8 and 16. The industry based selection method offered the greatest reduction in the variation in the returns for portfolios of size 4 and 8. For the largest portfolio size (portfolios of size 16) the industry and correlation clusters had the lowest standard deviations (the same outcome as periods two and three).

Therefore this suggests that the correlation clusters (or industry and correlation clusters) are particularly effective in times of general market increase, with the benefit being either a reduction in the variation or an increase in the return.

This study shows that combining industry and correlation clusters is particularly effective at lowering the variation for the larger portfolios, with all three periods showing a much lower variation for portfolios of size 16, as well as reasonable returns. This is in line with general advice to investors to hold larger portfolios and to ensure the holdings are diversified.


7 Discussion

An earlier paper (Rea and Rea, 2014) introduced Neighbor-Net networks as a method for visualising correlations in stock markets. The method has the advantage of being able to represent many of the key features of the correlation matrix in a planar graphic. The paper noted that such a diagram could assist with creating diversified portfolios. This paper has highlighted the effectiveness of using correlation clusters to investigate diversified portfolios.

In this paper four risk budgeting methods of portfolio selection were compared: randomly selected portfolios, industry clusters, correlation clusters, and industry and correlation clusters. Traditionally selecting stocks by industry was considered an appropriate method to diversify a portfolio. While this may be the case in some markets and under some market conditions, this investigation demonstrated that industry based clusters were generally outperformed by portfolios selected at random; however, the portfolios selected using industry grouping may have lower variance in times of market increase compared with random selection.

Of the four, the most restrictive method of selecting portfolios was the industry and correlation cluster selection method. With the random selection method all possible combinations of n stocks from the 126 stocks are allowable, but for the industry and correlation cluster selection method there are many portfolios that are not admissible because they do not meet the rules of this portfolio selection method. The industry grouping and correlation cluster methods are also restrictive but less so than the industry and correlation clusters method.

The main concern was whether the rules of portfolio selection presented here offer significant benefits. Where a difference in mean return was detected, the correlation clusters method or the industry and correlation clusters method may outperform the other methods. This effect was most pronounced in the period four out-of-sample testing, where the returns for the correlation clusters and the industry and correlation clusters methods always exceeded those of random portfolio selection. Therefore knowledge of the circular ordering can be used to enhance portfolio returns.

The variation in the returns for portfolios of size 16 was always lowest when the method of portfolio selection was industry and correlation cluster selection. For the other portfolio sizes the variation within a method decreases as the portfolio size increases, but no one method consistently outperforms the others. This suggests portfolio size has a greater impact on the variation of the returns than the method used to select the portfolio.
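The size effect can be illustrated with a small simulation. The sketch below draws equal-weighted random portfolios of several sizes from 126 synthetic stock returns and reports the standard deviation of the portfolio returns across draws. The return distribution, the portfolio sizes other than 16, the number of draws, and all names are assumptions made for illustration; none of the numbers are the study's data or results.

```python
import random
import statistics


def portfolio_return_spread(returns, size, n_portfolios, rng):
    """Std. dev. of returns across randomly drawn equal-weighted portfolios."""
    means = []
    for _ in range(n_portfolios):
        picks = rng.sample(returns, size)      # draw `size` distinct stocks
        means.append(statistics.fmean(picks))  # equal-weighted portfolio return
    return statistics.stdev(means)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic period returns for 126 stocks; NOT the paper's data.
stock_returns = [rng.gauss(0.05, 0.30) for _ in range(126)]

# Portfolio sizes are illustrative; only size 16 is named in the text above.
spread = {
    n: portfolio_return_spread(stock_returns, n, 1000, rng)
    for n in (2, 4, 8, 16)
}
for n, s in spread.items():
    print(n, round(s, 4))
```

With purely random selection the spread falls as the portfolio grows, which is the baseline against which any selection method has to be judged.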

Rea and Rea (2014) discussed how stocks from opposite sides of the Neighbor-Net network did not necessarily create a portfolio with high returns, because some stocks may be giving negative returns while others on the opposite side of the network may be giving positive returns. The division of the data into four periods in the manner we did represents a particularly severe test of diversification, particularly since no account was taken of either historical or expected returns of the stocks. It is our expectation that investor knowledge and analysis alongside correlation cluster based portfolio selection has the potential to improve the return of the portfolio, as well as reduce the variance (or equivalently, the standard deviation). But this awaits further research.

We note that the correlation clusters were determined by eye in this analysis. This is a valid method of determining clusters; however, it is also subjective. It does offer the advantage of being able to assign each stock to exactly one cluster. Future work could focus on methods to automate the selection of the correlation clusters to see if this further enhances the portfolio performance.
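As one possible starting point for such automation, the sketch below applies average-linkage agglomerative clustering to a correlation matrix. This is a standard hierarchical method and not the Neighbor-Net algorithm used in this paper; the distance transformation d = sqrt(2(1 - rho)) follows Mantegna (1999) and is assumed here, and the tickers and correlations are hypothetical.

```python
import math


def correlation_distance(rho: float) -> float:
    """Assumed distance: 0 for rho = 1, 2 for rho = -1."""
    return math.sqrt(2.0 * (1.0 - rho))


def average_linkage(corr, labels, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Closeness is the mean pairwise distance between members.
    """
    clusters = [[i] for i in range(len(labels))]

    def dist(a, b):
        pairs = [(i, j) for i in a for j in b]
        return sum(correlation_distance(corr[i][j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (x < y).
        _, x, y = min(
            (dist(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(labels[i] for i in c) for c in clusters]


# Toy correlation matrix with hypothetical tickers: two banks and two miners.
labels = ["BANK_A", "BANK_B", "MINE_A", "MINE_B"]
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
groups = average_linkage(corr, labels, n_clusters=2)
print(groups)
```

Like clustering by eye, this assigns each stock to exactly one cluster, but the number of clusters must still be chosen by the analyst.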


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 2: SplitsTree network for 126 stocks from the Shanghai A Stock Exchange for period one using five trading day returns to estimate correlations and hence distances with the stocks in cluster one colour coded. The eight correlation clusters each have different colours. In the discussion the clusters are coded as follows: Cluster 1 – Black, Cluster 2 – Blue, Cluster 3 – Purple, Cluster 4 – Red, Cluster 5 – Khaki, Cluster 6 – Green, Cluster 7 – Aqua, Cluster 8 – Yellow.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 3: The SplitsTree network for the Shanghai A Stock Exchange for period one with the stocks in cluster one colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 4: SplitsTree network for study period two with the stocks from cluster one, period one coloured. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 5: SplitsTree network for study period three with the stocks in cluster one, period one coloured. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 6: SplitsTree network for study period four with the stocks in cluster one, period one coloured. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 7: SplitsTree network for study period one with the stocks in the materials sector coloured green.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 8: SplitsTree network for study period two with the stocks in the materials sector coloured green.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 9: SplitsTree network for study period three with the stocks in the materials sector coloured green.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 10: SplitsTree network for study period four with the stocks in the materials sector coloured green.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 11: SplitsTree network for study period one with the stocks in the energy sector coloured black.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 12: SplitsTree network for study period two with the stocks in the energy sector coloured black.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 13: SplitsTree network for study period three with the stocks in the energy sector coloured black.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 14: SplitsTree network for study period four with the stocks in the energy sector coloured black.


References

Bai, Y. and C. J. Green (2010). International Diversification Strategies: Revisited from the Risk Perspective. The Journal of Banking and Finance 34, 236–245.

Bali, T. G., N. Cakici, X. Yan, and Z. Zhang (2005). Does Idiosyncratic Risk Really Matter? The Journal of Finance 60 (2), 905–929.

Barber, B. M. and T. Odean (2008). All That Glitters: The Effect of Attention and News on the Buying Behaviour of Individual and Institutional Investors. The Review of Financial Studies 21 (2), 785–818.

Benzoni, L., P. Collin-Dufresne, and R. S. Goldstein (2007). Portfolio Choice over the Life-Cycle when the Stock and Labor Markets are Cointegrated. The Journal of Finance 62 (5), 2123–2167.

Bryant, D. and V. Moulton (2004). Neighbor-net: An agglomerative method for the construction of phylogenetic networks. Molecular Biology and Evolution 21 (2), 255–265.

Cont, R. (2001). Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance 1 (2), 223–236.

Domian, D. L., D. A. Louton, and M. D. Racine (2007). Diversification in Portfolios of Individual Stocks: 100 Stocks Are Not Enough. The Financial Review 42, 557–570.

Evans, J. L. and S. H. Archer (1968). Diversification and the Reduction of Dispersion: An Empirical Analysis. The Journal of Finance 23 (5), 761–767.

Fama, E. F. and K. R. French (1992). The Cross-Section of Expected Stock Returns. The Journal of Finance 47 (2), 427–465.

French, K. R. and E. F. Fama (1993). Common Risk Factors in the Returns on Stocks and Bonds. Journal of Financial Economics 33, 3–56.

Gastwirth, J. L., Y. R. Gel, W. L. W. Hui, V. Lyubchich, W. Miao, and K. Noguchi (2013). lawstat: An R package for biostatistics, public policy, and law. R package version 2.4.1.

Goyal, A. and P. Santa-Clara (2003). Idiosyncratic Risk Matters! The Journal of Finance 58 (3), 975–1007.

Jorion, P. (1985). International Portfolio Diversification with Estimation Risk. The Journal of Business 58 (3), 259–278.

Lee, W. (2011). Risk-Based Asset Allocation: A New Answer to an Old Question? Journal of Portfolio Management 37 (4), 11–28.

Lowenfeld, H. (1909). Investment, an Exact Science. Financial Review of Reviews.

Mantegna, R. N. (1999). Hierarchical structure in financial markets. The European Physical Journal B 11, 193–197.

Markowitz, H. M. (1991). Portfolio Selection: Efficient Diversification of Investments, 2nd Edition. Wiley.

Markowitz, H. (1952). Portfolio Selection. The Journal of Finance 7 (1), 77–91.

Rea, A. and W. Rea (2014). Visualization of a stock market correlation matrix. Physica A 400, 109–123.


A Stock Codes and Industry Segments

Table 5: Stock market codes, company names, and industrial sector of the stocks in the study.

Company Name                              Company Code  Industry Group
China Ptl. & Chm.                         CHPC-E        Energy
Guizhou Panjiang Coal                     GZPJ-E        Energy
Inner Mongolia Pingzhuang En. Rso.        IMPZ-E        Energy
Jizhong Energy Res.                       JZER-E        Energy
Liaoning Hjtg. Chems.                     HJTG-E        Energy
Offs. Oil Engr.                           OFFS-E        Energy
Shai Datun Energy Res.                    SHDT-E        Energy
Shanxi Lanhua Sci-Tech Venture            SXLH-E        Energy
Shanxi Xishan                             SXXS-E        Energy
Taiyuan Coal Gasification                 TYCG-E        Energy
Yangquan Coal                             YQCI-E        Energy
Yanzhou Coal Mining                       YZCM-E        Energy
Beijing Capital Dev.                      BJCT-F        Finance
Bej. Urban Con. Inv. Dev.                 BEJU-F        Finance
Changjiang Securities                     CJSC-F        Finance
China Baoan Gp.                           CHBA-F        Finance
China Merchants Bank                      CMBK-F        Finance
China Merchants Pr. Dev.                  CHMT-F        Finance
China Minsheng Banking                    CMSB-F        Finance
China Vanke                               CHVK-F        Finance
Citic Securities                          CTSC-F        Finance
Financial Str. Sldg.                      FCSH-F        Finance
Gemdale                                   GMDL-F        Finance
GF Securities                             GFSC-F        Finance
Guanghui Energy                           GHEG-F        Finance
Guoyuan Securities                        GYSC-F        Finance
Haitong Securities                        HTSC-F        Finance
Hong Yuan Secs.                           HYSC-F        Finance
Huaxia Bank                               HXBK-F        Finance
Northeast Securities                      NESC-F        Finance
Oceanwide Rlst. Group                     OWRG-F        Finance
Shai. Chengtou Hldg.                      SHCT-F        Finance
Shai. Pudong Dev. Bk.                     SHPD-F        Finance
Shai. Zhangjiang                          SHZJ-F        Finance
Shenzhen Dev. Bank                        SZBK-F        Finance
Sinolink Securities                       SNLS-F        Finance
Southwest Securities                      SWSC-F        Finance
Suning Universal                          SNUS-F        Finance
Xinhu Zhongbao                            XHZB-F        Finance
Beijing Sl Pharmaceutical                 BJSL-H        Health Care
Beijing Tongrentang                       BJTR-H        Health Care
China Res. Dble. Crane Pharm.             CRDP-H        Health Care
China Res. Sanjiu Med. & Pharm.           CRSJ-H        Health Care
Guangxi Wuzhou Zhongheng                  GXWZ-H        Health Care
Harbin Pharms. Gp.                        HRBP-H        Health Care
Hualan Biological Engr.                   HLBE-H        Health Care
Jiangsu Hengrui Medicine                  JSHM-H        Health Care
Jilin Aodong Pharm. Gp.                   JLAD-H        Health Care
Kangmei Pharm.                            KMPH-H        Health Care
North China Pharm.                        NCPH-H        Health Care
Shai. Fosun Pharm. Group                  SHFS-H        Health Care
Shan Dong Dong E-Jiao                     SDDE-H        Health Care
Tasly Pharmaceutical                      TSLP-H        Health Care
Yunnan Baiyao Gp.                         YNBY-H        Health Care
Zhejiang Hisun Pharm.                     ZJHS-H        Health Care
Zhejiang Medicine                         ZJMC-H        Health Care
Zhejiang Nhu                              ZJNH-H        Health Care
Baoding Tianwei Baobian Elec.             BDTW-I        Industrial
China Avic Avionics Equ.                  CAAE-I        Industrial
China Cssc Hdg.                           CSSC-I        Industrial
China Eastern Airl.                       CHEA-I        Industrial
China Gezhouba Group                      CGZB-I        Industrial
China Intl.Mar.Ctrs.                      CIMC-I        Industrial
China Railway Erju                        CHRW-I        Industrial
China Railway Tielong Container Logistic  CATL-I        Industrial
China Southern Airlines                   CSAL-I        Industrial
China Spacesat                            CHAA-I        Industrial
Dongfang Electric                         DFET-I        Industrial
Guangxi Liugong Mch                       GXLG-I        Industrial
Hainan Airlines                           HNAL-I        Industrial
Jiangsu Zhongnan Con.                     JSZN-I        Industrial
Jiangxi Hongdu Aviation                   JXHD-I        Industrial
Liaoning Chengda                          LNCD-I        Industrial
Luxin Venture Cap. Gp.                    LXVC-I        Industrial
Minmetals Dev.                            MMTL-I        Industrial
Nari Tech. Dev.                           NRTE-I        Industrial
Sany Heavy Industry                       SHIT-I        Industrial
Shai. Shenhua Heavy Ind.                  SHZH-I        Industrial
Shanghai Con. Group                       SHCG-I        Industrial
Shanghai Intl. Arpt.                      SHIA-I        Industrial
Shantui Con. Mch.                         STCM-I        Industrial
Shanxi Coal Intl.                         SXCI-I        Industrial
Sinochem Intl.                            SNCH-I        Industrial
Taiyuan Hvy. Ind.                         TYHI-I        Industrial
Tbea                                      TBEA-I        Industrial
Xcmg Con. Machinery                       XCMG-I        Industrial
Xi’an Aero-Engine                         XAAE-I        Industrial
Xi’an Air.Intl.                           XAAI-I        Industrial
Xiamen C & D                              XCMD-I        Industrial
Zoomlion Hdy. Sctc.                       ZLHS-I        Industrial
Advd. Tech. & Mats.                       ADTM-M        Materials
Angang Steel                              AGST-M        Materials
Anhui Conch Cmt.                          AHCC-M        Materials
Baoji Titanium Ind.                       BJTN-M        Materials
Baoshan Iron & Stl.                       BSIS-M        Materials
China Nonferrous Mtl.                     CCFM-M        Materials
Csg Holding                               CSGH-M        Materials
Fangda Cbn. New Mra.                      FCNM-M        Materials
Gan Jiu Stl. Gp. Hongxing                 GJHX-M        Materials
Qinghai Salt Lake Ind.                    QHSL-M        Materials
Industrial Sichuan Hongda                 SCHD-M        Materials
Inmong. Baotou Stl. Rare Earth            BSRE-M        Materials
Hebei Iron & Steel                        HBIS-M        Materials
Henan Shenhuo Caa. & Pwr.                 HNSH-M        Materials
Henan Zhongfu Indl.                       HNZF-M        Materials
Hengyi Petrochemical                      HYPC-M        Materials
Hubei Yihua Chm. Ind.                     HBYH-M        Materials
Inner Mongolia Baotou Steel Union         IMBT-M        Materials
Jiangxi Cpr.                              JCPR-M        Materials
Jilin Yatai Group                         JLYT-M        Materials
Pangang Gp. Stl. Vmtm.                    PGGS-M        Materials
Rising Nonfr. Mtls                        RSNM-M        Materials
Shandong Nanshan Almn.                    SDNS-M        Materials
Shanxi Taigang Stl.                       SXTG-M        Materials
Shn. Zhongjin Lingnan Nonfemet            SZLN-M        Materials
Tangshan Jidong Cmt.                      TSJD-M        Materials
Tongling Nonfr. Mtls. Gp.                 TLNM-M        Materials
Xiamen Tungsten                           XMTS-M        Materials
Xinxing Ductile Iron                      XXDI-M        Materials
Yantai Wanhua Polyuretha                  YTWH-M        Materials
Yunnan Alum.                              YNAL-M        Materials
Yunnan Copper                             YNCP-M        Materials
Yunnan Tin                                YTIN-M        Materials
Yunnan Yuntianhua                         YYTH-M        Materials
Wuhan Iron and Steel                      WHIS-M        Materials
Zhejiang Juhua                            ZJJH-M        Materials


B Stocks in Each Cluster

B.1 Period 1

Cluster 1: YZCM_E, SXCI_I, OFFS_E, SCHD_M, JCPR_M, SHZH_I, YNCP_M, JXHD_I, CCFM_M, BJTN_M, GXWZ_H, SNCH_I, XMTS_M, YTIN_M, HNZF_M, JLAD_H, TSLP_H, LNCD_I, NRTE_I, CTSC_F, SZLN_M, SHDT_E, SDNS_M

Cluster 2: XAAE_I, CSSC_I, TYCG_E, CSGH_M, ADTM_M, SDDE_H, CMSB_F, BDTW_I, CHRW_I, BJTR_H, TLNM_M, ZJJH_M

Cluster 3: BEJU_F, KMPH_H, ZJMC_H, HJTG_E, SHZJ_F, XAAI_I, TYHI_I, CATL_I, BJSL_H, CRSJ_H, DFET_I, HLBE_H, XHZB_F, HBYH_M, HRBP_H, GFSC_F

Cluster 4: CMBK_F, CJSC_F, OWRG_F, HXBK_F, CHMT_F, FCSH_F, SZBK_F, GMDL_F, CHPC_E, SHPD_F, CHVK_F

Cluster 5: JLYT_M, CSAL_I, SNUS_F, NESC_F, AHCC_M, ZLHS_I, HNAL_I, XCMG_I, GYSC_F, SXTG_M, BJCT_F, ZJHS_H, JSHM_H, CRDP_H, CGZB_I, FCNM_M, SNLS_F, TSJD_M, YNBY_H, WHIS_M, SHFS_H, CHEA_I, CAAE_I, HYPC_M, CHBA_F, SWSC_F, HYSC_F, CIMC_I, AGST_M, RSNM_M, SHIA_I, NCPH_H, LXVC_I

Cluster 6: BSIS_M, HTSC_F, HBIS_M, SHCG_I, PGGS_M, IMBT_M, TBEA_I, QHSL_M, ZJNH_H, YTWH_M

Cluster 7: YYTH_M, XMCD_I, CHSS_I, JSZN_I, XXDI_M, GXLG_I, SHIT_I, SHCT_F, STCM_I, GJHX_M, IMPZ_E, MMTL_I

Cluster 8: YNAL_M, SXLH_E, BSRE_M, JZER_E, SXXS_E, YQCI_E, GHEG_F, HNSH_M, GZPJ_E

B.2 Period 2

Cluster 1: XXDI_M, SXCI_I, KMPH_H, ZJMC_H, JLYT_M, SHCT_F, IMPZ_E, ZJNH_H, HJTG_E, BDTW_I, GHEG_F, TYHI_I, SXLH_E, TYCG_E, HNAL_I, IMBT_M, TSLP_H, SXXS_E, NRTE_I, XCMG_I, ZJHS_H, HBYH_M, HRBP_H, HNSH_M, CRDP_H, JSZN_I, YZCM_E, SHCG_I, TSJD_M, CSGH_M, ADTM_M, PGGS_M, SHFS_H, CHEA_I, CAAE_I, HYPC_M, ZJJH_M, HBIS_M, GXWZ_H, CATL_I, SNCH_I, GZPJ_E, CRSJ_H, JZER_E, NCPH_H, LXVC_I, GJHX_M, SHDT_E

Cluster 2: CHSS_I, OWRG_F, BEJU_F, CHMT_F, FCSH_F, GMDL_F, XHZB_F, BJCT_F, CHBA_F, SHZJ_F, CHVK_F

Cluster 3: CGZB_I, FCNM_M, JCPR_M, SCHD_M, YNCP_M, SNUS_F, CHRW_I, TLNM_M, CCFM_M, BJTN_M, HTSC_F, XMCD_I, YNAL_M, YYTH_M, BJSL_H, XMTS_M, BSRE_M, YTIN_M, HNZF_M, MMTL_I, SZLN_M, SDNS_M

Cluster 4: CMBK_F, GXLG_I, SHZH_I, CSAL_I, TBEA_I, QHSL_M, BJTR_H, BSIS_M, AHCC_M, SHIT_I, ZLHS_I, STCM_I, LNCD_I, SXTG_M, YTWH_M, JSHM_H, HXBK_F, OFFS_E, SZBK_F, YNBY_H, WHIS_M, CMSB_F, SHPD_F, HYSC_F, CIMC_I, AGST_M, SHIA_I, JLAD_H, HLBE_H, CHPC_E, CTSC_F

Cluster 5: CJSC_F, SNLS_F, SDDE_H, JXHD_I, SWSC_F, XAAI_I, NESC_F, XAAE_I, CSSC_I, RSNM_M, DFET_I, GYSC_F, GFSC_F

B.3 Period 3

Cluster 1: FCNM_M, SNCH_I, XMTS_M, SHIA_I, BSRE_M, OFFS_E, MMTL_I, YTWH_M, CCFM_M, HBYH_M, BJTN_M

Cluster 2: JSHM_H, CRDP_H, KMPH_H, ADTM_M, ZJMC_H, SDDE_H, YNBY_H, SHFS_H, ZJNH_H, BJTR_H, GXWZ_H, BJSL_H, CRSJ_H, NCPH_H, TSLP_H, HLBE_H, ZJHS_H, HRBP_H

Cluster 3: CSGH_M, PGGS_M, CSAL_I, TBEA_I, QHSL_M, CHEA_I, CAAE_I, HYPC_M, HJTG_E, BDTW_I, GHEG_F, JXHD_I, XAAI_I, ZJJH_M, YYTH_M, CHSS_I, XAAE_I, RSNM_M, HNAL_I, LXVC_I, DFET_I, NRTE_I, GFSC_F

Cluster 4: CGZB_I, JSZN_I, BEJU_F, SHCG_I, GXLG_I, CHMT_F, TSJD_M, FCSH_F, SHCT_F, SNUS_F, GMDL_F, SHZJ_F, XMCD_I, OWRG_F, CATL_I, AHCC_M, SHIT_I, ZLHS_I, STCM_I, XHZB_F, XCMG_I, BJCT_F, CHVK_F

Cluster 5: CMBK_F, CJSC_F, JLYT_M, SHZH_I, TYHI_I, NESC_F, BSIS_M, HTSC_F, IMBT_M, LNCD_I, GYSC_F, SXTG_M, HXBK_F, SNLS_F, SZBK_F, WHIS_M, CMSB_F, CHRW_I, SHPD_F, CHBA_F, SWSC_F, HYSC_F, HBIS_M, CIMC_I, CSSC_I, AGST_M, JLAD_H, GJHX_M, CHPC_E, CTSC_F

Cluster 6: YZCM_E, XXDI_M, SXCI_I, JCPR_M, SCHD_M, YNCP_M, IMPZ_E, YQCI_E, TLNM_M, YNAL_M, SXLH_E, TYCG_E, GZPJ_E, YTIN_M, JZER_E, HNZF_M, SXXS_E, SZLN_M, SDNS_M, HNSH_M, SHDT_E


B.4 Period 4

Cluster 1: YZCM_E, XXDI_M, SXCI_I, JCPR_M, SCHD_M, YNCP_M, IMPZ_E, YQCI_E, TLNM_M, YNAL_M, SXLH_E, TYCG_E, GZPJ_E, YTIN_M, JZER_E, HNZF_M, SXXS_E, SZLN_M, SDNS_M, HNSH_M, SHDT_E

Cluster 2: FCNM_M, ADTM_M, PGGS_M, QHSL_M, CAAE_I, HJTG_E, HYPC_M, GHEG_F, JXHD_I, XAAI_I, CCFM_M, BJTN_M, YYTH_M, CHSS_I, XAAE_I, RSNM_M, XMTS_M, BSRE_M, MMTL_I, HBYH_M, GFSC_F

Cluster 3: JSHM_H, CRDP_H, KMPH_H, ZJMC_H, SDDE_H, YNBY_H, SHFS_H, ZJNH_H, BJTR_H, GXWZ_H, BJSL_H, CRSJ_H, NCPH_H, LXVC_I, TSLP_H, NRTE_I, HLBE_H, ZJHS_H, HRBP_H

Cluster 4: CSGH_M, HNAL_I, CSAL_I, DFET_I, TBEA_I, CHEA_I, BDTW_I, YTWH_M, ZJJH_M

Cluster 5: CGZB_I, GXLG_I, SHCG_I, TSJD_M, JLYT_M, CHBA_F, TYHI_I, CATL_I, AHCC_M, SHIT_I, ZLHS_I, STCM_I, XCMG_I

Cluster 6: JSZN_I, BEJU_F, CHMT_F, FCSH_F, SHCT_F, SNUS_F, GMDL_F, SHZJ_F, XMCD_I, OWRG_F, XHZB_F, BJCT_F, CHVK_F

Cluster 7: CMBK_F, CJSC_F, SHZH_I, NESC_F, BSIS_M, HTSC_F, IMBT_M, LNCD_I, GYSC_F, SXTG_M, HXBK_F, SNLS_F, OFFS_E, SZBK_F, WHIS_M, CMSB_F, CHRW_I, SHPD_F, SWSC_F, HYSC_F, HBIS_M, CIMC_I, CSSC_I, SNCH_I, AGST_M, SHIA_I, JLAD_H, GJHX_M, CHPC_E, CTSC_F

C Extra Neighbor-Nets Splits Graphs

C.1 Period 2 Clusters in Period 3


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 15: SplitsTree network for 126 stocks from the Shanghai A Stock Exchange for period two showing the five identified clusters. The colours are Cluster 1 – Black, Cluster 2 – Blue, Cluster 3 – Pink, Cluster 4 – Green, and Cluster 5 – Red.


[SplitsTree network diagram with stock codes as node labels; scale bar 0.1.]

Figure 16: SplitsTree network for 126 stocks for period three showing the period two cluster 1 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 17: SplitsTree network for 126 stocks for period three showing the period two cluster 2 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 18: SplitsTree network for 126 stocks for period three showing the period two cluster 3 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 19: SplitsTree network for 126 stocks for period three showing the period two cluster 4 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 20: SplitsTree network for 126 stocks for period three showing the period two cluster 5 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


C.2 Period 3 Clusters in Period 4

Figure 21: SplitsTree network for 126 stocks for period three showing the six identified clusters. The colours are Cluster 1 – Black, Cluster 2 – Blue, Cluster 3 – Pink, Cluster 4 – Red, Cluster 5 – Khaki, and Cluster 6 – Green.


Figure 22: SplitsTree network for 126 stocks for period four showing the period three cluster 1 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 23: SplitsTree network for 126 stocks for period four showing the period three cluster 2 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 24: SplitsTree network for 126 stocks for period four showing the period three cluster 3 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 25: SplitsTree network for 126 stocks for period four showing the period three cluster 4 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 26: SplitsTree network for 126 stocks for period four showing the period three cluster 5 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.


Figure 27: SplitsTree network for 126 stocks for period four showing the period three cluster 6 colour coded by industry group. The colours are Energy – Black, Finance – Blue, Health Care – Red, Industrials – Khaki, Materials – Green.
