



Community Detection via Maximization of Modularity and Its Variants

Mingming Chen, Konstantin Kuzmin, Graduate Student Member, IEEE, and Boleslaw K. Szymanski, Fellow, IEEE

(Invited Paper)

Abstract—In this paper, we first discuss the definition of modularity (Q) used as a metric for community quality and then we review the modularity maximization approaches which were used for community detection in the last decade. Then, we discuss two opposite yet coexisting problems of modularity optimization: in some cases, it tends to favor small communities over large ones, while in others, large communities over small ones (the so-called resolution limit problem). Next, we overview several community quality metrics proposed to solve the resolution limit problem and discuss Modularity Density (Q_ds), which simultaneously avoids the two problems of modularity. Finally, we introduce two novel fine-tuned community detection algorithms that iteratively attempt to improve the community quality measurements by splitting and merging the given network community structure. The first one, referred to as Fine-tuned Q, is based on modularity (Q), while the second is based on Modularity Density (Q_ds) and denoted as Fine-tuned Q_ds. Then, we compare the greedy algorithm of modularity maximization (denoted as Greedy Q), Fine-tuned Q, and Fine-tuned Q_ds on four real networks, and also on the classical clique network and the LFR benchmark networks, each of which is instantiated by a wide range of parameters. The results indicate that Fine-tuned Q_ds is the most effective among the three algorithms discussed. Moreover, we show that Fine-tuned Q_ds can be applied to the communities detected by other algorithms to significantly improve their results.

Index Terms—Community detection, fine-tuned, maximization, modularity.

I. INTRODUCTION

Many networks, including the Internet, citation networks, transportation networks, email networks, and social and biochemical networks, display community structure, which consists of groups of nodes within which connections are denser than between them [1]. Detecting and characterizing such community structure, which is known as community detection, is one of the fundamental issues in the study of network systems. Community detection has been shown to reveal latent yet meaningful structure in networks, such as groups in online and contact-based social networks, functional modules in protein–protein interaction networks, groups of customers with similar interests in online retailer user networks, groups of scientists in interdisciplinary collaboration networks, etc. [2].

In the last decade, the most popular community detection methods have been to maximize the quality metric known as modularity [1], [3]–[5] over all possible partitions of a network. Such modularity optimization algorithms include greedy algorithms [6]–[9], spectral methods [3], [10]–[15], extremal optimization (EO) [16], simulated annealing (SA) [17]–[20], a sampling technique [21], and mathematical programming [22]. Modularity measures the difference between the actual fraction of edges within the community and such fraction expected in a randomized graph with the same number of nodes and the same degree sequence. It is widely used as a measurement of the strength of the community structures detected by community detection algorithms. However, modularity maximization has two opposite yet coexisting problems. In some cases, it tends to split large communities into two or more small communities [23], [24]. In other cases, it tends to form large communities by merging communities that are smaller than a certain threshold, which depends on the total number of edges in the network and on the degree of interconnectivity between the communities. The latter problem is also known as the resolution limit problem [23]–[25].

To solve these two issues of modularity, several community quality metrics were introduced, including Modularity Density (Q_ds) [23], [24], which simultaneously avoids both of them. We then propose two novel fine-tuned community detection algorithms that repeatedly attempt to improve the quality measurements by splitting and merging the given community structure. We denote the corresponding algorithm based on modularity (Q) as Fine-tuned Q, while the one based on Modularity Density (Q_ds) is referred to as Fine-tuned Q_ds. Finally, we evaluate the greedy algorithm of modularity maximization (denoted as Greedy Q), Fine-tuned Q, and Fine-tuned Q_ds by using seven community quality metrics based on ground truth communities. These evaluations are conducted on four real networks, and also on the classical clique network and the LFR benchmark networks, each of which is instantiated by a wide range of parameters. The results indicate that Fine-tuned Q_ds is the most effective method and can also dramatically improve the community detection results of other algorithms. Further, all seven quality measurements based on ground truth communities are consistent with Q_ds, but not consistent with Q, which implies the superiority of Modularity Density over the original modularity.

II. REVIEW OF MODULARITY RELATED LITERATURE

In this section, we first review the definition of modularity and the corresponding optimization approaches. Then, we discuss

Manuscript received December 4, 2013; accepted January 29, 2014. Date of publication April 09, 2014; date of current version June 02, 2014. This work was supported in part by the Army Research Laboratory under Cooperative Agreement W911NF-09-2-0053 and in part by the Office of Naval Research Grant N00014-09-1-0607.

The authors are with Rensselaer Polytechnic Institute, Troy, NY 12180 USA (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSS.2014.2307458



the two opposite yet coexisting problems of modularity maximization. Finally, we overview several community quality measurements proposed to solve the resolution limit problem and then discuss Modularity Density (Q_ds) [23], [24], which simultaneously avoids these two problems.

A. Definition of Modularity

Comparing results of different network partitioning algorithms can be challenging, especially when the network structure is not known beforehand. The concept of modularity defined in [1] provides a measure of the quality of a particular partitioning of a network. Modularity (Q) quantifies the community strength by comparing the fraction of edges within the community with such fraction when random connections between the nodes are made. The justification is that a community should have more links between its members than a random gathering of people. Thus, a value of Q close to 0 means that the fraction of edges inside communities is no better than in the random case, and a value of 1 means that the network community structure has the highest possible strength.

Formally, modularity (Q) can be defined as [1]

Q = \sum_{c \in C} \left[ \frac{|E_c^{in}|}{|E|} - \left( \frac{2|E_c^{in}| + |E_c^{out}|}{2|E|} \right)^2 \right]

where C is the set of all the communities, c is a specific community in C, |E_c^{in}| is the number of edges between nodes within community c, |E_c^{out}| is the number of edges from the nodes in community c to the nodes outside c, and |E| is the total number of edges in the network.

Modularity can also be expressed in the following form [3]:

Q = \frac{1}{2|E|} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2|E|} \right) \delta(c_i, c_j)

where k_i is the degree of node i, A_{ij} is an element of the adjacency matrix, \delta(c_i, c_j) is the Kronecker delta symbol, and c_i is the label of the community to which node i is assigned.
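To make the two equivalent forms above concrete, the following minimal Python sketch (our illustration, not code from the paper) evaluates Q for a given partition using the edge-count form of (1); it assumes the NetworkX library and its built-in Zachary karate club example, and checks the result against NetworkX's own implementation.

import networkx as nx
from networkx.algorithms.community import modularity as nx_modularity

def modularity(G, communities):
    """communities: list of sets of nodes forming a partition of G."""
    m = G.number_of_edges()
    q = 0.0
    for c in communities:
        e_in = G.subgraph(c).number_of_edges()        # |E_c^in|
        deg_sum = sum(d for _, d in G.degree(c))      # 2|E_c^in| + |E_c^out|
        q += e_in / m - (deg_sum / (2.0 * m)) ** 2
    return q

G = nx.karate_club_graph()
parts = [set(range(17)), set(range(17, 34))]          # an arbitrary two-way split
print(modularity(G, parts), nx_modularity(G, parts))  # the two values should agree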

Since a larger Q means a stronger community structure, several algorithms, which we will discuss in Section II-B, are based on modularity optimization.

The modularity measure defined above is suitable only for undirected and unweighted networks. However, this definition can be naturally extended to apply to directed networks as well as to weighted networks. Weighted and directed networks contain more information than undirected and unweighted ones. While this additional information makes such networks more valuable, it also makes them more difficult to analyze.

The revised definition of modularity that works for directed networks is as follows [4]:

Q = \frac{1}{|E|} \sum_{i,j} \left( A_{ij} - \frac{k_i^{in} k_j^{out}}{|E|} \right) \delta(c_i, c_j)

where k_i^{in} and k_j^{out} are the in- and out-degrees of nodes i and j, respectively.

Although many networks can be regarded as binary, i.e., as

either having an edge between a pair of nodes or not having it,

there are many other networks for which it would be natural to treat edges as having a certain degree of strength or weight.

The same general techniques that have been developed for unweighted networks are applied to their weighted counterparts in [5] by mapping weighted networks onto multigraphs. For non-negative integer weights, an edge with weight w in a weighted graph corresponds to w parallel edges in a corresponding multigraph. Although negative weights can arise in some applications, they are rarely useful in social networks, so for the sake of brevity we will not discuss them here. It turns out that the adjacency matrix of a weighted graph is equivalent to that of a multigraph with unweighted edges. Since the structure of the adjacency matrix is independent of the edge weights, it is possible to adjust all the methods developed for unweighted networks to the weighted ones.

It is necessary to point out that the notion of the degree of a node should also be extended for weighted graphs. In this case, the degree of a node is defined as the sum of the weights of all edges incident to this node.

It is shown in [5] that the same definitions of modularity that were given above hold for weighted networks as well, if we treat A_{ij} as the value that represents the weight of the connection between nodes i and j and set |E| = \frac{1}{2}\sum_{ij} A_{ij}.

B. Modularity Optimization Approaches

In the literature, a high value of modularity (Q) indicates a good community structure, and the partition corresponding to the maximum value of modularity on a given graph is supposed to have the highest quality, or at least a very good one. Therefore, it is natural to discover communities by maximizing modularity over all possible partitions of a network. However, it is computationally prohibitively expensive to exhaustively search all such partitions for the optimal value of modularity, since modularity optimization is known to be NP-hard (i.e., it belongs to the class of non-deterministic polynomial-time hard problems) [26]. However, many heuristic methods were introduced to find high-modularity partitions in a reasonable time. Those approaches include greedy algorithms [6]–[9], spectral methods [3], [10]–[15], EO [16], SA [17]–[20], a sampling technique [21], and mathematical programming [22]. In this section, we will review those modularity optimization heuristics.

1) Greedy Algorithms: The first greedy algorithm was proposed by Newman [6]. It is an agglomerative hierarchical clustering method. Initially, every node belongs to its own community, creating altogether n communities. Then, at each step, the algorithm repeatedly merges pairs of communities together and chooses the merger for which the resulting modularity is the largest. The change in Q upon joining two communities i and j is

\Delta Q_{ij} = \frac{l_{ij}}{|E|} - \frac{d_i d_j}{2|E|^2}

where l_{ij} is the number of edges from community i to community j and d_i is the total degree of the nodes in community i. \Delta Q_{ij} can be calculated in constant time. The algorithm stops when all the nodes in the network are in a single community after n-1 steps of merging. Then, there are in total n partitions, the first one defined by the initial step and each subsequent one resulting from each of the subsequent


merging steps. The partition with the largest value of modularity, approximating the modularity maximum best, is the result of the algorithm. At each merging step, the algorithm needs to compute the change of modularity \Delta Q_{ij} resulting from joining any two currently existing communities i and j in order to choose the best merger. Since merging two disconnected communities will not increase the value of modularity, the algorithm checks only the merging of connected pairs of communities, and the number of such pairs is at most m, limiting the complexity of this part to O(m). However, the rows and columns of the adjacency matrix corresponding to the two merged communities must be updated, which takes O(n). Since there are n-1 iterations, the final complexity of the algorithm is O((m+n)n), or O(n^2) for sparse networks.

Although Newman's algorithm [6] is much faster than the algorithm of Newman and Girvan [1], whose complexity is O(m^2 n), Clauset et al. [7] pointed out that the update of the adjacency matrix at each step contains a large number of unnecessary operations when the network is sparse and therefore its matrix has a lot of zero entries. They introduced data structures for sparse matrices to perform the updating operation more efficiently. In their algorithm, instead of maintaining the adjacency matrix and computing \Delta Q_{ij}, they maintained and updated the matrix with entries \Delta Q_{ij} for the pairs of connected communities i and j. The authors introduced three data structures to represent sparse matrices efficiently: 1) each row of the matrix is stored both as a balanced binary tree, in order to search and insert elements in O(\log n) time, and as a max-heap, so as to locate the largest element of each row in constant time; 2) another max-heap stores the largest element of each row of the matrix so as to locate the overall largest \Delta Q_{ij} in constant time; and 3) a vector is used to save the degree fraction a_i for each community i. Then, in each step, the largest \Delta Q_{ij} can be found in constant time, and the update of the data structures after merging two communities i and j takes O((|i|+|j|)\log n), where |i| and |j| are the numbers of neighboring communities of communities i and j, respectively. Thus, the total running time is at most O(\log n) times the sum of the degrees of the nodes in the communities along the dendrogram created by the merging steps. This sum is in the worst case the depth of the dendrogram times the sum of the degrees of nodes in the network. Suppose the dendrogram has depth d; then the running time is O(m d \log n), or O(n \log^2 n) when the network is sparse and the dendrogram is almost balanced (d \sim \log n).
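As a usage illustration (an observation on our part, not code from [7]): NetworkX ships a CNM-style greedy modularity routine, greedy_modularity_communities, which follows the heap-and-tree bookkeeping described above.

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

G = nx.karate_club_graph()
communities = greedy_modularity_communities(G)   # list of frozensets; best-Q cut of the dendrogram
print(len(communities), modularity(G, communities))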

However, Wakita and Tsurumi [8] observed that the greedy algorithm proposed by Clauset et al. is not scalable to networks with sizes larger than 500,000 nodes. They found that the computational inefficiency arises from merging communities in an unbalanced manner, which yields very unbalanced dendrograms. In such cases, the relation d \sim \log n does not hold any more, causing the algorithm to run at its worst-case complexity. To balance the merging of communities, the authors introduced three types of consolidation ratios to measure the balance of community pairs and used them with modularity to perform the joining process of communities without bias. This modification enables the algorithm to scale to networks with sizes up to 10,000,000 nodes. It also approximates the modularity maximum better than the original algorithm.

Another type of greedy modularity optimization algorithm, different from those above, was proposed by Blondel et al., and it is usually referred to as Louvain [9]. It is divided into two phases that are repeated iteratively. Initially, every node belongs to the community of itself, so there are n communities. In the first phase, every node, in a certain order, is considered for merging into its neighboring communities, and the merger with the largest positive gain in Q is selected. If all possible gains associated with the merging of this node are negative, then it stays in its original community. This merging procedure repeats iteratively and stops when no increase of Q can be achieved.

After the first phase, Louvain reaches a local maximum of Q. Then, the second phase of Louvain builds a community network based on the communities discovered in the first phase. The nodes in the new network are the communities from the first phase, and there is an edge between two new nodes if there are edges between nodes in the corresponding two communities. The weights of those edges are the sums of the weights of the edges between nodes in the corresponding two communities. The edges between nodes of the same community of the first phase result in a self-loop for this community node in the new network. After the community network is generated, the algorithm applies the first phase again on this new network. The two phases repeat iteratively and stop when there is no more change and consequently a maximum modularity is obtained. The number of iterations of this algorithm is usually very small, and most of the computational time is spent in the first iteration. Thus, the complexity of the algorithm grows like O(m). Consequently, it is scalable to large networks with the number of nodes up to a billion. However, the results of Louvain are impacted by the order in which the nodes in the first phase are considered for merging [27].
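A short usage sketch of a Louvain implementation (recent NetworkX versions include one; the function name and the seed parameter below come from that library, not from [9]). Varying the seed changes the node-visiting order and, as noted above, can change the resulting partition.

import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

G = nx.les_miserables_graph()
for seed in (1, 2, 3):
    parts = louvain_communities(G, weight="weight", seed=seed)
    print(seed, len(parts), round(modularity(G, parts, weight="weight"), 4))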

2) Spectral Methods: There are two categories of spectral algorithms for maximizing modularity: 1) those based on the modularity matrix [3], [10], [11]; and 2) those based on the Laplacian matrix of a network [12]–[14].

a) Modularity optimization using the eigenvalues and eigenvectors of the modularity matrix [3], [10], [11]: Modularity (Q) can be expressed as [3]

Q = \frac{1}{4|E|} s^T B s

where A_{ij} are the elements of the adjacency matrix and s is the column vector representing any division of the network into two groups. Its elements are defined as s_i = 1 if node i belongs to the first group and s_i = -1 if it belongs to the second group. B is the modularity matrix with elements

B_{ij} = A_{ij} - \frac{k_i k_j}{2|E|}

Representing s as a linear combination of the normalized eigenvectors u_i of B, s = \sum_{i=1}^{n} a_i u_i with a_i = u_i^T s, and then plugging the result into (5) yields

Q = \frac{1}{4|E|} \sum_{i=1}^{n} a_i^2 \beta_i

where \beta_i is the eigenvalue of B corresponding to eigenvector u_i. To maximize Q, Newman [3] proposed a spectral approach


to choose s proportional to the leading eigenvector u_1 corresponding to the largest (most positive) eigenvalue \beta_1. The choice assumes that the eigenvalues are labeled in decreasing order \beta_1 \geq \beta_2 \geq \cdots \geq \beta_n. Nodes are then divided into two communities according to the signs of the elements in s, with nodes corresponding to positive elements in u_1 assigned to one group and all remaining nodes to another. Since the row and column sums of

B are zero, it always has the eigenvector (1, 1, \ldots, 1) with eigenvalue zero. Therefore, if B has no positive eigenvalue, then the leading eigenvector is (1, 1, \ldots, 1), which means that the network is indivisible. Moreover, Newman [3] proposed to divide a network into more than two communities by repeatedly dividing each of the communities obtained so far into two, until the additional contribution

\Delta Q = \frac{1}{4|E|} s^T B^{(c)} s

to the modularity made by the subdivision of a community c

is equal to or less than 0. B^{(c)} in the formula above is the generalized modularity matrix. Its elements, indexed by the labels i and j of nodes within community c, are

B^{(c)}_{ij} = B_{ij} - \delta_{ij} \sum_{k \in c} B_{ik}

Then, the same spectral method can be applied to B^{(c)} to maximize \Delta Q. The recursive subdivision process stops when \Delta Q \leq 0, which means that there is no positive eigenvalue of the matrix B^{(c)}. The overall complexity of this algorithm is O(n^2 \log n) for a sparse network.
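The bisection step can be sketched in a few lines of Python (an illustration, not the code of [3]): build the modularity matrix B = A - k k^T / (2|E|), take the eigenvector of its most positive eigenvalue, and split the nodes by the signs of its entries. The example assumes NumPy and NetworkX and uses the Zachary karate club network.

import numpy as np
import networkx as nx

G = nx.karate_club_graph()
nodes = list(G)
A = nx.to_numpy_array(G, nodelist=nodes)
k = A.sum(axis=1)                          # node degrees
m = A.sum() / 2.0                          # total number of edges
B = A - np.outer(k, k) / (2.0 * m)         # modularity matrix

eigvals, eigvecs = np.linalg.eigh(B)       # B is symmetric
u1 = eigvecs[:, -1]                        # leading (most positive) eigenvector
group1 = {n for n, x in zip(nodes, u1) if x > 0}
group2 = set(nodes) - group1
print(len(group1), len(group2))            # sizes of the two groups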

However, the spectral algorithm described above has two drawbacks. First, it divides a network into more than two communities by repeated division instead of getting all the communities directly in a single step. Second, it only uses the leading eigenvector of the modularity matrix and ignores all the others, losing the useful information contained in those eigenvectors. Newman later proposed to divide a network into a set of c communities directly, using multiple leading eigenvectors [10]. Let S be an n \times c "community-assignment" matrix with one column for each community, defined as

S_{ij} = 1 if node i belongs to community j, and S_{ij} = 0 otherwise;

then the modularity (Q) for this direct division of the network is given by

Q = \frac{1}{2|E|} \mathrm{Tr}(S^T B S)

where \mathrm{Tr}(\cdot) is the trace of a matrix. Defining B = U D U^T, where U = (u_1, u_2, \ldots, u_n) is the matrix of eigenvectors of B and D is the diagonal matrix of eigenvalues D_{ii} = \beta_i, yields

Q = \frac{1}{2|E|} \sum_{j=1}^{n} \sum_{k=1}^{c} \beta_j (u_j^T s_k)^2

where s_k is the kth column of S.

Then, obtaining c communities is equivalent to selecting c independent, mutually orthogonal columns s_k. Moreover, Q would be maximized by choosing the columns proportional to the leading eigenvectors of B. However, only the eigenvectors corresponding to the positive eigenvalues contribute positively to the modularity. Thus, the number of positive eigenvalues, plus 1, is the upper bound of c. A more general modularity maximization is to keep only the p leading eigenvectors.

Q can then be rewritten as

Q \simeq n\alpha + \sum_{k=1}^{c} \sum_{j=1}^{p} \Big[ \sum_{i} U_{ij} \sqrt{\beta_j - \alpha} \, S_{ik} \Big]^2

where \alpha is a constant related to the approximation for Q obtained by only adopting the first p leading eigenvectors. By selecting node vectors r_i of dimension p whose jth component is

[r_i]_j = \sqrt{\beta_j - \alpha} \, U_{ij}

modularity can be approximated as

Q \simeq n\alpha + \sum_{k=1}^{c} |R_k|^2

where R_k, k = 1, \ldots, c, are the community vectors

R_k = \sum_{i \in \text{community } k} r_i

Thus, the community detection problem is equivalent to choosing such a division of the nodes into c groups that maximizes the magnitudes of the community vectors, while requiring that r_i^T R_k > 0 if node i is assigned to community k. Problems of this type are called vector partitioning problems.

Although [10] explored using multiple leading eigenvectors of the modularity matrix, it did not pursue this in detail beyond a two-eigenvector approach for bipartitioning [3], [10]. Richardson et al. [11] provided an extension of these recursive bipartitioning methods by considering the best two-way or three-way division at each recursive step to more thoroughly explore the promising partitions. To reduce the number of partitions considered for the eigenvector-pair tripartitioning, the authors adopted a divide-and-conquer method and, as a result, obtained an efficient approach whose computational complexity is competitive with the two-eigenvector bipartitioning method.

b) Modularity optimization using the eigenvalues and eigenvectors of the Laplacian matrix [12]–[14]: Given a partition into c communities and the corresponding "community-assignment" matrix S, White and Smyth [12] rewrote modularity (Q) as a trace function of S involving a matrix L_Q whose elements are derived from the adjacency matrix and the node degrees. The matrix L_Q is called the "Q-Laplacian." Finding the "community-assignment" matrix S that maximizes Q above is


NP-complete, but a good approximation can be obtained by relaxing the discreteness constraints of the elements of S and allowing them to assume real values. Then, Q becomes a continuous function of S, and its extremes can be found by equating its first derivative with respect to S to zero. This leads to the eigenvalue equation

L_Q S = S \Lambda

where \Lambda is the diagonal matrix of Lagrangian multipliers. Thus, the modularity optimization problem is transformed into the standard spectral graph partitioning problem. When the network is not too small, L_Q can be approximated well, up to constant factors, by the transition matrix W = D^{-1} A obtained by normalizing A so that all rows sum to one. Here, D is the diagonal degree matrix of A. It can be shown that the eigenvalues and eigenvectors of W are precisely 1 - \lambda and v, where \lambda and v are the solutions to the generalized eigenvalue problem L v = \lambda D v, with L = D - A being the Laplacian matrix.

Thus, the underlying spectral algorithm here is equivalent to the standard spectral graph partitioning approach, which uses the eigenvalues and eigenvectors of the Laplacian matrix.

Based on the above analysis, White and Smyth proposed two clustering algorithms, named "Algorithm Spectral-1" and "Algorithm Spectral-2," to search for a partition with size up to K, predefined by an input parameter. Both algorithms take as input the eigenvector matrix U_K with the K-1 leading eigenvectors (excluding the trivial all-ones eigenvector) of the transition matrix W. Those eigenvectors can be efficiently computed with the implicitly restarted Lanczos method (IRLM) [28]. "Algorithm Spectral-1" uses the first k-1 (k = 2, \ldots, K) columns of U_K, denoted as U_k, and clusters the row vectors of U_k using k-means to find a k-way partition. Then, the partition, over all sizes k, that achieves the largest value of Q is the final community structure.
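The following rough sketch (our illustration, not the implementation of [12]) conveys the "Algorithm Spectral-1" idea: embed the nodes with the leading non-trivial eigenvectors of the transition matrix D^{-1}A, cluster the rows with k-means for each candidate k, and keep the partition with the highest Q. It assumes NumPy, scikit-learn, and NetworkX.

import numpy as np
import networkx as nx
from sklearn.cluster import KMeans
from networkx.algorithms.community import modularity

G = nx.karate_club_graph()
nodes = list(G)
A = nx.to_numpy_array(G, nodelist=nodes)
W = A / A.sum(axis=1, keepdims=True)               # row-stochastic transition matrix
eigvals, eigvecs = np.linalg.eig(W)
order = np.argsort(-eigvals.real)
U = eigvecs[:, order].real[:, 1:]                  # drop the trivial all-ones eigenvector

K_max, best_parts, best_q = 8, None, -1.0
for k in range(2, K_max + 1):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U[:, :k - 1])
    parts = [{nodes[i] for i in np.where(labels == c)[0]} for c in range(k)]
    q = modularity(G, parts)
    if q > best_q:
        best_parts, best_q = parts, q
print(len(best_parts), round(best_q, 4))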

"Algorithm Spectral-2" starts with a single community (k = 1) and recursively splits each community c into two smaller ones if the subdivision produces a higher value of Q. The split is done by running k-means with two clusters on the matrix formed from U_K by keeping only the rows corresponding to the nodes in c. The recursive procedure stops when no more splits are possible or when K communities have been found, and then the final community structure with the highest value of Q is the detection result.

However, the two algorithms described above, especially "Algorithm Spectral-1," scale poorly to large networks because of running k-means partitioning up to K times. Both approaches have a worst-case complexity of O(K^2 n + K m). In order to speed up the calculation while retaining effectiveness in approximating the maximum of Q, Ruan and Zhang [13] proposed the Kcut algorithm, which recursively partitions the network to optimize Q. At each recursive step, Kcut applies an l-way partition (l = 2, 3, \ldots, k) to the subnetwork induced by the nodes and edges in each community, using "Algorithm Spectral-1" of White and Smyth [12]. Then, it selects the l that achieves the highest Q. Empirically, Kcut with k as small as 3 or 4 can significantly improve over the standard bipartitioning method, and it also reduces the computational cost to O((m+n)\log c) for a final partition with c communities.

Ruan and Zhang later [14] proposed the QCUT algorithm, whose name derives from modularity (Q) partitioning (CUT), which combines Kcut and local search to optimize Q. The QCUT algorithm consists of two alternating stages: 1) partitioning and 2) refinement. In the partitioning stage, Kcut is used to recursively partition the network until Q cannot be further improved. In the refinement stage, a local search strategy repeatedly considers two operations. The first one is migration, which moves a node from its current community to another one, and the second one is the merge of two communities into one. Both are applied to improve Q as much as possible. The partitioning stage and the refinement stage alternate until Q cannot be increased further. In order to mitigate the resolution limit problem of modularity, the authors proposed HQcut, standing for hierarchical QCUT, which recursively applies QCUT to divide the subnetwork, generated with the nodes and edges in each community, into subcommunities. Further, to avoid overpartitioning, they use a statistical test to determine whether a community indeed has intrinsic subcommunities.

c) Equivalence of the two categories of spectral algorithms for maximizing modularity [15]: Newman [15] showed that, with a hyperellipsoid relaxation, the spectral modularity maximization method using the eigenvalues and eigenvectors of the modularity matrix can be formulated as the spectral algorithm that relies on the eigenvalues and eigenvectors of the Laplacian matrix. This formulation indicates that the above two kinds of modularity optimization approaches are equivalent. Starting with (5) for the division of a network into two groups, first the discreteness of s is relaxed onto a hyperellipsoid with the constraint

\sum_{i} k_i s_i^2 = 2|E|

Then, the relaxed modularity maximization problem can be easily solved by setting the first derivative of (5) with respect to s_i to zero. This leads to

\sum_{j} B_{ij} s_j = \lambda k_i s_i

or in matrix notation

B s = \lambda D s

where \lambda is the eigenvalue (Lagrange multiplier) and D is the diagonal degree matrix. Plugging (20) into (5) yields

Q = \frac{1}{4|E|} s^T B s = \frac{\lambda}{4|E|} s^T D s = \frac{\lambda}{2}

Therefore, to achieve the highest value of Q, one should choose \lambda to be the largest (most positive) eigenvalue of (21). Using (6), (20) can be rewritten as

\sum_{j} A_{ij} s_j - \frac{k_i}{2|E|} \sum_{j} k_j s_j = \lambda k_i s_i

or in matrix notation as

A s - \frac{k^T s}{2|E|} k = \lambda D s

where k is the vector with elements k_i and D is the diagonal degree matrix. Then, multiplying (24) by the all-ones vector 1^T results in \lambda k^T s = 0. If there


is a nontrivial eigenvalue \lambda > 0, then k^T s = 0 and the above equation simplifies to

A s = \lambda D s

Again, \lambda should be the most positive eigenvalue. However, the eigenvector corresponding to this eigenvalue is the uniform vector 1, which fails to satisfy k^T s = 0. Thus, in this case, one can do the best by choosing \lambda to be the second largest eigenvalue and having s proportional to the corresponding eigenvector. In fact, this eigenvector is precisely equal to the leading eigenvector of (21). Then, after defining a rescaled vector

u = D^{1/2} s

and plugging it into (25), we get

D^{-1/2} (D - A) D^{-1/2} u = (1 - \lambda) u

The matrix D^{-1/2}(D - A)D^{-1/2} is called the normalized Laplacian matrix. (The normalized Laplacian is sometimes defined as D^{-1/2} A D^{-1/2}, but those two differ only by a trivial transformation of their eigenvalues and eigenvectors.)

3) Extremal Optimization: Duch and Arenas [16] proposed a modularity optimization algorithm based on EO [29]. EO optimizes a global variable by improving extremal local variables. Here, the global variable is modularity (Q). The contribution of an individual node i to the Q of the whole network with a certain community structure is given by

q_i = \kappa_{c(i)}(i) - k_i a_{c(i)}

where \kappa_{c(i)}(i) is the number of edges that connect node i to the nodes in its own community c(i), and a_{c(i)} is the fraction of edge ends attached to that community. Note that Q is proportional to the sum of q_i over all nodes, and q_i can be normalized into the interval [-1, 1] by dividing it by k_i

\lambda_i = \frac{q_i}{k_i} = \frac{\kappa_{c(i)}(i)}{k_i} - a_{c(i)}

where \lambda_i, called fitness, is the relative contribution of node i to Q. Then, the fitness of each node is adopted as the local variable.

The algorithm starts by randomly splitting the network into two partitions with equal numbers of nodes, where communities are the connected components in each partition. Then, at each iteration, it moves the node with the lowest fitness from its own community to another community. The shift changes the community structure, so the fitness of many other nodes needs to be recomputed. The process repeats until it cannot increase Q. After that, it generates sub-community networks by deleting the inter-community edges and proceeds recursively on each sub-community network until Q cannot be improved. Although the procedure is deterministic for a given initialization, its final result in fact depends on the initialization, and it is likely to get trapped in local maxima. Thus, a probabilistic selection called \tau-EO [29], in which nodes are ranked according to their fitness and a node of rank q is selected with probability proportional to q^{-\tau}, is used to improve the result. The computational complexity of this algorithm is O(n^2 \log^2 n).
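A small sketch of the node fitness used by EO (our paraphrase of the quantities described above, not the authors' code); the node with the lowest fitness is the candidate to be moved next.

import networkx as nx

def fitness(G, membership):
    """membership: dict mapping each node to its community label."""
    m2 = 2.0 * G.number_of_edges()
    a = {}                                   # fraction of edge ends attached to each community
    for n, d in G.degree():
        a[membership[n]] = a.get(membership[n], 0.0) + d / m2
    lam = {}
    for n in G:
        k = G.degree(n)
        kappa = sum(1 for v in G[n] if membership[v] == membership[n])
        lam[n] = kappa / k - a[membership[n]] if k else 0.0
    return lam

G = nx.karate_club_graph()
membership = {n: (0 if n < 17 else 1) for n in G}
lam = fitness(G, membership)
print(min(lam, key=lam.get))                 # node EO would consider moving next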

4) Simulated Annealing: SA [30] is a probabilistic procedure for the global optimization problem of locating a good approximation to the global optimum of a given function in a large search space. This technique was adopted in [17]–[20] to maximize modularity (Q). The initial point for all those approaches can be an arbitrary partitioning of the nodes into communities, even one in which each node belongs to its own community. At each iteration, a node i and a community c are chosen randomly. This community could be a currently existing community or an empty community introduced to increase the number of communities. Then, node i is moved from its original community to this new community c, which would change Q by \Delta Q. If \Delta Q is greater than zero, this update is accepted; otherwise, it is accepted with probability e^{\beta \Delta Q}, where \beta in [17]–[19] represents the inverse of the temperature and in [20] is the reciprocal of the pseudo temperature. In addition, in [20] there is one more condition for the move of a node when c is not empty: shifting node i to c is considered only if there are some edges between node i and the nodes in c. To improve the performance and to avoid getting trapped in local minima, collective movements, which involve moving multiple nodes at a time [19], [20], merging two communities [17]–[19], and splitting a community [17]–[19], are employed. Splits can be carried out in a number of different schemes. The best performance is achieved by treating a community as an isolated subnetwork, partitioning it into two, and then performing a nested SA on these partitions [17], [18]. These methods stop when no new update is accepted within a fixed number of iterations.
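The acceptance rule can be sketched as follows (a minimal illustration; the exponential form e^{beta * dQ} is the one described above, with beta the inverse temperature).

import math
import random

def accept(delta_q, beta):
    """Accept a proposed single-node move with the SA rule described above."""
    if delta_q > 0:
        return True
    return random.random() < math.exp(beta * delta_q)

# A slightly negative move is often accepted at high temperature (small beta),
# and rarely accepted at low temperature (large beta).
random.seed(0)
print(sum(accept(-0.01, beta=10.0) for _ in range(1000)) / 1000.0)    # roughly 0.9
print(sum(accept(-0.01, beta=1000.0) for _ in range(1000)) / 1000.0)  # close to 0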

5) Sampling Techniques: Sales-Pardo et al. [21] proposed a "box-clustering" method to extract the hierarchical organization of networks. This approach consists of two steps: 1) estimating the similarity, called "node affinity," between nodes and forming the node affinity matrix; and 2) deriving the hierarchical community structure from the affinity matrix. The affinity between two nodes is the probability that they are classified into the same community in the local maxima partitions of modularity. The set of local maxima partitions includes those partitions for which neither the moving of a node from its original community to another, nor the merging of two communities, will increase the value of modularity. The samples are found by performing the SA-based modularity optimization algorithm of Guimerà and Amaral [17], [18]. More specifically, the algorithm first randomly divides the nodes into communities and then performs a hill-climbing search until a sample with a local maximum of modularity is reached. Then, the affinity matrix is updated based on the obtained sample.

The sample generation procedure is repeated until the affinity matrix has converged to its asymptotic value. Empirically, the total number of samples needed is proportional to the size of the network. Before proceeding to the second step, the algorithm assesses whether the network has a significant community structure or not. This is done by computing the z-score of the average modularity of the partitions in the sample set with respect to the average modularity of the local modularity maxima partitions of the equivalent ensemble of null-model networks. The equivalent null model is obtained by randomly rewiring the edges of the original network while retaining the degree sequence. A large z-score indicates that the network has a meaningful internal community structure. If the network indeed has a significant community structure, the algorithm advances to the second step to group nodes with large affinity close to each other. The goal is to bring the form of the affinity matrix as close as


possible to a block-diagonal structure by minimizing a cost function representing the average distance of the matrix elements to the diagonal. Then, the communities correspond to the "best" set of boxes obtained by least-squares fitting of the block-diagonal structure to the affinity matrix. The procedure described above can be recursively applied to the subnetworks induced by communities to identify the low-level structure of each community, until no subnetwork is found to have significant intrinsic structure.

6) Mathematical Programming: Agarwal and Kempe [22] formulated the modularity maximization problem as a linear program and a vector program, which has the advantage of providing a posteriori performance guarantees. First, modularity maximization can be transformed into the integer program

\max \ \frac{1}{2|E|} \sum_{i,j} B_{ij} (1 - x_{ij}) \quad \text{s.t.} \quad x_{ik} \leq x_{ij} + x_{jk} \ \forall i,j,k; \quad x_{ij} \in \{0, 1\}

where B is the modularity matrix and the objective function is linear in the variables x_{ij}. When x_{ij} = 0, nodes i and j belong to the same community, and x_{ij} = 1 indicates that they are in different communities. The restriction requires that i and k are in the same community if and only if i, j and j, k are in the same

community. Solving the above integer program is NP-hard, but relaxing the last constraint, so that each x_{ij} is no longer required to be an integer but is allowed to be a real number in the interval [0, 1], reduces the integer program to a linear program, which can be solved in polynomial time [31]. However, the solution does not correspond to a partition when any of the x_{ij} is fractional. To get the communities from x, a rounding step is needed. The value of x_{ij} is treated as the distance between i and j, and these distances are used repeatedly to form communities of "nearby" nodes. Moreover, optimizing modularity by dividing a network into two communities can be considered as a strict quadratic program

\max \ \frac{1}{4|E|} \sum_{i,j} B_{ij} (1 + s_i s_j) \quad \text{s.t.} \quad s_i^2 = 1 \ \forall i

where the objective function is the same as (5) defined by Newman [3]. Note that the constraint ensures that s_i = \pm 1, which implies that node i belongs either to the first or to the second community. Quadratic programming is NP-complete, but it can be relaxed to a vector program by replacing each variable s_i with an n-dimensional vector and replacing the scalar product

with the inner vector product. The solution to the vector program is one location per node on the surface of an n-dimensional hypersphere. To obtain a bipartition from these node locations, a rounding step is needed which chooses a random (n-1)-dimensional hyperplane passing through the origin and uses this hyperplane to cut the hypersphere into two halves and, as a result, separate the node vectors into two parts. Multiple random hyperplanes can be chosen, and the one that yields the community structure with the highest modularity provides the solution. The same vector program is then recursively applied to subnetworks generated with the nodes and edges in discovered communities to get hierarchical

communities until Q cannot be increased. Following the linear program and the vector program, Agarwal and Kempe also adopted a post-processing step, similar to the local search strategy proposed by Newman [3], to further improve the results.
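The hyperplane-rounding step can be sketched as follows (an illustration only: the node vectors below are synthetic stand-ins for the vector-program solution, which would normally come from an actual solver).

import numpy as np
import networkx as nx
from networkx.algorithms.community import modularity

G = nx.karate_club_graph()
nodes = list(G)
rng = np.random.default_rng(0)
X = rng.normal(size=(len(nodes), 8))
X /= np.linalg.norm(X, axis=1, keepdims=True)      # stand-in for node vectors on a hypersphere

best_q, best_parts = -1.0, None
for _ in range(50):
    r = rng.normal(size=X.shape[1])                # random hyperplane normal through the origin
    side = X @ r > 0
    parts = [{n for n, s in zip(nodes, side) if s},
             {n for n, s in zip(nodes, side) if not s}]
    if all(parts):                                 # skip degenerate one-sided splits
        q = modularity(G, parts)
        if q > best_q:
            best_q, best_parts = q, parts
print(round(best_q, 4), len(best_parts[0]), len(best_parts[1]))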

C. Resolution Limit

Since its inception, modularity has been used extensively as the measure of the quality of partitions produced by community detection algorithms. In fact, if we adopt modularity as a quality measure of communities, the task of discovering communities is essentially turned into the task of finding the network partitioning with an optimal value of modularity.

However, as the properties of modularity were studied, it was discovered that in some cases it fails to detect small communities. There is a certain threshold [25] such that a community of size below it will not be detected, even if it is a complete subgraph connected to the rest of the graph by a single edge. This property of modularity has become known as the resolution limit.

Although the resolution limit prevents detection of small communities, the actual value of the threshold depends on the total number of edges in the network and on the degree of interconnectedness between communities. In fact, the resolution limit can reach values comparable to the size of the entire network, causing the formation of a few giant communities (or even a single community) and failing to detect smaller communities within them. This makes interpreting the results of community detection very difficult, because it is impossible to tell beforehand whether a community is well formed or whether it can be further split into subcommunities.

Considering modularity as a function of the total number of edges |E| and the number of communities c makes it possible to find the values of |E| and c which maximize this function. It turns out that setting c = \sqrt{|E|} yields the absolute maximal value of modularity. Consequently, modularity has a resolution limit of order \sqrt{|E|}, which bounds the number and size of communities [25]. In fact, if for a certain community the number of edges inside it is smaller than \sqrt{|E|/2}, such a community cannot be resolved through modularity optimization. It is also possible for modularity optimization to fail to detect communities of larger size if they have more edges in common with the rest of the network. Therefore, by finding the optimal value of modularity, we are generally not obtaining the best possible community structure.
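For a sense of scale (the numbers here are illustrative and not taken from [25]): in a network with |E| = 10{,}000 edges, the number of communities at the modularity maximum is of order \sqrt{|E|} = 100, and \sqrt{|E|/2} \approx 70.7, so a community with fewer than about 70 internal edges, even a clique attached to the rest of the network by a single edge, risks being merged with a neighboring community by modularity optimization.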

The above arguments can also be applied to weighted networks. In this case, |E| is the sum of the weights of all the edges in the network, |E_c^{in}| is the sum of the weights of the edges between nodes within community c, and |E_c^{out}| is the sum of the weights of the edges from the nodes in community c to the nodes outside c.

By introducing an additional parameter \epsilon, which represents the weight of inter-community edges, Berry et al. showed in [32] that the number of communities in the optimal solution grows as \epsilon shrinks. Correspondingly, any community whose total internal edge weight falls below a threshold that depends on \epsilon and on the total edge weight of the network may not be resolved.


The introduction of \epsilon brings some interesting opportunities. If we can make \epsilon arbitrarily small, then we can expect maximum weighted modularity to produce any desired number of communities. In other words, given a proper weighting, a much better modularity resolution can be achieved than without weighting. However, in practice, finding a way to set edge weights to achieve small values of \epsilon can be challenging, and the edge-weighting algorithm proposed by Berry et al. for lowering \epsilon incurs additional computational cost.

D. Resolving the Resolution Limit Problem

There have been extensive studies on how to mitigate the consequences of the modularity resolution limit. The main approaches are described below.

The localized modularity measure (LQ) [33] is based on the observation that the resolution limit problem is caused by modularity being a global measure, since it assumes that edges between any pairs of nodes are equally likely, including connectivity between the communities. However, in many networks, the majority of communities have edges to only a few other communities, i.e., they exhibit local community connectivity.

Thus, a local version of the modularity measure for a directed network is defined by replacing, in each term of the modularity sum, the total number of edges in the network with L_c^{nb}, the total number of edges in the neighboring communities of c, i.e., in the communities to which all neighbors of community c belong.

Unlike traditional modularity (Q), the local version of modularity (LQ) is not bounded above by 1. The more locally connected communities a network has, the bigger its LQ can grow. In a network where all communities are connected to each other, LQ yields the same value as Q. LQ considers individual communities and their neighbors, and therefore provides a measure of community quality that does not depend on other parts of the network. The local connectivity approach can be applied not only to the nearest neighboring communities, but also to second or higher neighbors.

Arenas et al. proposed a multiple resolution method [34] which is based on the idea that it might be possible to look at the detected community structure at different scales. From this perspective, the modularity resolution limit is not a problem but a feature. It allows choosing a desired resolution level to achieve the required granularity of the output community structure using the original definition of modularity.

The multiple resolution method is based on the definition of modularity given by (1). The modularity resolution limit depends on the total weight of the network. By varying the total weight, it is possible to control the resolution limit, effectively performing community detection at different granularity levels. Changing the sum of the weights of the edges adjacent to every node by some value r results in rescaling the topology by a factor related to r. Since the resolution limit is proportional to the square root of the total weight, the growth of the resolution limit is slower than that of the total weight itself. Consequently, it is possible to achieve a scale at which all required communities become visible to the modularity optimization.

Caution should be exercised when altering the weights of edges in the network to avoid changing its topological characteristics. To ensure this, a rescaled adjacency matrix can be defined as

A_r = A + r I

where A is the adjacency matrix and I is the identity matrix. Since the original edge weights are not altered, A_r preserves all common features of the network: the distribution of the sums of weights, the weighted clustering coefficient, the eigenvectors, etc. Essentially, introducing r results in a self-loop of weight r being added to every node in the network.

Optimizing the modularity for the rescaled topology is performed by using the modularity at scale r as the new quality function, obtained by evaluating (1) on the rescaled network A_r, in which each community c gains a self-loop contribution of n_c r, where n_c is the number of nodes in community c, and the total weight becomes 2|E| + nr. It yields larger communities for smaller values of r and smaller communities for larger values of r. By performing modularity optimization for different values of r, it is possible to analyze the community structure at different scales.

The parameter r can also be thought of as representing the resistance of a node to become part of a community. If r is positive, we can obtain a network community structure that is more granular than what is possible to achieve with the original definition of modularity (Q), which corresponds to r being zero. Making r negative zooms out of the network and provides a view of super-communities.
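A small sketch of the rescaling idea (our illustration, assuming the NetworkX Louvain routine as the underlying modularity optimizer): add a self-loop of weight r to every node, which realizes A_r = A + rI, and re-run the optimizer on the rescaled graph. Larger positive r tends to produce more, smaller communities; negative r zooms out toward super-communities.

import networkx as nx
from networkx.algorithms.community import louvain_communities

def rescaled(G, r):
    """Return a copy of G with a self-loop of weight r added to every node."""
    H = G.copy()
    for n in H:
        old = H.get_edge_data(n, n, {}).get("weight", 0.0)
        H.add_edge(n, n, weight=old + r)
    return H

G = nx.karate_club_graph()
for r in (0.0, 5.0, 20.0):
    parts = louvain_communities(rescaled(G, r), weight="weight", seed=1)
    print(r, len(parts))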

Further studies of the multiple resolution approach revealed that it suffers from two major issues outlined in [35]. First, when the value of the resolution parameter is low, it tends to group together small communities. Second, when the resolution is high, it splits large communities. These opposite trends coexist in networks with a large variation of community sizes. Hence, it is impossible to select a value of the resolution parameter such that neither smaller nor larger communities are adversely affected by the resolution limit. A network can be tested for susceptibility to the resolution problem by examining its clustering coefficient, i.e., the degree to which nodes tend to form communities. If the clustering coefficient has sharp changes, it indicates that communities of substantially different scales exist in this network. The result is that when the value of r is sufficiently large, bigger communities get broken up before smaller communities are found. This applies also to other multiple resolution methods and seems to be a general problem of methods that try to optimize some global measure.

The hierarchical multiresolution method proposed by Granell et al. in [36] overcomes the limitations of the multiple resolution method on networks with very different scales of communities. It achieves that by introducing a new hierarchical multiresolution scheme that works even in cases of community detection near the modularity resolution limit. The main idea underlying this method is based on performing multiple resolution community detection on essential parts of the network, thus analyzing each part independently.


The method operates iteratively by first placing all nodes in a single community. Then, it finds the minimum value of the resistance parameter r which produces a community structure with the optimal value of modularity. Finally, it runs the same algorithm on each community that was found. The method terminates when no more splits of communities are necessary, which usually takes just a few steps.

Another approach to leveraging the results of modularity optimization has been introduced by Chakraborty et al. in [27]. It is based on the observation that a simple change to the order of nodes in a network can significantly affect the community structure. However, a closer examination of the communities produced in different runs of a certain community detection algorithm reveals that for many networks the same invariant groups of nodes are consistently assigned to the same communities. Such groups of nodes are called constant communities. The percentage of constant communities varies depending on the network. Constant communities are detected by trying different node permutations while preserving the degree sequence of the nodes. For networks that have strong community structure, the constant communities detected can be used as a pre-processing step before performing modularity optimization. This can lead to higher modularity values and lower variability in the results, thus improving the overall quality of community detection.

In the study [37] by Li et al., a new quantitative measure for community detection is introduced. It offers several improvements over modularity (Q), including elimination of the resolution limit and the ability to detect the number of communities. The new measure, called modularity density (D), is based on the average degree of the community structure. It is given by

D = \sum_{c \in C} \frac{2|E_c^{in}| - |E_c^{out}|}{n_c}

where n_c is the number of nodes in community c. The quality of the communities found is then described by the value of the modularity density (D). The larger the value of D, the stronger the community structure is.

The modularity density (D) does not divide a clique into two parts, and it can resolve most modular networks correctly. It can also detect communities of different sizes. This second property can be used to quantitatively determine the number of communities, since the maximum value of D is achieved when the network is correctly partitioned. Although, as mentioned in [37], finding an optimal value of modularity density (D) is NP-hard, it is equivalent to an objective function of the kernel k-means clustering problem, for which efficient computational algorithms are known.

Traag et al. in [38] introduce a rigorous definition of resolution-limit-free methods, for which considering any induced subgraph of the original graph does not cause the detected community structure to change. In other words, if an optimal partitioning of a network (with respect to some objective function) is also optimal when restricted to each subgraph induced by the partitioning, then such an objective function is called resolution-limit-free. An objective function is called additive for a certain partitioning if it is equal to the sum of the values of this objective function over each of the subgraphs induced by the partitioning.

Based on these two definitions, it is proved that if an objective function is additive and there are two optimal partitions, then any combination of these partitions is also optimal. In the case of a complete graph, if an objective function is resolution-limit-free, then an optimal partitioning either contains all the nodes (i.e., there is only one community, which includes all nodes) or consists of communities of size 1 (i.e., each node forms a community of its own). A more general statement for arbitrary objective functions is also true: if an objective function has local weights (i.e., weights that do not change when considering subgraphs), then it is resolution-limit-free. Although the converse is not true, there is only a relatively small number of special cases when methods with nonlocal weights are resolution-limit-free.

The authors then analyze resolution-limit-free methods within the framework of the first-principle Potts model [39]

H = -\sum_{i,j} \left( a_{ij} A_{ij} - b_{ij} (1 - A_{ij}) \right) \delta(\sigma_i, \sigma_j)

where a_{ij}, b_{ij} \geq 0 are some weights. The intuition behind this formula is that a community should have more edges inside it than edges which connect it to other communities. Thus, it is necessary to reward existing links inside a community and penalize links that are missing from a community. The smaller the value of H is, the more desirable the community structure is. However, the minimal value might not be unique.

Given the definition of H, it is possible to describe various existing community detection methods with an appropriate choice of parameters, as well as to propose alternative methods. The following models are shown to fit into H: Reichardt and Bornholdt (RB); Arenas, Fernández, and Gómez (AFG); Ronhovde and Nussinov (RN); as well as the label propagation method. The RB approach with a configuration null model also covers the original definition of modularity. The authors also propose a new method called the constant Potts model (CPM) by choosing a_{ij} = w_{ij} - \gamma and b_{ij} = \gamma, where w_{ij} is the weight of the edge between nodes i and j, and \gamma is a constant. CPM is similar to the RB and RN models, but is simpler and more intuitive. CPM and RN have local weights and are consequently resolution-limit-free, while RB, AFG, and modularity are not.

However, all of the above approaches are aimed at solvingonly the resolution limit problem. Work done by Chen et al. in[23] and [24] adopts a different definition of modularity densitywhich simultaneously addresses two problems of modularity. Itis done bymixing two additional components, Split Penalty (SP)and the community density, into the well-known definition ofmodularity. Community density includes internal communitydensity and pair-wise community density. is the fraction ofedges that connect nodes of different communities

The value of SP is subtracted from modularity, while the value of the community density is added to modularity and SP.


Introducing SP resolves the issue of favoring small communities. Community density eliminates the problem of favoring large communities (also known as the resolution limit problem). The Modularity Density (Q_ds) then combines these quantities: modularity, the split penalty, the internal density d_{c_i} of each community c_i, and the pair-wise density d_{c_i,c_j} between communities c_i and c_j.
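For reference, our reconstruction of these definitions (following the notation of [23] and [24]; the reader should consult those papers for the authoritative form) is

SP = \sum_{c_i \in C} \sum_{c_j \in C,\, c_j \neq c_i} \frac{|E_{c_i,c_j}|}{2|E|}, \qquad d_{c_i} = \frac{2|E_{c_i}^{in}|}{|c_i|(|c_i|-1)}, \qquad d_{c_i,c_j} = \frac{|E_{c_i,c_j}|}{|c_i|\,|c_j|},

Q_{ds} = \sum_{c_i \in C} \left[ \frac{|E_{c_i}^{in}|}{|E|}\, d_{c_i} - \left( \frac{2|E_{c_i}^{in}| + |E_{c_i}^{out}|}{2|E|}\, d_{c_i} \right)^{2} - \sum_{c_j \in C,\, c_j \neq c_i} \frac{|E_{c_i,c_j}|}{2|E|}\, d_{c_i,c_j} \right],

where |E| is the total number of edges, |E_{c_i}^{in}| and |E_{c_i}^{out}| are the numbers of edges inside community c_i and leaving it, and |E_{c_i,c_j}| is the number of edges between c_i and c_j.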

Modularity Density (Q_ds) avoids falling into the trap of merging two or more consecutive cliques in the ring of cliques network or of dividing a clique into two or more parts. It can also discover communities of different sizes. Thus, using Q_ds solves both the resolution limit problem of modularity and the problem of splitting larger communities into smaller ones. Hence, Q_ds is a very effective alternative to Q.

III. MAXIMIZING MODULARITY DENSITY

In our previous papers [23], [24], we have given the definition of Modularity Density (Q_ds). With formal proofs and experiments on two real dynamic datasets (the Senate dataset [40] and the Reality Mining Bluetooth Scan data [41]), we demonstrated that Q_ds solves the two opposite yet coexisting problems of modularity: 1) the problem of favoring small communities; and 2) the problem of favoring large communities (also called the resolution limit problem). Moreover, for a given community in Q_ds defined by (39), its internal and pair-wise densities and its split penalty are local components, which relates them to the resolution-limit-free definition in [38]. Therefore, it is reasonable to expect that maximizing Q_ds would discover more meaningful community structure than maximizing Q. In this section, we first illustrate why the greedy agglomerative algorithm for increasing Q_ds cannot be adopted for optimizing it. Then, we propose a fine-tuned community detection algorithm that repeatedly attempts to improve the community quality measurements by splitting and merging the given network community structure to maximize Q_ds.

A. Greedy Algorithm Fails to Optimize Q_ds

In this section, we show why the greedy agglomerative algorithm increasing Q_ds fails to optimize it. At the first step of the greedy algorithm for increasing Q_ds, each node is treated as a single community. The initial Q_ds contribution of each such single-node community is negative, with a magnitude that grows with the node degree. Therefore, in order to increase Q_ds the most, the greedy algorithm would first merge the connected pair of nodes with the sum of their degrees being the largest among all connected pairs. However, it is very likely that those two nodes belong to two different communities, which would finally result in merging those two communities instead of keeping them separate. This will result in a much lower value of Q_ds for such a merged community compared to Q_ds for its components, demonstrating the reason for the greedy algorithm failure in optimizing Q_ds.

For example, in the network shown in Fig. 1, the initial values of Q_ds for nodes 1, 2, 4, 6, 7, and 8, each with degree 3, differ from those of nodes 3 and 5, each with degree 4. Since nodes 3 and 5 form the connected pair with the largest sum of degrees, the greedy algorithm would first merge them, which would finally lead to a single community of all eight nodes. However, the true community structure contains two clique communities. Accordingly, the Q_ds of the community structure with two clique communities, 0.4183, is larger than that of the community structure with one single large community, 0.2487. So, maximizing Q_ds properly should have the ability to discover the true community structure.

B. Fine-Tuned Algorithm

In this part, we describe a fine-tuned community detection algorithm that iteratively improves a community quality metric by splitting and merging the given network community structure. We denote the corresponding algorithm based on modularity (Q) as Fine-tuned Q and the one based on Modularity Density (Q_ds) as Fine-tuned Q_ds. It consists of two alternating stages: 1) the split stage and 2) the merging stage.

Algorithm 1 Split_Communities ( , )

1: Initialize comWeights, comEdges, and comDensities, which respectively contain the #weights, #edges, and the density inside the communities and between two communities, by using the network and the community list;

2: //Get the metric value for each community.

3: ( , comWeights, comDensities);

4: for to do

5: ;

Fig. 1. A simple network with two clique communities. Each clique has four nodes and the two clique communities are connected to each other with one single edge.


6: ;

7: ;

8: (fiedlerVector, ‘descend’);

9: //Form divisions and record the best one.

10: splitTwoCom.addAll(nodeIds);

11: for to do

12: splitOneCom.add(nodeIds );

13: splitTwoCom.remove(nodeIds );

14: Calculate for the split at ;

15: ;

16: if < (or > ) then

17: ;

18: ;

19: end if

20: end for

21: if > (or < ) then

22: Clear splitOneCom and splitTwoCom;

23: splitOneCom.addAll(nodeIds );

24: splitTwoCom.addAll(nodeIds );

25: newC.add(splitOneCom);

26: newC.add(splitTwoCom);

27: else

28: newC.add( );

29: end if

30: end for

31: return newC

In the split stage, the algorithm will split a community into two subcommunities c_1 and c_2 based on the ratio-cut method if the split improves the value of the quality metric. The ratio-cut method [42] finds the bisection that minimizes the ratio |E_{c_1,c_2}| / (|c_1| |c_2|), where |E_{c_1,c_2}| is the cut size (namely, the number of edges between communities c_1 and c_2), while |c_1| and |c_2| are the sizes of the two communities. This ratio penalizes situations in which either of the two communities is small and thus favors balanced divisions over unbalanced ones. However, graph partitioning based on the ratio-cut method is an NP-complete problem. Thus, we approximate it by using the Laplacian spectral bisection method for graph partitioning introduced by Fiedler [43], [44].

First, we calculate the Fiedler vector, which is the eigenvector of the network Laplacian matrix corresponding to the second smallest eigenvalue. Then, we put the nodes corresponding to the positive values of the Fiedler vector into one group and the nodes corresponding to the negative values into the other group. The subnetwork of each community is generated with the nodes and edges in that community. Although the ratio-cut approximated with the spectral bisection method does allow some deviation of the sizes |c_1| and |c_2| around the middle value, the right partitioning may not actually divide the community into two balanced or nearly balanced parts. Thus, it is to some extent inappropriate and unrealistic for community detection problems. We overcome this problem by using the following strategy. First, we sort the elements of the Fiedler vector in descending order, then cut them into two communities in each of the n_c - 1 possible ways (n_c being the number of nodes in the community) and calculate the corresponding change of the metric value for each of the divisions. Then, the division with the best value (largest or smallest, depending on the measurement) of the quality metric among all the divisions is recorded. We adopt this best division of the community only when the corresponding change of the metric is positive (or negative, depending on the metric). For instance, for a metric that is maximized, we split the community only when the change is larger than zero.
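A minimal sketch of this split step (our illustration, not the authors' code): it assumes an undirected networkx graph, uses a dense eigendecomposition instead of the Lanczos method, and delegates the metric-change computation to delta_metric, a hypothetical callback that is not part of the paper.

import networkx as nx
import numpy as np

def best_split(G, community, delta_metric):
    # Try to split one community in two by scanning the sorted Fiedler vector.
    # delta_metric(part1, part2) should return the change of the chosen quality
    # metric (e.g., Q or Q_ds) caused by replacing `community` with the two parts.
    sub = G.subgraph(community)
    nodes = list(sub.nodes())
    if len(nodes) < 2:
        return None
    # Fiedler vector: eigenvector of the Laplacian for the second smallest eigenvalue.
    L = nx.laplacian_matrix(sub, nodelist=nodes).toarray().astype(float)
    _, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]
    # Sort nodes by their Fiedler entries (descending) and try every cut point.
    order = [nodes[i] for i in np.argsort(-fiedler)]
    best_gain, best_parts = 0.0, None
    for k in range(1, len(order)):
        part1, part2 = order[:k], order[k:]
        gain = delta_metric(part1, part2)
        if gain > best_gain:
            best_gain, best_parts = gain, (part1, part2)
    return best_parts  # None if no cut improves the metric

Note that the paper updates the metric incrementally as single nodes move between the two candidate parts; the sketch above simply recomputes the change at every cut point.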

The outline of the split stage is shown in Algorithm 1. The input is a network and a community list, and the output is a list of communities after splitting. The initialization part builds comWeights, comEdges, and comDensities from the network and the community list. Computing the Fiedler vector using the Lanczos method [28] requires a number of steps that depends on the number of eigenvectors needed and on the number of iterations required for the Lanczos method to converge; here the number of eigenvectors is 2, and the number of iterations is typically very small, although the exact number is not generally known. Sorting the Fiedler vector of a community with n_c nodes has the usual O(n_c log n_c) cost. The search of the best division among all the possible ones, over all the communities, is organized efficiently: each of the possible divisions of a community differs from the previous one by the movement of just a single node from one group to the other. Thus, the update of the total weights, the total number of edges, and the densities inside those two split communities and between those two communities and the other communities can be calculated in time proportional to the degree of the moved node, so all nodes can be moved in time proportional to the sum of their degrees, which equals twice the number of edges. Moreover, for Fine-tuned Q_ds, computing the metric change is more expensive because all the communities are traversed to update the SP for each of the divisions of each community. All the other parts have complexity no larger than these. Thus, the split stage of Fine-tuned Q_ds is more expensive than that of Fine-tuned Q, with the extra cost coming from these SP updates.

Algorithm 2 Merge_Communities ( , )

1: Initialize comWeights, comEdges, and comDensities;

2: //Get the metric value for each community.

3: ;

4: for to do

5: for to do

6: //Doesn’t consider disconnected communities.


7: if &&then

8: continue;

9: end if

10: Calculate for merging and ;

11: ;

12: //Record the merging information, ordered by descending metric gain, in a red-black tree

13: if > (or < ) then

14: mergedInfos.put([ , , ]);

15: end if

16: end for

17: end for

18: //Merge the community with the one that improves the value of the quality metric the most

19: while mergedInfos.hasNext() do

20: ;

21: if !mergedComs.containsKey(comId1) &&!mergedComs.containsKey(comId2) then

22: mergedComs.put(comId1,comId2);

23: mergedComs.put(comId2,comId1);

24: end if

25: end while

26: for to do

27: ;

28: if mergedComs.containsKey( ) then

29: ;

30: if < then

31: .addAll( .get(comId2));

32: end if

33: end if

34: newC.add( );

35: end for

36: return newC;

In the merging stage, the algorithm will merge a community with one of its connected communities if the merging improves the value of the quality metric. If there are many mergers possible for a community, the one, unmerged so far, which improves the quality metric the most is chosen. Hence, each community will be merged at most once in each stage. The outline of the merging stage is shown in Algorithm 2. The input is a network and a community list, and the output is a list of communities after merging. The initialization part builds the same structures as in the split stage. For Fine-tuned Q, the two “for loops” that consider merging any two communities are inexpensive because calculating the change of modularity for a candidate merger takes constant time, and inserting an element into the red-black tree takes logarithmic time since the number of elements in the tree is bounded by the number of connected community pairs. For Fine-tuned Q_ds, the two “for loops” are more costly because calculating the change of Q_ds requires traversing all the communities to update the SP, while inserting an element into the red-black tree is logarithmic as well. All other parts have complexity at most that of these loops. Thus, the merging stage of Fine-tuned Q_ds is more expensive than that of Fine-tuned Q, again because of the SP updates.
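A compact sketch of this merging stage (ours, not the authors' implementation): it scores every pair of connected communities with a hypothetical callback delta_merge, sorts the candidate mergers by gain, and greedily accepts non-conflicting ones so that each community is merged at most once per stage.

def merge_communities(communities, connected, delta_merge):
    # communities: list of node lists; connected(i, j) -> bool;
    # delta_merge(i, j) -> gain of the quality metric from merging communities i and j.
    candidates = []
    for i in range(len(communities)):
        for j in range(i + 1, len(communities)):
            if not connected(i, j):
                continue  # skip disconnected pairs, as in Algorithm 2
            gain = delta_merge(i, j)
            if gain > 0:
                candidates.append((gain, i, j))
    candidates.sort(reverse=True)            # best improvement first
    partner = {}
    for _, i, j in candidates:
        if i not in partner and j not in partner:
            partner[i], partner[j] = j, i    # each community merged at most once
    merged = []
    for i, com in enumerate(communities):
        j = partner.get(i)
        if j is None:
            merged.append(list(com))
        elif i < j:                           # emit each merged pair only once
            merged.append(list(com) + list(communities[j]))
    return merged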

Algorithm 3 Fine-tuned_Algorithm ( , )

1: ;

2: ;

3: ;

4: while

5: ;

6: ;

7: ;

8: ;

9: ;

10: end while

11: return

The fine-tuned algorithm repeatedly carries out those two alternating stages until neither splitting nor merging can improve the value of the quality metric or until the total number of communities discovered does not change after one full iteration. Algorithm 3 shows the outline of the fine-tuned algorithm. It can detect the community structure of a network by taking a list with a single community of all the nodes in the network as the input. It can also improve the community detection results of other algorithms by taking a list with their communities as the input. Let the number of iterations of the fine-tuned algorithm be denoted as T. Then, the total complexity of Fine-tuned Q is T times the per-iteration cost of its split and merging stages, and likewise for Fine-tuned Q_ds with its correspondingly higher per-iteration cost. Assuming that these factors are constants, the complexity of the fine-tuned algorithms simplifies accordingly. The only part of the algorithm that would generate a nondeterministic result is the Lanczos method of calculating the Fiedler vector. The reason is that the Lanczos method adopts a randomly generated vector as its starting vector. We solve this issue by choosing a normalized vector of the size equal to the number of nodes in the community as the starting vector for the Lanczos method. Then, community detection results will stay the same for different runs as long as the input remains the same.
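The outer loop can be sketched as follows (a schematic of Algorithm 3 under our own naming; split_stage, merge_stage, and metric are placeholder callables standing for the two stages above and for the chosen quality metric):

def fine_tuned(G, communities, split_stage, merge_stage, metric):
    # Alternate split and merge stages until a full iteration brings no metric
    # improvement or leaves the number of communities unchanged.
    best = metric(G, communities)
    while True:
        n_before = len(communities)
        communities = split_stage(G, communities)
        communities = merge_stage(G, communities)
        score = metric(G, communities)
        if score <= best or len(communities) == n_before:
            break
        best = score
    return communities

Starting from a single community containing every node reproduces the stand-alone detection mode; starting from the output of another algorithm reproduces the refinement mode.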


IV. EXPERIMENTAL RESULTS

In this section, we first introduce several popular measurements for evaluating the quality of the results of community detection algorithms. Denoting the greedy algorithm of modularity maximization proposed by Newman [7] as Greedy Q, we then use the above-mentioned metrics to compare Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds. The comparison uses four real networks, the classical clique network, and the LFR benchmark networks, each instance of which is defined by parameters selected from a wide range of possible values. The results indicate that Fine-tuned Q_ds is the most effective method among the three, followed by Fine-tuned Q. Moreover, we show that Fine-tuned Q_ds can be applied to significantly improve the detection results of other algorithms.

In Section II-B-2, we have shown that the modularity maximization approach using the eigenvectors of the Laplacian matrix is equivalent to the one using the eigenvectors of the modularity matrix. This implies that the split stage of our Fine-tuned Q is actually equivalent to the spectral methods. Therefore, Fine-tuned Q, with one additional merge operation at each iteration, unquestionably has better performance than the spectral algorithms. Hence, we do not discuss them here.

A. Evaluation Metrics

The quality evaluation metrics we consider here can be divided into three categories: 1) Variation of Information (VI) [45] and Normalized Mutual Information (NMI) [46] based on information theory; 2) F-measure [47] and Normalized Van Dongen metric (NVD) [48] based on cluster matching; and 3) Rand Index (RI) [49], Adjusted Rand Index (ARI) [50], and Jaccard Index (JI) [51] based on pair counting.
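Several of these measures are available off the shelf; for example (our illustration, not part of the paper's toolchain), scikit-learn computes NMI and ARI directly from two label vectors, although its normalization conventions may differ slightly from the definitions given below.

from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

# Community assignments of the same six nodes under two partitions.
ground_truth = [0, 0, 0, 1, 1, 1]
detected     = [0, 0, 1, 1, 1, 1]

print(normalized_mutual_info_score(ground_truth, detected))  # NMI in [0, 1]
print(adjusted_rand_score(ground_truth, detected))           # ARI: 1 for identical partitions, near 0 for random ones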

1) Information Theory-Based Metrics: Given partitions C and C', VI [45] quantifies the “distance” between those two partitions, while NMI [46] measures the similarity between partitions C and C'. VI is defined as VI(C, C') = H(C) + H(C') - 2I(C, C'), where H(.) is the entropy function and I(C, C') is the Mutual Information. Then, NMI is given by NMI(C, C') = 2I(C, C') / (H(C) + H(C')).

Using the definitions of the counts n_i^C and n_{ij} given below, we can express VI and NMI as functions of counts only,

where n_i^C is the number of nodes in community i of partition C and n_{ij} is the number of nodes that belong both to community i of C and to community j of C'.
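In the standard count-based form (the normalization used in the paper may differ slightly), with n the total number of nodes, these expressions read

\mathrm{VI}(C, C') = -\frac{1}{n}\sum_{i,j} n_{ij}\left[\log\frac{n_{ij}}{n_i^{C}} + \log\frac{n_{ij}}{n_j^{C'}}\right],
\qquad
\mathrm{NMI}(C, C') = \frac{-2\sum_{i,j} n_{ij}\log\frac{n_{ij}\,n}{n_i^{C}\,n_j^{C'}}}{\sum_i n_i^{C}\log\frac{n_i^{C}}{n} + \sum_j n_j^{C'}\log\frac{n_j^{C'}}{n}}.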

2) Clustering Matching-Based Metrics: Measurements based on clustering matching aim at finding the largest overlaps between pairs of communities of two partitions C and C'. F-measure [47] measures the similarity between two partitions, while the Normalized Van Dongen metric (NVD) [48] quantifies the “distance” between partitions C and C'. Their count-based forms are given below.
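Commonly used count-based forms of these two measures (our rendering; the exact formulas in the paper may be typeset differently) are

F(C, C') = \frac{1}{n}\sum_i n_i^{C}\,\max_j \frac{2 n_{ij}}{n_i^{C} + n_j^{C'}},
\qquad
\mathrm{NVD}(C, C') = 1 - \frac{1}{2n}\left(\sum_i \max_j n_{ij} + \sum_j \max_i n_{ij}\right).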

3) Pair Counting-Based Metrics: Metrics based on pair counting count the number of pairs of nodes that are classified (in the same community or in different communities) in two partitions C and C'. Let a_11 indicate the number of pairs of nodes that are in the same community in both partitions, a_10 denote the number of pairs of nodes that are in the same community in partition C but in different communities in C', a_01 be the number of pairs of nodes which are in different communities in C but in the same community in C', and a_00 be the number of pairs of nodes which are in different communities in both partitions. By definition, a_11 + a_10 + a_01 + a_00 = n(n-1)/2 is the total number of pairs of nodes in the network; we denote this total by M. Then, RI [49], which is the ratio of the number of node pairs placed in the same way in both partitions to the total number of pairs, is given by RI(C, C') = (a_11 + a_00)/M. RI's corresponding adjusted-for-chance version, ARI [50], corrects RI by its expected value under random partitioning; a standard closed form in terms of the pair counts is given below. The JI [51], which is the ratio of the number of node pairs placed in the same community in both partitions to the number of node pairs that are placed in the same group in at least one partition, is defined as JI(C, C') = a_11 / (a_11 + a_10 + a_01).

Each of these three metrics quantifies the similarity between two partitions C and C'.
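A standard closed form of ARI in terms of these pair counts (equivalent to the adjusted-for-chance definition of [50]) is

\mathrm{ARI}(C, C') = \frac{2\,(a_{11}a_{00} - a_{10}a_{01})}{(a_{11}+a_{10})(a_{10}+a_{00}) + (a_{11}+a_{01})(a_{01}+a_{00})}.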

B. Real Networks

In this section, we first evaluate the performance of Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds on two small networks (Zachary's karate club network [52] and the American college football network [53]) with ground truth communities, and then on two large networks [the pretty-good-privacy (PGP) network [54] and the autonomous system (AS) level Internet] without ground truth communities.

1) Zachary's Karate Club Network: We first compare the performance of Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds on Zachary's karate club network [52]. It represents the friendships between 34 members of a karate club at a U.S. university over a period of 2 years. During the observation period, the club split into two clubs as a result of a conflict within the organization. The resulting two new clubs can be treated as the ground truth communities whose structure is shown in Fig. 2(a), visualized with the open-source software Gephi [55].

Table I presents the metric values of the community structures detected by the three algorithms on this network. It shows that Fine-tuned Q and Fine-tuned Q_ds achieve the highest values of Q and Q_ds, respectively. However, most of the seven metrics based on ground truth communities imply that Greedy Q performs the best, with only NMI and NVD indicating that Fine-tuned Q_ds has the best performance among the three algorithms. Hence, it seems that a large Q or Q_ds may not necessarily mean a high quality of community structure, especially for Q, because Fine-tuned Q achieves the highest Q but has the worst values of the seven metrics described in Section IV-A. We argue that the ground truth communities may not be so reasonable, because Fine-tuned Q and Fine-tuned Q_ds in fact discover more meaningful communities than Greedy Q does. Fig. 2(a)–(d) shows the community structure of the ground truth communities and those detected by Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds, respectively. For the results of Greedy Q shown in Fig. 2(b), we could observe that there are three communities located at the left, the center, and the right side of the network. The ground truth community located on the right is subdivided into the central and right communities, but node 10 is misclassified as belonging to the central community, while in the ground truth network it belongs to the community located on the left. Fig. 2(c) demonstrates that Fine-tuned Q subdivides both the left and the right communities into two, with six nodes separated from the left community and five nodes separated from the right community. Moreover, Fig. 2(c) shows that Fine-tuned Q discovers the same number of communities for this network as the algorithms presented in [9], [16], [20], and [22]. In fact, the community structure it discovers is identical to those detected in [16], [20], and [22]. Fig. 2(d) shows that the community structure discovered by Fine-tuned Q_ds differs from that of Fine-tuned Q only on node 24, which is placed in the larger part of the left community. It is reasonable because node 24

Fig. 2. The community structures of the ground truth communities and those detected by Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds on Zachary's karate club network: (a) ground truth communities, (b) communities detected with Greedy Q, (c) communities detected with Fine-tuned Q, and (d) communities detected with Fine-tuned Q_ds.

TABLE I
METRIC VALUES OF THE COMMUNITY STRUCTURES DISCOVERED BY GREEDY Q, FINE-TUNED Q, AND FINE-TUNED Q_ds ON ZACHARY'S KARATE CLUB NETWORK (RED ITALIC FONT DENOTES THE BEST VALUE FOR EACH METRIC)


has three connections to the larger part, to which it has more attraction than to the smaller part, with which it only has two connections.

In addition, analyzing the intermediate results of Fine-tuned Q and Fine-tuned Q_ds reveals that the communities at the first iteration are exactly the ground truth communities, which in another way implies their superiority over Greedy Q. Moreover, NMI and NVD indicate that Fine-tuned Q_ds is the best among the three, and all but one of the metrics show that Fine-tuned Q_ds performs better than Fine-tuned Q, supporting the claim that a higher Q_ds (but not Q) implies a better quality of community structure.

2) American College Football Network: We apply the three algorithms also to the American college football network [53], which represents the schedule of games between college football teams in a single season. The teams are divided into

Fig. 3. The community structures of the ground truth communities and those detected by Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds on the American college football network: (a) ground truth communities, (b) communities detected with Greedy Q, (c) communities detected with Fine-tuned Q, and (d) communities detected with Fine-tuned Q_ds.


12 “conferences” with intra-conference games being more frequent than inter-conference games. Those conferences could be treated as the ground truth communities whose structure is shown in Fig. 3(a).

Table II presents the metric values of the community structures detected by the three algorithms. It shows that Fine-tuned Q_ds achieves the best values for all the nine metrics. This implies that Fine-tuned Q_ds performs best on this football network, followed by Fine-tuned Q. Fig. 3(a)–(d) presents the community structure of the ground truth communities and those discovered by Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds. Each color in the figures represents a community. It can be seen that there are twelve ground truth communities in total, seven communities detected by Greedy Q, nine communities discovered by Fine-tuned Q, and exactly twelve communities found by Fine-tuned Q_ds.

Moreover, we apply Fine-tuned Q_ds to the community detection results of Greedy Q and Fine-tuned Q. The metric values of these two community structures after improvement with Fine-tuned Q_ds are shown in Table III. Compared with those of Greedy Q and Fine-tuned Q in Table II, we could observe that the metric values are significantly improved with Fine-tuned Q_ds. Further, both improved community structures contain exactly 12 communities, the same number as the ground truth communities.

3) PGP Network: We then apply the three algorithms on the PGP network [54]. It is the giant component of the network of users of the PGP algorithm for secure information interchange. It has 10 680 nodes and 24 316 edges.

Table IV presents the metric values of the community structures detected by the three algorithms. Since this network does not have ground truth communities, we only calculate the Q and Q_ds of these discovered community structures. The table shows that Greedy Q and Fine-tuned Q_ds achieve the highest values of Q and Q_ds, respectively. It is worth mentioning that the Q_ds of Fine-tuned Q_ds is much larger than that of Greedy Q and Fine-tuned Q, which implies that Fine-tuned Q_ds performs best on the PGP network according to Q_ds, followed by Greedy Q.

4) AS Level Internet: The last real network dataset that is adopted to evaluate the three algorithms is the AS level Internet. It is a symmetrized snapshot of the structure of the Internet at the level of autonomous systems, reconstructed from Border Gateway Protocol (BGP) tables posted by the University of Oregon Route Views Project. This snapshot was created by Mark Newman from data for July 22, 2006 and has not been previously published. It has 22 963 nodes and 48 436 edges.

Table V presents the metric values of the community structures detected by the three algorithms. Since this network does not have ground truth communities either, we only calculate Q and Q_ds. It can be seen from the table that Fine-tuned Q and

TABLE II
METRIC VALUES OF THE COMMUNITY STRUCTURES DETECTED BY GREEDY Q, FINE-TUNED Q, AND FINE-TUNED Q_ds ON THE AMERICAN COLLEGE FOOTBALL NETWORK (RED ITALIC FONT DENOTES THE BEST VALUE FOR EACH METRIC)

TABLE III
METRIC VALUES OF THE COMMUNITY STRUCTURES OF GREEDY Q AND FINE-TUNED Q IMPROVED WITH FINE-TUNED Q_ds ON THE AMERICAN COLLEGE FOOTBALL NETWORK (BLUE ITALIC FONT INDICATES IMPROVED SCORE)

TABLE V
VALUES OF Q AND Q_ds OF THE COMMUNITY STRUCTURES DETECTED BY GREEDY Q, FINE-TUNED Q, AND FINE-TUNED Q_ds ON THE AS LEVEL INTERNET (RED ITALIC FONT DENOTES THE BEST VALUE FOR EACH METRIC)

TABLE IV
VALUES OF Q AND Q_ds OF THE COMMUNITY STRUCTURES DETECTED BY GREEDY Q, FINE-TUNED Q, AND FINE-TUNED Q_ds ON THE PGP NETWORK (RED ITALIC FONT DENOTES THE BEST VALUE FOR EACH METRIC)

Fig. 4. A ring network made out of 30 identical cliques, each having five nodes and connected by single edges.


Fine-tuned Q_ds achieve the highest values of Q and Q_ds, respectively. Moreover, the Q_ds of Fine-tuned Q_ds is much larger than that of Greedy Q and Fine-tuned Q, which indicates that Fine-tuned Q_ds performs best on the AS level Internet according to Q_ds, followed by Fine-tuned Q.

C. Synthetic Networks

1) Clique Network: We now apply the three algorithms to the classical network example [23]–[25], displayed in Fig. 4, which illustrates that modularity (Q) has the resolution limit problem. It is a ring network comprising 30 identical cliques, each of which has five nodes, and the cliques are connected by single edges. It is intuitively obvious that each clique forms a single community.
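This benchmark is easy to reproduce; for instance (our illustration, not the authors' setup), networkx ships a generator for exactly this topology:

import networkx as nx

# 30 cliques of 5 nodes each; consecutive cliques are joined by a single edge.
G = nx.ring_of_cliques(num_cliques=30, clique_size=5)
print(G.number_of_nodes(), G.number_of_edges())  # 150 nodes, 330 edges (300 intra-clique + 30 ring edges)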

Table VI presents the metric values of the community structures detected by the three algorithms. It shows that Greedy Q and Fine-tuned Q have the same performance. They both achieve the highest value of Q but obtain only about half of the Q_ds value that Fine-tuned Q_ds achieves. In fact, Fine-tuned Q_ds finds exactly 30 communities with each clique being a single community. In contrast, Greedy Q and Fine-tuned Q discover only 16 communities, with 14 communities containing two cliques each and the other two communities containing a single clique. Also, we take the community detection results of Greedy Q and Fine-tuned Q as the input to Fine-tuned Q_ds to try to improve those results. The metric values of the community structures after improvement with Fine-tuned Q_ds are recorded in Table VII. This table shows that the community structures discovered are identical to that of Fine-tuned Q_ds, which means that the results of Greedy Q and Fine-tuned Q are dramatically improved with Fine-tuned Q_ds. Therefore, it can be concluded from Tables VI and VII that a larger value of Q_ds (but not Q) implies a higher quality of the community structure. Moreover, Q_ds solves the resolution limit problem of Q. Finally, Fine-tuned Q_ds is effective in maximizing Q_ds and in finding meaningful community structure.

2) LFR Benchmark Networks: To further compare the performance of Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds, we choose the LFR benchmark networks [56], which have become a standard in the evaluation of the performance of community detection algorithms and also have known ground truth communities. The LFR benchmark networks that we use here have 1000 nodes with average degree 15 and maximum degree 50. The exponent γ of the degree sequence varies from 2 to 3. The exponent β of the community size distribution ranges from 1 to 2. Then, four pairs of these exponents, the combinations of the extreme values of γ and β, are chosen in order to explore the widest spectrum of graph structures. The mixing parameter μ is varied from 0.05 to 0.5; each node shares a fraction 1 - μ of its edges with the other nodes in its community and a fraction μ of its edges with the nodes outside its community. Thus, low mixing parameters indicate strong community structure. Also, we generate 10 network instances for each μ. Hence, each metric value in Tables VIII–XII represents the average metric value over all 10 instances. Since the experimental results are similar for all four pairs of exponents (γ, β), for the sake of brevity, we only present the results for one representative pair here.
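For readers who wish to regenerate comparable networks, networkx includes an LFR generator (our illustration; the parameter values below merely mirror the setup described above, and the authors presumably used the original LFR code). Note that the networkx implementation requires both exponents to be strictly greater than one, and generation can fail for some parameter and seed combinations.

import networkx as nx

G = nx.LFR_benchmark_graph(
    n=1000, tau1=2.5, tau2=1.5, mu=0.1,
    average_degree=15, max_degree=50, seed=42,
)
# Ground-truth communities are stored as node attributes (sets of nodes).
communities = {frozenset(G.nodes[v]["community"]) for v in G}
print(G.number_of_nodes(), len(communities))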

TABLE VIII
METRIC VALUES OF THE COMMUNITY STRUCTURES OF GREEDY Q ON THE LFR BENCHMARK NETWORKS

TABLE VII
METRIC VALUES OF THE COMMUNITY STRUCTURES OF GREEDY Q AND FINE-TUNED Q IMPROVED WITH FINE-TUNED Q_ds ON THE CLASSICAL CLIQUE NETWORK (BLUE ITALIC FONT INDICATES IMPROVED SCORE)

TABLE VI
METRIC VALUES OF THE COMMUNITY STRUCTURES DETECTED BY GREEDY Q, FINE-TUNED Q, AND FINE-TUNED Q_ds ON THE CLASSICAL CLIQUE NETWORK (RED ITALIC FONT DENOTES THE BEST VALUE FOR EACH METRIC)


Tables VIII–X show the metric values of the community structures detected with Greedy Q, Fine-tuned Q, and Fine-tuned Q_ds, respectively, on the LFR benchmark networks with μ varying from 0.05 to 0.5. The red italic font in the tables denotes that the corresponding algorithm achieves the best value for a certain quality metric among the three algorithms. The results in these tables show that Greedy Q obtains the best values for all the nine measurements when μ = 0.05, while Fine-tuned Q_ds achieves the highest values of Q_ds and the best values for almost all the seven metrics based on ground truth communities when μ ranges from 0.1 to 0.5. Also, Fine-tuned Q gets the second best values for Q_ds and almost all the seven metrics in the same range of μ. However, for Q the best is Greedy Q, followed by Fine-tuned Q, and Fine-tuned Q_ds is the last.

In summary, the seven measurements based on ground truth communities are all consistent with Q_ds, but not consistent with Q. This consistency indicates the superiority of Q_ds over Q as a community quality metric. In addition, Fine-tuned Q_ds performs best among the three algorithms for μ > 0.05, which demonstrates that it is very effective and does a very good job in optimizing Q_ds.

We then take the community detection results of Greedy Q and Fine-tuned Q as the input to Fine-tuned Q_ds to improve those results. The measurement values of the community structures after improvement with Fine-tuned Q_ds are displayed in Tables XI and XII. The blue italic font in Tables XI and XII indicates that the metric value in these two tables is improved compared to the corresponding one in Table VIII and in Table IX, respectively. Then, compared with those of Greedy Q shown in Table VIII and those of Fine-tuned Q shown in Table IX, all measurements, except in some cases Q, are significantly improved with Fine-tuned Q_ds. This again indicates that all the seven metrics described in Section IV-A are consistent with Q_ds, but not consistent with Q. Interestingly, those results are even better than those of Fine-tuned Q_ds itself presented in Table X. Thus, it can be concluded that Fine-tuned Q_ds is very powerful in improving the community detection results of other algorithms.

TABLE IX
METRIC VALUES OF THE COMMUNITY STRUCTURES OF FINE-TUNED Q ON THE LFR BENCHMARK NETWORKS

TABLE X
METRIC VALUES OF THE COMMUNITY STRUCTURES OF FINE-TUNED Q_ds ON THE LFR BENCHMARK NETWORKS

TABLE XI
METRIC VALUES OF THE COMMUNITY STRUCTURES OF GREEDY Q IMPROVED WITH FINE-TUNED Q_ds ON THE LFR BENCHMARK NETWORKS


V. CONCLUSION

In this paper, we review the definition of modularity and its corresponding maximization methods. Moreover, we show that modularity optimization has two opposite but coexisting issues. We also review several community quality metrics proposed to solve the resolution limit problem. We then discuss our Modularity Density (Q_ds) metric, which simultaneously avoids those two problems. Finally, we propose an efficient and effective fine-tuned algorithm to maximize Q_ds. This new algorithm can actually be used to optimize any community quality metric. We evaluate the three algorithms, Greedy Q, Fine-tuned Q based on Q, and Fine-tuned Q_ds based on Q_ds, with seven metrics based on ground truth communities. These evaluations are done on four real networks, and also on the classical clique network and the LFR benchmark networks, each instance of the latter being defined by parameters selected from a wide range of values. The results demonstrate that Fine-tuned Q_ds performs best among the three algorithms, followed by Fine-tuned Q. The experiments also show that Fine-tuned Q_ds can dramatically improve the community detection results of other algorithms. In addition, all the seven quality metrics based on ground truth communities are consistent with Q_ds, but not consistent with Q, which indicates the superiority of Q_ds over Q as a community quality metric.

ACKNOWLEDGMENT

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government.

REFERENCES

[1] M. E. J. Newman and M. Girvan, “Finding and evaluating community structure in networks,” Phys. Rev. E, vol. 69, p. 026113, Feb. 2004.
[2] S. Fortunato, “Community detection in graphs,” Phys. Rep., vol. 486, pp. 75–174, 2010.
[3] M. E. J. Newman, “Modularity and community structure in networks,” in Proc. Nat. Acad. Sci., 2006, vol. 103, no. 23, pp. 8577–8582.
[4] E. A. Leicht and M. E. J. Newman, “Community structure in directed networks,” Phys. Rev. Lett., vol. 100, p. 118703, Mar. 2008.
[5] M. E. J. Newman, “Analysis of weighted networks,” Phys. Rev. E, vol. 70, p. 056131, Nov. 2004.
[6] M. E. J. Newman, “Fast algorithm for detecting community structure in networks,” Phys. Rev. E, vol. 69, p. 066133, Jun. 2004.
[7] A. Clauset, M. E. J. Newman, and C. Moore, “Finding community structure in very large networks,” Phys. Rev. E, vol. 70, p. 066111, Dec. 2004.
[8] K. Wakita and T. Tsurumi, “Finding community structure in mega-scale social networks: [extended abstract],” in Proc. 16th Int. Conf. World Wide Web, New York, NY, USA, 2007, pp. 1275–1276.
[9] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” J. Statist. Mech.: Theory Exp., vol. 2008, no. 10, p. P10008, 2008.
[10] M. E. J. Newman, “Finding community structure in networks using the eigenvectors of matrices,” Phys. Rev. E, vol. 74, p. 036104, Sep. 2006.
[11] T. Richardson, P. J. Mucha, and M. A. Porter, “Spectral tripartitioning of networks,” Phys. Rev. E, vol. 80, p. 036111, Sep. 2009.
[12] S. White and P. Smyth, “A spectral clustering approach to finding communities in graphs,” in Proc. SIAM Int. Conf. Data Mining, Newport Beach, CA, USA, Apr. 2005, pp. 274–285.
[13] J. Ruan and W. Zhang, “An efficient spectral algorithm for network community discovery and its applications to biological and social networks,” in Proc. 7th IEEE Int. Conf. Data Mining, Washington, DC, USA, 2007, pp. 643–648.
[14] J. Ruan and W. Zhang, “Identifying network communities with a high resolution,” Phys. Rev. E, vol. 77, p. 016104, Jan. 2008.
[15] M. E. J. Newman, “Spectral methods for network community detection and graph partitioning,” Phys. Rev. E, vol. 88, p. 042822, 2013.
[16] J. Duch and A. Arenas, “Community detection in complex networks using extremal optimization,” Phys. Rev. E, vol. 72, p. 027104, Aug. 2005.
[17] R. Guimerà and L. A. N. Amaral, “Functional cartography of complex metabolic networks,” Nature, vol. 433, pp. 895–900, Feb. 2005.
[18] R. Guimerà and L. A. N. Amaral, “Cartography of complex networks: Modules and universal roles,” J. Statist. Mech.: Theory Exp., vol. 2005, no. 02, p. P02001, 2005.
[19] C. P. Massen and J. P. K. Doye, “Identifying communities within energy landscapes,” Phys. Rev. E, vol. 71, p. 046101, Apr. 2005.
[20] A. Medus, G. Acuña, and C. Dorso, “Detection of community structures in networks via global optimization,” Phys. A: Statist. Mech. Appl., vol. 358, no. 24, pp. 593–604, 2005.
[21] M. Sales-Pardo, R. Guimerà, A. A. Moreira, and L. A. N. Amaral, “Extracting the hierarchical organization of complex systems,” Proc. Nat. Acad. Sci. U. S. A., Sep. 2007, vol. 104, no. 39, pp. 15224–15229.
[22] G. Agarwal and D. Kempe, “Modularity-maximizing graph communities via mathematical programming,” Eur. Phys. J. B, vol. 66, no. 3, pp. 409–418, 2008.
[23] M. Chen, T. Nguyen, and B. K. Szymanski, “On measuring the quality of a network community structure,” in Proc. ASE/IEEE Int. Conf. Social Comput., Washington, DC, USA, Sep. 2013, pp. 122–127.
[24] M. Chen, T. Nguyen, and B. K. Szymanski, “A new metric for quality of network community structure,” ASE Hum. J., vol. 2, no. 4, pp. 226–240, 2013.
[25] S. Fortunato and M. Barthélemy, “Resolution limit in community detection,” Proc. Nat. Acad. Sci., vol. 104, no. 1, pp. 36–41, 2007.
[26] U. Brandes, D. Delling, M. Gaertler, R. Gorke, M. Hoefer, Z. Nikoloski et al., “On modularity clustering,” IEEE Trans. Knowl. Data Eng., vol. 20, no. 2, pp. 172–188, Feb. 2008.
[27] T. Chakraborty, S. Srinivasan, N. Ganguly, S. Bhowmick, and A. Mukherjee, “Constant communities in complex networks,” Nature Scientific Reps., vol. 3, May 2013, Art. No. 1825.
[28] J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst, in Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, Z. Bai, Ed. Philadelphia, PA, USA: SIAM, 2000.
[29] S. Boettcher and A. G. Percus, “Optimization with extremal dynamics,” Phys. Rev. Lett., vol. 86, pp. 5211–5214, Jun. 2001.

TABLE XII
METRIC VALUES OF THE COMMUNITY STRUCTURES OF FINE-TUNED Q IMPROVED WITH FINE-TUNED Q_ds ON THE LFR BENCHMARK NETWORKS


[30] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, pp. 671–680, May 1983.
[31] H. Karloff, Linear Programming. Boston, MA, USA: Birkhäuser, 1991.
[32] J. W. Berry, B. Hendrickson, R. A. LaViolette, and C. A. Phillips, “Tolerating the community detection resolution limit with edge weighting,” Phys. Rev. E, vol. 83, p. 056119, May 2011.
[33] S. Muff, F. Rao, and A. Caflisch, “Local modularity measure for network clusterizations,” Phys. Rev. E, vol. 72, no. 5, p. 056107, 2005.
[34] A. Arenas, A. Fernandez, and S. Gomez, “Analysis of the structure of complex networks at different resolution levels,” New J. Phys., vol. 10, no. 5, p. 053039, 2008.
[35] A. Lancichinetti and S. Fortunato, “Limits of modularity maximization in community detection,” Phys. Rev. E, vol. 84, no. 6, p. 066122, 2011.
[36] C. Granell, S. Gomez, and A. Arenas, “Hierarchical multiresolution method to overcome the resolution limit in complex networks,” Int. J. Bifur. Chaos, vol. 22, p. 1250171, 2012.
[37] Z. Li, S. Zhang, R.-S. Wang, X.-S. Zhang, and L. Chen, “Quantitative function for community detection,” Phys. Rev. E, vol. 77, no. 3, p. 036109, 2008.
[38] V. A. Traag, P. Van Dooren, and Y. Nesterov, “Narrow scope for resolution-limit-free community detection,” Phys. Rev. E, vol. 84, no. 1, p. 016114, 2011.
[39] J. Reichardt and S. Bornholdt, “Partitioning and modularity of graphs with arbitrary degree distribution,” Phys. Rev. E, vol. 76, p. 015102, Jul. 2007.
[40] A. S. Waugh, L. Pei, J. H. Fowler, P. J. Mucha, and M. A. Porter, Party polarization in Congress: A network science approach, Preprint, 2011 [Online]. Available: http://arxiv.org/abs/0907.3509.
[41] N. Eagle, A. Pentland, and D. Lazer, “Inferring social network structure using mobile phone data,” in Proc. Nat. Acad. Sci. (PNAS), 2009, vol. 106, no. 36, pp. 15274–15278.
[42] Y.-C. Wei and C.-K. Cheng, “Towards efficient hierarchical designs by ratio cut partitioning,” in Proc. IEEE Int. Conf. Comput. Aided Des., 1989, pp. 298–301.
[43] M. Fiedler, “Algebraic connectivity of graphs,” Czech. Math. J., vol. 23, pp. 298–305, 1973.
[44] A. Pothen, H. D. Simon, and K.-P. Liou, “Partitioning sparse matrices with eigenvectors of graphs,” SIAM J. Matrix Anal. Appl., vol. 11, no. 3, pp. 430–452, May 1990.
[45] B. H. Good, Y.-A. de Montjoye, and A. Clauset, “Performance of modularity maximization in practical contexts,” Phys. Rev. E, vol. 81, p. 046106, Apr. 2010.
[46] L. Danon, A. Díaz-Guilera, J. Duch, and A. Arenas, “Comparing community structure identification,” J. Statist. Mech.: Theory Exp., vol. 2005, no. 09, p. P09008, 2005.
[47] S. Wagner and D. Wagner, “Comparing clusterings—An overview,” Univ. Karlsruhe (TH), Karlsruhe, Germany, Tech. Rep. 2006–04, 2007.
[48] S. Van Dongen, “Performance criteria for graph clustering and Markov cluster experiments,” Nat. Res. Inst. Math. Comput. Sci., Amsterdam, The Netherlands, Tech. Rep., 2000.
[49] W. M. Rand, “Objective criteria for the evaluation of clustering methods,” J. Amer. Statist. Assoc., vol. 66, no. 336, pp. 846–850, 1971.
[50] L. Hubert and P. Arabie, “Comparing partitions,” J. Classification, vol. 2, no. 1, pp. 193–218, 1985.
[51] A. Ben-Hur, A. Elisseeff, and I. Guyon, “A stability based method for discovering structure in clustered data,” in Proc. Pacific Symp. Biocomputing, 2001, pp. 6–17.
[52] W. Zachary, “An information flow model for conflict and fission in small groups,” J. Anthropol. Res., vol. 33, pp. 452–473, 1977.
[53] M. Girvan and M. E. J. Newman, “Community structure in social and biological networks,” Proc. Nat. Acad. Sci., vol. 99, no. 12, pp. 7821–7826, 2002.
[54] M. Boguñá, R. Pastor-Satorras, A. Díaz-Guilera, and A. Arenas, “Models of social networks based on social distance attachment,” Phys. Rev. E, vol. 70, p. 056122, Nov. 2004.
[55] M. Bastian, S. Heymann, and M. Jacomy, “Gephi: An open source software for exploring and manipulating networks,” in Proc. Int. AAAI Conf. Weblogs and Social Media (ICWSM), San Jose, CA, USA, May 2009, pp. 361–362.
[56] A. Lancichinetti, S. Fortunato, and F. Radicchi, “Benchmark graphs for testing community detection algorithms,” Phys. Rev. E, vol. 78, p. 046110, Oct. 2008.

Mingming Chen is pursuing the Ph.D. degree in the Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA.

Konstantin Kuzmin (S’09–GSM’13) is pursuing the Ph.D. degree in the Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA.

Boleslaw K. Szymanski (M’85–SM’87–F’99) is a Claire and Roland Schmitt Distinguished Professor in the Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA.
