
Abstract Identifying the influence of authors and papers is important for fostering academic

research. In this paper, we build three kinds of networks (co-author network, paper

citation network and author citation network) and employ two measures (PageRank,

local centrality) to identify the influence of academic research.

We build and analyze the co-author network of Erdös1 authors. Data extraction
techniques are used to simplify network building. To analyze the properties of the
network, we calculate its degree distribution, clustering coefficient and average
distance, and we find that the network exhibits the small-world phenomenon.
Furthermore, we employ the PageRank algorithm for undirected networks and
centrality measures to determine the influence of authors. Through the analysis of the
511 nodes in the co-author network, we find that Frank Harary is the most influential
author. To test the effectiveness of the method, we use the authors' citation counts as
a reference.

A paper citation network is built to identify the influence of papers. Some papers are
added to the given data set so as to obtain a connected network. We introduce a “local
centrality” measure to determine the influence of each paper, and use the PageRank
measure as a comparison to assess its effectiveness. We conclude that “Collective
dynamics of ‘small-world’ networks” is the most influential paper in our data set.

Projecting the paper citation network into an author citation network, we use a
weighted PageRank algorithm to rank the influence of authors and visualize the
result. Erdös is ranked as the most influential scientist in network science.

In order to test our influence measurement algorithm, we apply it to a Facebook
data set. Because the Facebook social network is similar to the co-author network,
we use the same method for both. The results are consistent with empirical evidence.

Finally, sensitivity analysis indicates that our algorithm is robust. A discussion of
the power of network analysis is also included.

Keywords: Local Centrality, PageRank Algorithm, Network science,

Centrality measures, Social Network Analysis


Team Control Number

29080

Problem Chosen

C


2014 Mathematical Contest in Modeling (MCM) Summary Sheet (Attach a copy of this page to your solution paper.)

Team #29080 page 1 of 18

Identifying Influential Nodes in the Network:

PageRank and Local Centrality Measures

1. Introduction

Network science has proven effective in solving problems in multiple fields. In
informetrics, a large number of citation, co-citation and co-author networks have been
built to determine the influence of papers and scientific authors. Prior studies applied
measures from social network analysis (SNA) to citation and co-author networks to
extract information for evaluating author influence. Otte and Rousseau studied SNA
in the co-author network and concluded that centrality can be used to find core
specialists in a given field [1]. Abbasi et al. used SNA measures to examine the effect
of networks on the (citation-based) performance of scholars [2]. Martins et al.
discussed centrality and the small-world phenomenon in collaboration networks [3].
The PageRank algorithm has also been used to analyze authors in co-citation
networks [4].

In addition, effective measures based on authors’ work, such as the citation count,
the h-index and the g-index, have been proposed. The number of citations qualifies the
quantity of publications [5]. Hirsch introduced the h-index as a measure that combines
the quantity and the impact of publications [6]. Another widely used index for
evaluation is the g-index, introduced by Egghe [7].

Few studies use networks to study the influence of individual papers, although some
assess the relative influence of journals [8, 9]. In this paper, the problems we
investigate are:

analyzing the influence of authors in the Erdös1 network,

determining the relative influence of foundational papers in a given field, and

applying our algorithms to a completely different set of network influence data.

2. Assumptions

For simplicity, we make the following assumptions in this paper:

The error in the data is considered acceptable. Difficulties in author identification
surely make the data less than 100% accurate, but we believe the errors will not
influence our results.

All co-authors are assumed to have contributed equally to their paper. That is, we
ignore differences in co-authors’ contributions. This is reasonable, for it is hard to
distinguish how much work has been done by each author.

For a given paper, all the papers it references have the same influence on it. That is,
referenced papers contribute equally to the results of the paper.

In the co-author network of Erdös1, we ignore the frequency of collaboration.

To a certain degree, citations are a good measure of the influence of authors and
papers. We use them to test our measures.

The influence of a college or organization is determined by the influence of its
authors and papers. This is intuitive, for both colleges and journals can be regarded
as a set of papers or a set of scientific authors.

3. Co-author Network of the Erdös1 Authors

3.1 Building Co-author Network

Based on the given data file Erdös1.htm, we build our co-author network. The process
of data extraction and network building is as follows:

Step1. Obtain the names of the 511 Erdös1 authors and assign each of them an
identity (ID) number from 1 to 511.

Step2. Build a mapping table that maps each Erdös1 author’s name to its ID number.

Step3. For the Erdös1 author whose ID number is u, take one of its co-authors’ names
and look it up in the mapping table to decide whether this co-author is an Erdös1
author. If so, obtain its ID v and build an edge between u and v; otherwise, do nothing.

Step4. Repeat Step3 until all co-authors of u have been examined.

Step5. Repeat Step3 and Step4 for all Erdös1 authors.
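The five steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for this paper: the input format (a dictionary from each Erdös1 author to a list of co-author names), the function name `build_coauthor_network` and the toy names are hypothetical, and parsing Erdös1.htm itself is left out.

```python
def build_coauthor_network(coauthors_by_author):
    """Build an undirected co-author graph restricted to the listed authors.

    coauthors_by_author: dict mapping each listed author's name to the
    names of that author's co-authors (hypothetical input format).
    """
    # Step1 and Step2: assign IDs 1..n and build the name -> ID mapping table.
    name_to_id = {
        name: i for i, name in enumerate(sorted(coauthors_by_author), start=1)
    }
    edges = set()
    # Step3 to Step5: look every co-author up in the mapping table.
    for name, coauthors in coauthors_by_author.items():
        u = name_to_id[name]
        for other in coauthors:
            v = name_to_id.get(other)  # None if not a listed author
            if v is not None and v != u:
                # Store each undirected edge once; collaboration frequency is ignored.
                edges.add((min(u, v), max(u, v)))
    return name_to_id, edges


# Toy data: "X" and "Y" are not listed authors, so they create no edges.
sample = {
    "A": ["B", "X"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["Y"],
}
ids, edges = build_coauthor_network(sample)
print(ids)
print(sorted(edges))
```

In this toy input, author D ends up as an isolated node, which is the situation described for the isolated nodes of Figure 1.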

Figure 1 shows the co-author network of the Erdös1 authors as a graph. Each node i
represents an author, and a link between node i and node j indicates that the two
authors have collaborated.

Figure 1. Co-author network of the Erdös1 authors


This network is not a connected graph, since there are some isolated nodes on the
left of Figure 1. These isolated nodes are Erdös1 authors who did not collaborate with
any other co-author of Erdös.

In the following sections, we limit the size of the network by leaving out the isolated
nodes, in which case the network is a connected graph. This makes it convenient to
analyze the properties of the network and the influence of nodes.

3.2 Analyzing Properties of Co-author Network

The properties of the network we analyze include the degree distribution, the
clustering coefficient and the average distance. In addition, we look for the
small-world phenomenon in the Erdös1 network.

Degree Distribution

We consider the degree distribution of the co-authorship network first. Let $d_u$
denote the degree of node u in the co-author network, which corresponds to the number
of co-authors of individual u. The degree distribution of the network is shown in
Figure 2, where k represents the degree and p(k) is the fraction of nodes with degree
equal to k.

Figure 2. Degree distribution of co-author network

The degree distribution is plotted on a log-log scale and fitted with a linear function
of the form

$\lg(p(k)) = -\gamma \lg(k) + c$ (1)

where $\gamma$ and $c$ are the fitted constants. Both the fitted equation and the fitted
line are shown in Figure 2. The fit indicates that the degrees of the network obey a
power-law distribution with a wide range of values. This feature is in line with the
prior literature. It suggests that a small number of authors have many collaborators,
while many authors have few collaborators.
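The log-log fit can be illustrated with an ordinary least-squares line through the points (lg k, lg p(k)). The sketch below is an assumption about how such a fit might be done, not the fitting procedure behind Figure 2; the function names and the toy distribution are hypothetical, and nodes of degree 0 are skipped because lg 0 is undefined.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return {k: p(k)}, the fraction of nodes having degree k."""
    counts = Counter(degrees)
    return {k: c / len(degrees) for k, c in counts.items()}


def fit_loglog(pk):
    """Least-squares line lg p(k) = slope * lg k + c over degrees k >= 1."""
    points = [
        (math.log10(k), math.log10(p)) for k, p in pk.items() if k >= 1 and p > 0
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy distribution that follows p(k) = 0.64 * k**-2 exactly.
toy_pk = {1: 0.64, 2: 0.16, 4: 0.04, 8: 0.01}
slope, c = fit_loglog(toy_pk)
exponent = -slope  # a power law has a negative slope on the log-log plot
print(exponent, c)
```

For real data the tail of the distribution is noisy, so a plain least-squares line is only a rough estimate of the exponent.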


Clustering coefficient

The clustering coefficient of node u measures the cluster degree of its neighbours,

and can be obtained as follows:

$CC_u = \dfrac{E_u}{d_u (d_u - 1) / 2}$ (2)

where, Eu is the number of edges among neighbors of node u, and du is the degree of u.

In the Erdös1 network, the clustering coefficient is 0.28.
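A direct transcription of equation (2) is sketched below. The adjacency format (a dictionary of neighbour sets), the function names and the toy graph are assumptions, and so is the choice to report the network value as the plain average of the node values; the text does not say how the figure 0.28 was aggregated.

```python
from itertools import combinations


def clustering_coefficient(adj, u):
    """CC_u = E_u / (d_u (d_u - 1) / 2), as in equation (2)."""
    neighbours = adj[u]
    d = len(neighbours)
    if d < 2:
        return 0.0  # no pair of neighbours exists; 0 is used by convention here
    # E_u: number of edges among the neighbours of u.
    e = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return e / (d * (d - 1) / 2)


def average_clustering(adj):
    """Mean of CC_u over all nodes (an assumed aggregation)."""
    return sum(clustering_coefficient(adj, u) for u in adj) / len(adj)


# Toy graph: a triangle 1-2-3 with a pendant node 4 attached to node 1.
toy = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
print(clustering_coefficient(toy, 1))
print(average_clustering(toy))
```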

Average Distance

The average distance is the average length of the shortest paths between all pairs of
nodes. A high average distance indicates that resources, such as information, must
pass through a large number of intermediaries to travel between nodes in the
network [1].

Because the network above is not connected, we restrict attention to the giant
connected component of the graph, which includes 92.75% of all authors listed in
Erdös1. The average distance in this component is 3.82. This is a rather small value
given the size of the network, supporting the general notion that social networks form
a “small world” [10].
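The restriction to the giant component and the averaging of shortest-path lengths can be sketched with breadth-first search. The code assumes an unweighted graph stored as a dictionary of neighbour sets; the function names and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def giant_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for s in adj:
        if s not in seen:
            component = set(bfs_distances(adj, s))
            seen |= component
            if len(component) > len(best):
                best = component
    return best


def average_distance(adj):
    """Mean shortest-path length over all pairs in the giant component."""
    total = pairs = 0
    for s in giant_component(adj):
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy graph: the path 1-2-3 plus an isolated node 4.
toy = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
print(giant_component(toy), average_distance(toy))
```

Running one breadth-first search per node costs O(n(n + m)) time, which is unproblematic for a network of a few hundred nodes.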

Small world

A more rigorous analysis of the small-world property follows Watts and Strogatz [11].
Based on the observed co-author network, a random network is built with the same
number of nodes (authors) and the same average number of edges per node. By
comparing the parameters of the observed and random networks, we can determine
whether the observed network is a “small world”.

The clustering coefficient and average distance of both networks (observed and
random) are shown in Table 1.

Table 1. Small-world statistics

Variables                                    Symbol        Value
Observed data
  Authors                                    n             511
  Average number of ties per author          k             11.86
  Lobs: average distance                     l             3.82
  CCobs: clustering coefficient              CC            0.28
Random data
  Lexp: expected average distance            ln(n)/ln(k)   3.58
  CCexp: expected clustering coefficient     k/n           0.012

As presented in Table 1, we have Lobs ≈ Lexp and CCobs ≫ CCexp. Therefore, this
network does show the small-world phenomenon: any author is connected to any other
author in the network through a small number of intermediaries [11].
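The two random-network expectations listed in Table 1, ln(n)/ln(k) and k/n, are simple to evaluate. The sketch below is illustrative only: the function names and the two ratio thresholds used to decide "approximately equal" and "much greater" are hypothetical, since the text gives no numerical criterion. The computed values also depend on how k is counted and on which nodes are included, so they need not reproduce the rounded entries of the table.

```python
import math


def random_graph_expectations(n, k):
    """Expected average distance ln(n)/ln(k) and clustering k/n of a random network."""
    return math.log(n) / math.log(k), k / n


def looks_small_world(l_obs, cc_obs, n, k, l_ratio=2.0, cc_ratio=10.0):
    """Check L_obs ~ L_exp and CC_obs >> CC_exp.

    l_ratio and cc_ratio are hypothetical thresholds chosen for illustration.
    """
    l_exp, cc_exp = random_graph_expectations(n, k)
    return l_obs <= l_ratio * l_exp and cc_obs >= cc_ratio * cc_exp


# Toy values: 1000 nodes with 10 ties per node on average.
l_exp, cc_exp = random_graph_expectations(1000, 10)
print(l_exp, cc_exp)
print(looks_small_world(3.5, 0.3, 1000, 10))
```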


3.3 Co-author Influence Measure

In this section, the PageRank algorithm and the centralities of nodes are used to
measure the relative influence of co-authors. PageRank exploits the structural
information of the network, while centralities measure the importance of individual
nodes. Finally, the importance of the authors’ work is used to test the results.

PageRank for Undirected Network

PageRank is a ranking algorithm used to determine the importance of nodes in a
network. It was initially designed for ranking web pages [12], which form a directed
network of billions of nodes and links, but it can also be applied to undirected
networks [13]. We use the following PageRank algorithm for undirected networks:

Step1. Convert the undirected network to a directed one by replacing each
undirected edge with two directed edges, as shown in Figure 3.

Step2. Use the PageRank algorithm with a damping factor [13]. Given a directed
network of N nodes i = 1, 2, …, N, the PageRank $PR_i$ of the ith node is defined by
the recursion formula:

$PR_i(m+1) = d \sum_{j \in B_i} \dfrac{PR_j(m)}{k_j^{out}} + \dfrac{1-d}{N}$ (3)

Here $B_i$ is the set of nodes that point to i, $k_j^{out}$ is the out-degree of node j,
and d, called the ‘damping factor’, is a parameter that controls the performance of
the PageRank algorithm; m is the iteration index. After sufficiently many iterations,
PR converges and we obtain the PageRank of each node.

Step3. Rank the nodes by their PageRank values and test the performance of the
algorithm with different damping factors.
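Step1 and Step2 can be sketched as a plain power iteration. The code below is a minimal illustration rather than the implementation behind Table 2. It reads the recursion of Step2 as d times the sum over in-neighbours plus (1 − d)/N; the function name, the default d = 0.85, the tolerance and the toy graph are assumptions. Nodes without any edge are not part of the input, matching the restriction to non-isolated nodes.

```python
def pagerank_undirected(edges, d=0.85, tol=1e-10, max_iter=1000):
    """PageRank on an undirected graph given as a list of (u, v) pairs."""
    # Step1: every undirected edge becomes two directed edges.
    out = {}
    for u, v in edges:
        out.setdefault(u, set()).add(v)
        out.setdefault(v, set()).add(u)
    nodes = sorted(out)
    n = len(nodes)
    pr = {u: 1.0 / n for u in nodes}
    # Step2: iterate the recursion until the values stop changing.
    for _ in range(max_iter):
        new = {u: (1 - d) / n for u in nodes}
        for j in nodes:
            share = d * pr[j] / len(out[j])  # PR_j / k_j^out, damped
            for i in out[j]:
                new[i] += share
        change = sum(abs(new[u] - pr[u]) for u in nodes)
        pr = new
        if change < tol:
            break
    return pr


# Toy star graph: node 0 is linked to nodes 1, 2 and 3.
pr = pagerank_undirected([(0, 1), (0, 2), (0, 3)])
# Step3: rank nodes by decreasing PageRank.
ranking = sorted(pr, key=lambda u: (-pr[u], u))
print(ranking, pr)
```

Because every node in the input has at least one neighbour, no rank is lost at dangling nodes and the values sum to 1.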

Figure 3. Converting undirected network to directed network

We rank the authors in the Erdös1 network with the above algorithm. The top ten

authors who have significant influence are shown in Table 2.


Table 2. Top ten influential authors ranked by PageRank

Frank Harary is ranked as the most influential author in our PageRank result. This

is reasonable since he specialized in graph theory and was widely recognized as one

of the "fathers" of modern graph theory [14].

Centrality

Identifying the influence of authors with PageRank alone is not enough. Centrality
is a property that measures how central a node is in a network [15]. The most
important centrality measures are degree centrality, closeness centrality and
betweenness centrality.

Degree centrality of a node is defined as the number of ties this node has. In a
co-author network, the degree centrality of a node (author) is just the number of
authors in the network with whom she has co-authored at least one article. In
mathematical terms, the standardized degree centrality $D_{si}$ of node i is defined as:

$D_{si} = \dfrac{1}{n-1} \sum_{j} e_{ij}$ (4)

where $e_{ij} = 1$ if there is a link between nodes i and j, and $e_{ij} = 0$ if there
is no link; n is the number of authors in the network.

Closeness centrality of a node is based on the total distance from this node to all
other nodes: the smaller the total distance, the more central the node. In
mathematical terms, the standardized closeness centrality $C_{si}$ of node i is
defined as:

$C_{si} = \dfrac{n}{\sum_{j} p_{ij}}$ (5)

where $p_{ij}$ is the length of the shortest path from node i to node j.

Betweenness centrality of a node counts the shortest paths between pairs of other
nodes that pass through the given node. In mathematical terms, the standardized
betweenness centrality $B_{si}$ of node i is defined as:

$B_{si} = \dfrac{2}{(n-1)(n-2)} \sum_{j<k} \dfrac{g_{jik}}{g_{jk}}$ (6)

where $g_{jk}$ is the number of shortest paths from node j to node k (j, k ≠ i), and
$g_{jik}$ is the number of those shortest paths that pass through node i.
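The three standardized centralities can be sketched for a connected, unweighted graph as follows. This is an illustration under assumptions, not the authors' code: closeness is taken as n divided by the total distance, which is how we read equation (5) (the common n − 1 variant only rescales the values), and betweenness uses Brandes' accumulation scheme, a standard technique that the text does not name. The function names and the toy graph are hypothetical, and the graph needs at least three nodes.

```python
from collections import deque


def _bfs(adj, s):
    """Visit order, distances, shortest-path counts and predecessors from s."""
    dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
    order, queue = [], deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                preds[v] = []
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a shortest path to v
                sigma[v] += sigma[u]
                preds[v].append(u)
    return order, dist, sigma, preds


def degree_centrality(adj):
    n = len(adj)
    return {i: len(adj[i]) / (n - 1) for i in adj}


def closeness_centrality(adj):
    n = len(adj)
    return {i: n / sum(_bfs(adj, i)[1].values()) for i in adj}


def betweenness_centrality(adj):
    n = len(adj)
    score = {i: 0.0 for i in adj}
    for s in adj:
        order, _, sigma, preds = _bfs(adj, s)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):  # accumulate dependencies from the leaves back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so score/2 is the
    # sum over j < k; multiplying by 2/((n-1)(n-2)) leaves score/((n-1)(n-2)).
    return {i: score[i] / ((n - 1) * (n - 2)) for i in adj}


# Toy graph: the path 1-2-3, where node 2 bridges the other two.
path = {1: {2}, 2: {1, 3}, 3: {2}}
print(degree_centrality(path))
print(closeness_centrality(path))
print(betweenness_centrality(path))
```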

Rank  Name                   Rank  Name
1     Frank Harary*          6     Vojtech Rodl
2     Noga M. Alon           7     Zsolt Tuza
3     Ronald Lewis Graham    8     Carl Bernard Pomerance
4     Bela Bollobas          9     Zoltan Furedi
5     Vera Turan Sos         10    Joel Harold Spencer


Figure 4. Centralities of nodes in three dimensional coordinates

We calculate the above centrality measures and plot them in three-dimensional
coordinates, as shown in Figure 4.

The x, y and z axes represent degree centrality, closeness centrality and
betweenness centrality, respectively. Since the three centrality measures are used to
determine the influence of authors, the higher the centralities, the more influential
the author. Hence, nodes plotted in the upper-right region of the three-dimensional
coordinates have significant influence.

The red points in Figure 4 are the top ten influential authors in the Erdös1 network
as ranked by PageRank. The centrality results are clearly compatible with the
PageRank results, which supports the effectiveness of our model.

Empirical Support

Neither the centrality measures nor PageRank takes the importance of the authors’
work into account, although it is an essential measure. We add it in this section to
obtain an overall appraisal of the influence of authors in the Erdös1 network.

Many indexes, such as the h-index and total citations, have been studied and used to
measure the academic influence of scientific authors. All of these indexes measure the
importance of the authors’ work. To determine who in the Erdös1 network has
significant influence, we take total citations as the measure of the importance of an
author’s work. Both the h-index and the g-index have proved relatively effective in
measuring authors’ work. However, since many works of the authors in the Erdös1
network were published before 1995, and the h-index in the Scopus database [16] only
considers citations after 1995, it is not reasonable to employ the h-index here. In
addition, total citations are a good indicator because they reflect both the quantity
and the quality of an author’s work.

The method we propose is to compare the citations of our PageRank top 10 authors
with the citations of 20 other authors randomly selected from the Erdös1 network. We
visualize the numbers of citations in Figure 5 for comparison.

Figure 5. Comparison of citations

As shown in Figure 5, most of the PageRank top 10 authors have more than 2000
citations, while all randomly selected authors have fewer than 2000. This indicates
that the PageRank top 10 authors have more citations on average than the randomly
selected authors. Therefore, we report a set of authors with significant influence
rather than a single one.

4. Citation Network of Papers

4.1 Building Citation Network

Data and Data Processing

The set of papers we use to construct the citation network includes all papers from
the attached list (NetSciFoundation.pdf) and 6 other papers in the field of network
science that we collected via Google Scholar. Note that the fifth paper in
NetSciFoundation.pdf, written by Borgatti, S. and titled “Identifying sets of key
players in a network”, was not published in the journal Computational and
Mathematical Organization Theory in 2006 but in the proceedings of the International
Conference on Integration of Knowledge Intensive Multi-Agent Systems in 2003. In
fact, Borgatti has two papers with similar titles, as follows:

Borgatti S P. Identifying sets of key players in a network[C]//Integration of

Knowledge Intensive Multi-Agent Systems, 2003. International Conference on.

IEEE, 2003: 127-131.

Borgatti S P. Identifying sets of key players in a social network [J]. Computational

& Mathematical Organization Theory, 2006, 12(1): 21-34.


We chose the first one; the choice has no impact on measuring the relative importance
of papers. We label the papers with numbers from 1 to 22: ordered by publication
date, the papers in the attached list are numbered 1 to 16 and the papers we added
are numbered 17 to 22. For simplicity, we refer to papers by these ID numbers in
what follows.

Citation Network

To determine the relative influence of a given set of foundational papers, we build a
citation network whose edges are found from the reference lists of the papers.
Figure 6 shows the citation network. If paper 4 cites paper 3, there is an edge
pointing from node 4 to node 3.

Figure 6. Citation network of 22 papers in the field of network science.

4.2 “Local centrality” Measure

An intuitive idea of the influence of a paper is that a paper is influential if it is
cited by highly cited papers or cited by a large number of papers. To determine the
relative influence of the papers within the network, we introduce a model based on
citations and the idea of local centrality [17].

Chen et al. propose local centrality to identify influential nodes in complex
networks [17]. Local centrality has proved effective in measuring the importance of
nodes in large-scale networks. The local centrality $C_L(v)$ of node v is defined as:

$Q(u) = \sum_{w \in \Gamma(u)} N(w)$ (7)

$C_L(v) = \sum_{u \in \Gamma(v)} Q(u)$ (8)

where $\Gamma(u)$ is the set of the nearest neighbors of node u and N(w) is the number
of the nearest and the next nearest neighbors of node w.

The idea of local centrality is to determine the influence of node u from the number
of nodes within a certain scope around it (distance lower than 4).

Our approach is to measure the influence of paper u by the sum of the total citations
of the papers that cite paper u directly and of the papers that cite paper u indirectly
through two successive citations. In mathematical terms, our ‘local centrality’
$C'_L(u)$ of paper u is defined as:

$C'_L(u) = \sum_{v \in \Gamma(u)} C(v) + \sum_{v \in \Gamma(u)} \sum_{w \in \Gamma(v)} C(w)$ (9)

where C(u) is the number of citations of node u and $\Gamma(u)$ here denotes the set
of papers that cite paper u.
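The ‘local centrality’ of a paper can be computed directly from a table of who cites whom. The sketch below is one reading of the definition, summing C over the papers that cite u and over the papers that cite those; the input format, the function name and the toy citation counts are assumptions and do not come from the 22-paper data set.

```python
def paper_local_centrality(cited_by, citations):
    """Score each paper by the citations of its direct and second-step citers.

    cited_by[u]:  set of papers that cite paper u (hypothetical input format)
    citations[u]: C(u), the number of citations of paper u
    """
    score = {}
    for u in cited_by:
        direct = sum(citations[v] for v in cited_by[u])
        indirect = sum(
            citations[w] for v in cited_by[u] for w in cited_by.get(v, ())
        )
        score[u] = direct + indirect
    return score


# Toy chain: paper 2 cites paper 1, and paper 3 cites paper 2.
cited_by = {1: {2}, 2: {3}, 3: set()}
citations = {1: 10, 2: 5, 3: 1}
score = paper_local_centrality(cited_by, citations)
ranked = sorted(score, key=lambda u: (-score[u], u))
print(score, ranked)
```

A paper that is cited through several different chains contributes once per chain, exactly as the double sum does.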

Table 3. Influence rank of paper measured by ‘local centrality’

Rank 1 2 3 4 5 6 7 8 9 10 11

ID 5 17 18 19 1 6 10 7 8 21 3

Rank 12 13 14 15 16 17 18 19 20 21 22

ID 12 20 9 11 22 13 2 4 14 15 16

Table 3 shows the influence ranking of papers measured by ‘local centrality’. Paper 5,
which ranks first, is “Collective dynamics of ‘small-world’ networks”, widely regarded
as the most influential foundational paper in network science. Paper 17 is “On the
evolution of random graphs”, paper 18 is “Internet: Diameter of the world-wide web”,
and paper 19 is “The large-scale organization of metabolic networks”. They are all
widely read, and much important work has followed from their publication. Papers 15
and 16 rank low because they were published in 2007 and 2010, respectively, and have
no citations in the network.

4.3 Using PageRank Measure for Comparison

We also rank the set of papers by PageRank and show the result in Table 4 for
comparison.

Table 4. Influence rank of paper measured by PageRank

Rank 1 2 3 4 5 6 7 8 9 10 11

ID 5 1 17 3 6 18 8 19 10 7 21

Rank 12 13 14 15 16 17 18 19 20 21 22

ID 12 20 13 9 11 22 2 4 14 15 16

This result is similar to the ranking by ‘local centrality’ but less convincing, since
paper 3 ranks in the top 5 although it has fewer citations. The reason is that ‘local
centrality’ considers both the structure of the network and the work that follows
from a paper.


4.4 Similar Measure for Individual Researcher

The core idea of our measure is that the influence of paper u is determined by the
quality and quantity of the papers v that cite it, where the quality of a paper is
measured by its number of citations. This idea remains useful for measuring
individuals in the citation network, because an author’s influence is transmitted
through papers. In addition, co-authors share the influence of their paper.

A weighted author citation network (WACN) can be obtained as a particular projection
of the paper citation network (PCN), as shown in Figure 7 [18].

Figure 7. Projection of the PCN into a WACN [18]

Note: (a) Paper i, written by two authors i1 and i2, is cited by two papers j and k,
written by one author j1 and by two authors k1 and k2, respectively. (b) The WACN is
then generated by connecting, with a directed link, both i1 and i2 to j1, each with a
weight of 1/2, and to k1 and k2, each with a weight of 1/4 [18].
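The projection in Figure 7 can be sketched as follows. Each paper-level citation spreads a total weight of 1 over all pairs of citing and cited authors, which reproduces the weights 1/2 and 1/4 of the note. Two points are assumptions rather than statements from the text: links are directed from the citing author to the cited author, in line with the convention used for the paper citation network, and weights from several citations between the same pair of authors are added. The function name and input format are hypothetical.

```python
from collections import defaultdict


def project_to_author_network(paper_authors, paper_citations):
    """Project a paper citation network onto weighted author-to-author links.

    paper_authors[p]: list of authors of paper p
    paper_citations:  iterable of (citing_paper, cited_paper) pairs
    """
    weights = defaultdict(float)
    for citing, cited in paper_citations:
        sources = paper_authors[citing]
        targets = paper_authors[cited]
        # One citation carries total weight 1, shared by all author pairs.
        w = 1.0 / (len(sources) * len(targets))
        for a in sources:
            for b in targets:
                weights[(a, b)] += w
    return dict(weights)


# The example of Figure 7: paper i (authors i1, i2) is cited by papers j and k.
paper_authors = {"i": ["i1", "i2"], "j": ["j1"], "k": ["k1", "k2"]}
wacn = project_to_author_network(paper_authors, [("j", "i"), ("k", "i")])
for link, w in sorted(wacn.items()):
    print(link, w)
```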

We then use a weighted PageRank algorithm to rank the influence of authors in the
weighted author citation network. The weighted PageRank algorithm differs from the
ordinary one in that its edges carry different weights. The recursion formula of
weighted PageRank is a modification of formula (3):

$PR_i(m+1) = d \sum_{j \in B_i} \dfrac{w_{ji}}{s_j^{out}} PR_j(m) + \dfrac{1-d}{N}$ (10)

$s_j^{out} = \sum_{k} w_{jk}$ (11)

Here $w_{ji}$ is the weight of the directed connection from j to i, and $s_j^{out}$ is
the out-strength of node j. We simply replace $1/k_j^{out}$ in formula (3) with
$w_{ji}/s_j^{out}$, which is the probability that a random walk moves from node j to
node i.
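A weighted PageRank iteration in the spirit of formulas (10) and (11) is sketched below. It is an illustration under assumptions, not the implementation behind Figure 8: the default d = 0.85 and the fixed number of iterations are arbitrary choices, nodes without outgoing links simply pass nothing on, and the final normalization to a sum of 1 is added for convenience (it does not change the ranking). The toy weights are those of the Figure 7 example with links directed from citing to cited author, which is itself an assumption.

```python
def weighted_pagerank(weights, d=0.85, iterations=100):
    """weights: dict mapping (j, i) to w_ji, the weight of the link j -> i."""
    nodes = sorted({a for link in weights for a in link})
    n = len(nodes)
    # Out-strength s_j^out: total weight leaving node j, as in formula (11).
    strength = {u: 0.0 for u in nodes}
    for (j, _), w in weights.items():
        strength[j] += w
    pr = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new = {u: (1 - d) / n for u in nodes}
        for (j, i), w in weights.items():
            new[i] += d * pr[j] * w / strength[j]
        pr = new
    total = sum(pr.values())
    return {u: p / total for u, p in pr.items()}


# Toy author citation network taken from the Figure 7 example.
toy_weights = {
    ("j1", "i1"): 0.5, ("j1", "i2"): 0.5,
    ("k1", "i1"): 0.25, ("k1", "i2"): 0.25,
    ("k2", "i1"): 0.25, ("k2", "i2"): 0.25,
}
pr = weighted_pagerank(toy_weights)
print(sorted(pr.items(), key=lambda item: -item[1]))
```

In this toy network the two cited authors i1 and i2 receive the same score, and it is higher than the score of any citing author.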

Figure 8 shows the result. The size of a node represents its PageRank value: the
larger the node, the more influential the author. Authors such as Erdös and Strogatz,
who did innovative benchmark work, rank highly in our model.


Figure 8. Influence of individual researchers in the author citation network.

4.5 Authors and Their Papers

We conduct a consistency check by analyzing the relationship between authors and
their papers, to verify our empirical expectation that influential authors have
influential papers.

Figure 9. The relationship between authors and their papers


In Figure 9, the numbers on the left and right represent the ranks of papers and
authors, respectively. Each arrow points from an author to one of his papers. A red
line means that the influence of the paper and that of the author are consistent,
implying that an influential author has influential papers. There are more red lines
than black lines, which indicates that our rankings of both papers and individual
researchers are effective and persuasive.

4.6 University, Department, and Journal

As mentioned in the assumptions, colleges, departments and journals can be regarded
as sets of papers or sets of scientific authors. The influence of a college can be
calculated from the total influence of the scientists who work there and of the
publications of its researchers. By building co-author networks and citation
networks, the influence of a college can be evaluated easily and reasonably. The same
holds for evaluating the influence of departments and journals.

5. Implementing Our Algorithm on Facebook Dataset

In order to test our influence measurement algorithm, we apply it to a different
network. As social networking services (SNS) have boomed in the present century, we
choose to apply our algorithm to the ‘friends circles’ of Facebook so as to identify
the most influential person in the social network.

5.1 Building Social Network

Our dataset, which comes from the Stanford Network Analysis Project [19], consists
of ‘circles’ (or ‘friends lists’) from Facebook. The dataset includes node features,
circles, and ego networks.

We build the social network through a process similar to that used for the co-author
network. The two networks differ in the meaning of their edges: an edge in the social
network indicates that the two corresponding persons know each other, while an edge
in the co-author network means that the corresponding authors worked together on an
article.

Table 5. Social network measures

Variables                                    Symbol        Value
Observed data
  Nodes                                      n             4039
  Average number of ties per node            k             21.85
  Lobs: average distance                     l             3.69
  CCobs: clustering coefficient              CC            0.61
Random data
  Lexp: expected average distance            ln(n)/ln(k)   2.61
  CCexp: expected clustering coefficient     k/n           0.01

Table 5 characterizes the social network built from our dataset. We find that
Lobs ≈ Lexp and CCobs ≫ CCexp. Therefore, this network does show the small-world
phenomenon, which means that any two persons in the network are ‘close’ to each
other.

5.2 Methodology and Results

Apart from the meaning of the edges, the social network is very similar to the
co-author network. For instance, both are undirected and unweighted graphs. The
co-author network can even be regarded as a social network to some degree, because
the authors of the same article are very likely to know each other.

Given the similarity of the two networks, we use the same method to analyze the
social network. First, we employ the ‘PageRank for Undirected Network’ algorithm to
calculate the PageRank of each node and obtain a ranked list. Second, we calculate the
centralities (degree centrality, closeness centrality and betweenness centrality) of
each node. We then plot the centralities of the top 20 nodes in our PageRank list to
check that our algorithm gives sensible results. Figure 10 shows the resulting plot.
(The red points represent the top 20 nodes in the PageRank list.)

5.3 Empirical Support

Empirical analysis of social networks finds that an influential person tends to have
higher betweenness centrality or closeness centrality, because he or she is the
‘bridge’ and the center of his or her circles of friends.

Figure 10 illustrates the distribution of the three centralities of the nodes. The
top 20 nodes in our ranking list also have higher betweenness centrality and closeness
centrality, which is evidence for the correctness of our method.

Figure 10. Centralities of nodes in three dimensional coordinates


6. Sensitivity Analysis

Figure 11. Performances with different damping factors.

Note: The performance of ‘PageRank for undirected network’ algorithm with

different damping factors. The co-authorship network is used to test the algorithm

robustness.

We discuss to what extent the results depend on the parameters. Figure 11 displays
the performance of our algorithm with different damping factors. Although the
PageRank value of a given node differs for different damping factors, the ranked list
of nodes changes little, which indicates the inherent stability of our algorithm.

We also find that as the damping factor increases, the ranked list becomes ambiguous,
especially when the parameter d = 0.85 (see the green line in Figure 11) or more. In
the original PageRank algorithm [12], d was chosen to be 0.15. This value was proposed
from the empirical observation that an individual surfing the web will typically
follow of the order of 6 hyperlinks before changing the search, corresponding to a
probability d = 1/6 ≈ 0.15 [13]. Similarly, we argue that the damping factor in a
co-authorship network should be limited to the range from 0.15 to 0.85, based on our
observation of the network structure.
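One simple way to quantify how much a ranked list changes with the damping factor is to compare the top-k sets obtained for two values of d. The sketch below is a hypothetical check, not the analysis behind Figure 11: it uses d as the coefficient of the neighbour sum in the recursion, assumes an undirected graph stored as a dictionary of neighbour sets in which every node has at least one neighbour, and the function names and the overlap measure are our own choices.

```python
def pagerank(adj, d, iterations=200):
    """PageRank on an undirected graph where every node has a neighbour."""
    n = len(adj)
    pr = {u: 1.0 / n for u in adj}
    for _ in range(iterations):
        # In an undirected graph, the nodes pointing to i are its neighbours.
        pr = {
            i: (1 - d) / n + d * sum(pr[j] / len(adj[j]) for j in adj[i])
            for i in adj
        }
    return pr


def ranking(adj, d):
    """Nodes sorted by decreasing PageRank, ties broken by node label."""
    pr = pagerank(adj, d)
    return sorted(adj, key=lambda u: (-pr[u], u))


def top_k_overlap(adj, d_a, d_b, k):
    """Fraction of the top-k nodes shared by the rankings for d_a and d_b."""
    top_a = set(ranking(adj, d_a)[:k])
    top_b = set(ranking(adj, d_b)[:k])
    return len(top_a & top_b) / k


# Toy star graph: the centre stays on top for every damping factor tried.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
for d in (0.15, 0.5, 0.85):
    print(d, ranking(star, d))
print(top_k_overlap(star, 0.15, 0.85, 1))
```

An overlap close to 1 for the values of d of interest would support the claim that the ranked list changes little.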

7. Strengths and Weaknesses

7.1 Strengths

Consistency. Although our influence measurement algorithms only consider the
structure of the network, they match the empirical results (e.g., the ranked list in
the co-authorship network matches well with the authors’ citation ranking). In
addition, under small changes, like adjusting the damping factor in our algorithm,
the results of the model change little.

Flexibility. Our methods adapt easily to problems with different kinds of networks,
such as social networks. Additionally, as the time complexity is polynomial, our
algorithms are suitable for dealing with large amounts of data.

Fewer assumptions required. By using the weighted PageRank algorithm, we can reduce
the number of assumptions. In order to simplify data processing, we ignore the
strength of cooperation and citation, but we can take them into account in our
algorithm if enough data is available.

7.2 Weaknesses

Complicated data collection. In order to build the citation network, we downloaded
all the required papers and searched their references manually, which takes a lot of
time. This way of building a network cannot be applied to a large dataset.

Little consideration of node features. We treat every node equally in our algorithm.
However, some features of a node, like the author’s prestige in the co-authorship
network, should be considered in the algorithm.

8. Conclusion and Discussion

8.1 Conclusion

We build three kinds of networks (co-author network, paper citation network and
author citation network) and employ two influence measures (PageRank and local
centrality) to determine the influence of academic research. Based on these analyses,
we propose a general methodology. Furthermore, we apply our method to social network
analysis, where it performs well.

The co-author network is undirected and unweighted. We use the ‘PageRank for
Undirected Network’ algorithm to obtain a ranked list and compare it with the
centrality measures; the two agree well. To test our algorithm, we use citation counts
as a standard, and the results match. We also test the algorithm by changing the
damping factor.

The paper citation network is directed and unweighted. We employ the ‘local
centrality’ indicator as a measure of influence and compare it with the results of the
PageRank algorithm. It performs better to some extent.

The author citation network is directed and weighted. It is derived from the paper
citation network by projection. We implement the ‘weighted PageRank algorithm’ to
obtain the ranked list of authors. In addition, we build a paper-author graph to check
the consistency of our network and algorithm.

Given the similarity of the social network and the co-author network, we apply the
method used for the latter to the former and obtain good results.


8.2 Further Discussion

Science

Our methods are based on the work of many predecessors, and the data come from real
life. As mentioned above, centrality indicators and diffusion-based processes are
widely used in various kinds of complex networks and have been shown to perform well.
The results of evaluating node influence via centrality measures and via a
diffusion-based algorithm such as PageRank are highly consistent. Furthermore, the
results have empirical support: top-ranked papers have more citations, and top-ranked
authors have more influential works and more citations. Therefore, our methods are
scientifically grounded.

Understanding

The main idea of our methods comes from real life: the importance of a node
depends on the quality and quantity of the nodes that accept it.
The relationship between nodes (papers or authors) is represented as collaboration in
the co-author network and as citation in the paper citation network and the author
citation network. In the real world, we use the same idea to judge the influence of
individuals or works. Therefore, the idea is natural to understand.

Utility

In many networks, such as traffic networks, business networks and
social networks, evaluating the influence of nodes is an important task that supports
better decisions. Applying our methods to the Facebook network suggests that they
are applicable to “small world” networks, which are common in the real
world. That is to say, our methods can be widely used in many different fields.

Application Examples

Based on a network of commercial collaborations, companies can look for
suitable and excellent partners.

Based on a recommendation network (in which workers recommend a co-worker as
leader), an organization can select appropriate leaders.

Governments can make protection decisions by analyzing the importance of
nodes within electricity networks, aviation networks and transport networks.

Individuals can build effective social relationships that promote personal
development by analyzing their social network to identify important people.

