Challenges in Ranking of Universities. First International Conference on World Class Universities, Jiao Tong University, Shanghai, June 16-18, 2005. Anthony F.J. van Raan, Center for Science and Technology Studies (CWTS), Leiden University


One Basic Question:

How can we identify the best universities in the world?

6 Research Questions:

1. Research or Teaching?

2. How to Measure Performance?

3. For all universities in the world?

4. One numerical value?

5. Significance of positions?

6. How many?

Two most influential international rankings:

Shanghai Jiao Tong University (60% bibliometric)

Times Higher Education Supplement (20% bibliometric)

Important national rankings:

Germany:

CHE

DFG

We can ask experts for their judgment…

VOLUME 88, Number 13 PHYSICAL REVIEW LETTERS 1 April 2002

Truncation of Power Law Behavior in “Scale-Free” Network Models due to Information Filtering

Stefano Mossa,1,2 Marc Barthélémy,3 H. Eugene Stanley,1 and Luís A. Nunes Amaral1

1 Center for Polymer Studies and Department of Physics, Boston University, Boston, Massachusetts 02215

2 Dipartimento di Fisica, INFM UdR, and INFM Center for Statistical Mechanics and Complexity, Università di Roma “La Sapienza,” Piazzale Aldo Moro 2, I-00185, Roma, Italy

3 CEA-Service de Physique de la Matière Condensée, BP 12, 91680 Bruyères-le-Châtel, France

(Received 18 October 2001; published 14 March 2002)

We formulate a general model for the growth of scale-free networks under filtering information conditions, that is, when the nodes can process information about only a subset of the existing nodes in the network. We find that the distribution of the number of incoming links to a node follows a universal scaling form, i.e., that it decays as a power law with an exponential truncation controlled not only by the system size but also by a feature not previously considered, the subset of the network “accessible” to the node. We test our model with empirical data for the World Wide Web and find agreement.

DOI: 10.1103/PhysRevLett.88.138701 PACS numbers: 89.20.Hh, 84.35.+i, 89.75.Da, 89.75.Hc

There is a great deal of current interest in understanding the structure and growth mechanisms of global networks [1–3], such as the World Wide Web (WWW) [4,5] and the Internet [6]. Network structure is critical in many contexts such as Internet attacks [2], spread of an Email virus [7], or dynamics of human epidemics [8]. In all these problems, the nodes with the largest number of links play an important role on the dynamics of the system. It is therefore important to know the global structure of the network as well as its precise distribution of the number of links.

Recent empirical studies report that both the Internet and the WWW have scale-free properties; that is, the number of incoming links and the number of outgoing links at a given node have distributions that decay with power law tails [4–6]. It has been proposed [9] that the scale-free structure of the Internet and the WWW may be explained by a mechanism referred to as “preferential attachment” [10] in which new nodes link to existing nodes with a probability proportional to the number of existing links to these nodes. Here we focus on the stochastic character of the preferential attachment mechanism, which we understand in the following way: New nodes want to connect to the existing nodes with the largest number of links (i.e., with the largest degree) because of the advantages offered by being linked to a well-connected node. For a large network it is not plausible that a new node will know the degrees of all existing nodes, so a new node must make a decision on which node to connect with based on what information it has about the state of the network. The preferential attachment mechanism then comes into play as nodes with a larger degree are more likely to become known.
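The growth mechanism described in the excerpt above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' model: the function name `grow_network`, the parameter `visible_fraction`, the single link per new node, and the `+ 1` offset on the weights (so that nodes without incoming links can still be chosen) are all hypothetical choices made for illustration.

```python
import random


def grow_network(n_nodes, visible_fraction=1.0, seed=0):
    """Grow a network one node at a time and return the in-degree of each node.

    Each new node sees only a random subset of the existing nodes
    (information filtering) and links to one of them with probability
    proportional to in-degree + 1 (preferential attachment).
    """
    rng = random.Random(seed)
    in_degree = [1, 0]  # assumed start: two nodes, node 1 links to node 0
    for new_node in range(2, n_nodes):
        # Number of existing nodes the newcomer has information about.
        n_visible = min(new_node, max(1, int(visible_fraction * new_node)))
        visible = rng.sample(range(new_node), n_visible)
        # Better-connected visible nodes are more likely to be chosen.
        weights = [in_degree[v] + 1 for v in visible]
        target = rng.choices(visible, weights=weights, k=1)[0]
        in_degree[target] += 1
        in_degree.append(0)  # the new node starts without incoming links
    return in_degree


if __name__ == "__main__":
    full = grow_network(2000, visible_fraction=1.0, seed=1)
    filtered = grow_network(2000, visible_fraction=0.05, seed=1)
    print("largest in-degree, full information:", max(full))
    print("largest in-degree, 5% visible:", max(filtered))
```

Comparing the largest in-degree for different values of `visible_fraction` gives a rough feel for how restricted information limits the growth of the best-connected nodes.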

…or let scientific output and its impact speak:

bibliometric analysis

[Diagram labels: P, C, P(w), C(w), CPP, Research group, P, FCSP(w,f), Whole world, relevant field(s), C(w); output types: Books, Reports, Book chapters, Conf. Proceed., Journal articles within CI]

… and also field-specific!

Correlation between impact and ranking

Worldwide top-universities in life/biomedical sciences

[Figure: log-log plot with axes labelled r (1 to 1000) and CPP (1.00 to 100.00)]

Country             P 2001-2004   Share of P   C 2001-2004   Share of C   CPP/FCSm
USA                     994,650        0.270     3,747,932        0.380       1.38
JAPAN                   278,420        0.076       605,876        0.062       0.86
GREAT BRITAIN           270,517        0.073       851,704        0.086       1.22
GERMANY                 251,365        0.068       743,582        0.075       1.12
FRANCE                  180,145        0.049       490,137        0.050       1.05
PEOPLES R CHINA         162,771        0.044       174,316        0.018       0.57
ITALY                   132,091        0.036       320,773        0.033       0.95
CANADA                  131,469        0.036       382,198        0.039       1.18
SPAIN                    94,005        0.026       197,001        0.020       0.91
RUSSIA                   91,749        0.025        83,335        0.008       0.43
AUSTRALIA                83,675        0.023       213,928        0.022       1.07
NETHERLANDS              75,903        0.021       251,170        0.025       1.27
SOUTH KOREA              70,878        0.019       101,548        0.010       0.78
INDIA                    68,685        0.019        70,965        0.007       0.46
SWEDEN                   59,144        0.016       187,035        0.019       1.15
SWITZERLAND              54,484        0.015       215,893        0.022       1.39
BRAZIL                   46,005        0.012        59,758        0.006       0.6
TAIWAN                   45,948        0.012        60,273        0.006       0.72
POLAND                   42,490        0.012        54,078        0.005       0.6
BELGIUM                  40,916        0.011       116,947        0.012       1.15
Total                 3,681,790                  9,850,029
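The two decimal columns in the country table appear to be each country's share of the two bottom-row totals, which can be checked with a few lines of arithmetic. The sketch below uses three rows copied from the table; the function name `shares_and_cpp` is hypothetical, and the plain citations-per-publication ratio it also returns is not the field-normalized CPP/FCSm shown in the last column.

```python
# (country, P 2001-2004, C 2001-2004), copied from the table
ROWS = [
    ("USA", 994_650, 3_747_932),
    ("JAPAN", 278_420, 605_876),
    ("NETHERLANDS", 75_903, 251_170),
]

# Bottom-row figures of the table, assumed to be the denominators of the shares.
TOTAL_P = 3_681_790
TOTAL_C = 9_850_029


def shares_and_cpp(p, c, total_p=TOTAL_P, total_c=TOTAL_C):
    """Return (share of publications, share of citations, citations per publication)."""
    return p / total_p, c / total_c, c / p


if __name__ == "__main__":
    for country, p, c in ROWS:
        p_share, c_share, cpp = shares_and_cpp(p, c)
        print(f"{country:12s} P share {p_share:.3f}  C share {c_share:.3f}  C/P {cpp:.2f}")
```

A country whose citation share exceeds its publication share (the USA in this table) is cited more than its output volume alone would suggest, which is consistent with a CPP/FCSm above 1.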

Expert Survey Problems (methodological):

1. Biases: geographical, field-specific

2. Responding > Non-responding characteristics

3. Sample size > reliability of measurement

4. Nomination procedure

5. Scaling procedure

6. Controlling variables

7. Standard deviation scores

8. Statistical significance

Correlation between Expert Scores and Citation-Analysis-Based Scores

THE Ranking 2004

[Figure: log-log scatter plot with axes labelled C and E (tick ranges 0.1 to 1000 and 1 to 1000); fitted power law y = 53.985 x^0.0397, R^2 = 0.005]
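A power-law trend line of the kind quoted for this plot can be obtained by an ordinary least-squares fit on log-transformed values. The sketch below shows that calculation; the function name `fit_power_law` is hypothetical, the data in the usage example are synthetic, and whether the slide's R^2 was computed on log-transformed values is an assumption.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on log10(x), log10(y).

    Returns (a, b, r_squared), where r_squared is computed in log-log space.
    All x and y values must be positive.
    """
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    syy = sum((v - mean_y) ** 2 for v in ly)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                      # exponent = slope in log-log space
    a = 10 ** (mean_y - b * mean_x)    # prefactor from the intercept
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return a, b, r_squared


if __name__ == "__main__":
    # Synthetic data lying exactly on y = 2 * x**0.5
    xs = [1, 10, 100, 1000]
    ys = [2 * x ** 0.5 for x in xs]
    print(fit_power_law(xs, ys))
```

An R^2 close to 1 means the power law describes the points well; a value close to 0, as quoted for this plot, means the fitted line explains almost none of the variation.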

…so we have some problems here…

Bibliometric Analysis Problems:

1. Technical

2. Methodological

1. Technical problems:

- citing-cited mismatches

- definition & unification of institutions (specific responsibility)

2. Methodological Problems:

- Field definition

- Field-normalization of citation counts

- Black box indicators

- Highly cited scientists > highly cited article

- Article-type normalization of citation counts

- US bias

- Language bias (Germany: 25%!)

- Engineering, Social Sciences, Humanities

- Same data, same methodology, different rankings
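Field normalization of citation counts, one of the methodological problems listed above, can be sketched as follows. The slides do not define CPP/FCSm; the sketch assumes the common reading that CPP is the mean number of citations per publication and FCSm is the mean of the world-average citation rates of the fields in which those publications appeared. The function name `cpp_fcsm` and the sample numbers are hypothetical.

```python
def cpp_fcsm(publications):
    """Field-normalized impact for a set of publications.

    publications: list of (citations, field_mean_citations) pairs, where
    field_mean_citations is the assumed world-average citation rate for
    comparable publications in the same field.
    """
    n = len(publications)
    cpp = sum(c for c, _ in publications) / n    # citations per publication
    fcsm = sum(f for _, f in publications) / n   # mean field citation score
    return cpp / fcsm


if __name__ == "__main__":
    # Ten citations count for more in a field averaging 5 than in one averaging 20.
    low_citation_field = [(10, 5.0)]
    high_citation_field = [(10, 20.0)]
    print(cpp_fcsm(low_citation_field), cpp_fcsm(high_citation_field))
```

A value above 1 indicates impact above the world average of the fields concerned, which is what makes scores comparable across fields with very different citation habits.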

New Approaches:

- Iteration of Expert Survey focused on top

- Output-specific analysis for engineering fields, social sciences and humanities

- Top-10% bibliometric analysis

Top-10% Approach:

1. Identify universities with P > 200/y (N ~ 250)

2. Collect all publications of these universities

3. Ranking by:

- entire oeuvre

- top-10% of the oeuvre

in both cases: CPP and CPP/FCSm

- CPP/FCSm(top) x P(top)
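The ranking step of the top-10% approach can be sketched as below. The slides do not say how the top-10% is delineated; this sketch assumes it means the 10% most-cited publications within a university's own oeuvre, and it reuses the assumed definition of CPP/FCSm as mean citations divided by mean field citation score. All function names and sample data are hypothetical.

```python
def indicators(publications):
    """publications: list of (citations, field_mean_citations) pairs."""
    n = len(publications)
    cpp = sum(c for c, _ in publications) / n
    fcsm = sum(f for _, f in publications) / n
    return {"P": n, "CPP": cpp, "CPP/FCSm": cpp / fcsm}


def top_slice(publications, top_percent=10):
    """Return the most-cited top_percent of the oeuvre (at least one publication)."""
    ranked = sorted(publications, key=lambda pub: pub[0], reverse=True)
    k = max(1, (len(ranked) * top_percent + 99) // 100)  # integer ceiling
    return ranked[:k]


def ranking_scores(publications, top_percent=10):
    """Indicators for the entire oeuvre, for its top slice, and their product."""
    entire = indicators(publications)
    top = indicators(top_slice(publications, top_percent))
    return {
        "entire": entire,
        "top": top,
        "CPP/FCSm(top) x P(top)": top["CPP/FCSm"] * top["P"],
    }


if __name__ == "__main__":
    # Hypothetical oeuvre: 20 publications with 0..19 citations, field mean 2.0
    oeuvre = [(c, 2.0) for c in range(20)]
    print(ranking_scores(oeuvre))
```

Multiplying the normalized impact of the top slice by its size rewards universities that combine a large volume of highly cited work with high impact, rather than either alone.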

Outlook:

- Improved ranking procedures will further de-equalize universities and reinforce a scientific elite league

- Excessive evaluation hypes will lead to science destruction

- Balance has to be found by data-system improvement and automation of advanced bibliometric assessment procedures

Characteristics of a successful university in a bibliometric approach

Research profile

Output and impact per field, 2000-2003

Leiden University

[Figure: share of the output (%, axis 0 to 7) per field, with a legend for impact: low, average, high. Fields, with CPP/FCSm in parentheses:]

ASTRON & ASTROPH (1.38)
BIOCH & MOL BIOL (0.96)
ONCOLOGY (1.05)
IMMUNOLOGY (1.22)
HEMATOLOGY (1.27)
GENETICS & HERED (1.48)
PHARMACOL & PHAR (1.11)
PHYSICS,MULTIDIS (1.84)
PHYSICS, COND MA (1.21)
ENDOCRIN & METAB (0.99)
MEDICINE,GENERAL (3.35)
RAD,NUCL MED IM (1.04)
CHEM, PHYSICAL (1.00)
CARD & CARD SYST (0.95)
RHEUMATOLOGY (1.75)
CLIN NEUROLOGY (1.72)
NEUROSCIENCES (0.86)
CHEM, INORG&NUC (1.82)
PHYSICS, AT,M,C (0.87)
PERIPHL VASC DIS (1.10)
CELL BIOLOGY (1.05)
MULTIDISCIPL SC (1.31)
CHEM, ORGANIC (1.02)
PLANT SCIENCES (1.04)
PATHOLOGY (1.56)
SURGERY (1.34)
CHEMISTRY (1.60)
COMPU SCI,THEORY (1.05)
PEDIATRICS (1.56)

>50% above, and no field below international average

Research output from Shanghai (PRC)

[Figure: P and C+sc (axis 0 to 50,000) per overlapping four-year block, 1980-1983 through 2001-2004]

Normalized impact scores

[Figure: CPP/JCSm, CPP/FCSm and JCSm/FCSm (axis 0.00 to 1.20) per overlapping four-year block, 1980-1983 through 2001-2004]

Percentages 'Publications not cited' and 'self-citations'

[Figure: % P not cited and % Self-citations (axis 0% to 90%) per overlapping four-year block, 1980-1983 through 2001-2004]
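The overlapping four-year blocks used on the time axis, and the two percentages named in this chart, can be sketched as below. The exact definitions are not given on the slides, so this is a sketch under assumptions: a publication counts as not cited when its citation count is zero, the citation count includes self-citations, and the self-citation percentage is taken relative to that total. The function names and record layout are hypothetical.

```python
def four_year_blocks(first_year, last_year, length=4):
    """Overlapping blocks such as (1980, 1983), (1981, 1984), ..."""
    return [(y, y + length - 1) for y in range(first_year, last_year - length + 2)]


def block_percentages(publications, block):
    """publications: list of dicts with 'year', 'citations', 'self_citations'.

    'citations' is assumed to include self-citations.
    Returns None when the block contains no publications.
    """
    start, end = block
    pubs = [p for p in publications if start <= p["year"] <= end]
    if not pubs:
        return None
    total_citations = sum(p["citations"] for p in pubs)
    not_cited = sum(1 for p in pubs if p["citations"] == 0)
    self_citations = sum(p["self_citations"] for p in pubs)
    return {
        "pct_not_cited": 100 * not_cited / len(pubs),
        "pct_self_citations": (
            100 * self_citations / total_citations if total_citations else 0.0
        ),
    }


if __name__ == "__main__":
    # Hypothetical records, for illustration only
    records = [
        {"year": 1980, "citations": 0, "self_citations": 0},
        {"year": 1981, "citations": 4, "self_citations": 1},
    ]
    for block in four_year_blocks(1980, 1985):
        print(block, block_percentages(records, block))
```

Because consecutive blocks share three of their four years, the resulting curves are smoothed, so a single unusual year shifts several adjacent points rather than one.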

Field normalized impact scores for scientific cooperation types

[Figure: Single address, National and International (axis 0.00 to 1.20) per overlapping four-year block, 1980-1983 through 2001-2004]

Growing (inter)national scientific cooperation

[Figure: Single address, National and International (axis 0% to 100%) per overlapping four-year block, 1980-1983 through 2001-2004]

Thank you for your attention

and thanks to the Institute of Higher Education, Shanghai Jiao Tong University for organizing a first conference on this very hot topic of ranking