Google PageRank to the rescue of bibliometrics?
JL Foulley (IMAG) in collaboration with Gilles Celeux (INRIA-Select, Paris-Orsay), Simon Grah (M2, Data Science, X Telecom, Paris) & Julie Josse (CMAP, Paris-Saclay),
06/12/2016 JL Foulley- Applibugs 1
Content
• Context
•Journal Impact Factor
•Google PageRank
•Journal Influence Indicators derived from P-Rank
•Example of Statistical Journals
•Related works and extensions
•Discussion
Context
• Great demand for « objective » measures of productivity, relevance and value of scientific works from the different levels of society
• Historically measure of journal influence came first
• Journal Impact Factor based on average number of citations per article published in peer reviewed WOS publications during the 2 previous years
• Objections against JIF (& bibliometric methods)
Context/Objections
• Epistemological
  - Peer review & expert examination should prevail
  - Quality of research is not based on a citation window
• Ethical
  - Misuse, abuse and manipulation
• Socio-political
  - JCR reshaped scientist behaviour, evaluation procedures, budget allocation and international science policy
  - Indicators powered by one (or two) private companies
• Technical
  - Two-year time window
  - Self citations
  - Negative and positive appreciations joined together
• see Archambault & Larivière (2009), Seglen (1992, 1997)
Data Bases & Citation Metrics
• Journal Citation Reports (Thomson-Reuters)
  - 11365 journals, 234 disciplines
  - Impact Factor (IF)
  - EigenFactor (EF)
  - Article Influence (AI)
• Scopus Metrics (Elsevier)
  - 21500 peer-reviewed journals (4200 open access), 116000 books
  - SCImago Journal Rank (SJR)
  - Impact per Publication (IPP)
• Google Scholar Metrics
  - H-index
• Altmetrics
  - Altmetric Attention Score (News, Blogs, Twitter, Sina Weibo, Facebook, YouTube, LinkedIn)
Journal Impact Factor (JIF)
• Origins of the Journal Impact Factor: Archambault & Larivière (2009)
  - Pioneering work of Gross and Gross (1927) to help US librarians in journal selection
  - « to measure the desirability of purchasing a particular journal »
  - by tabulating the references in a particular journal, ‘The Journal of the American Chemical Society’
  - Hack (1936), Raisig (1960): ratio of no of citations to no of articles published
  - Martyn & Gilchrist (1967): two-year citation window
• Eugene Garfield, founder of the Institute for Scientific Information (ISI)
  - First published JIF in 1975 and promoted it
  - Flagship of Thomson-Reuters via Web of Science & Journal Citation Reports (JCR)
JIF/Continued
• Defined as
  - IF(A) = N/D
  - N = No of cites in 2015 to items published by A in the 2 previous yrs (ie 2014, 2013)
  - D = No of articles (citable items) published by A in the 2 previous yrs
• Example
  - Biometrika
Factors influencing JIF/Discipline
• Field dependent criterion
  - Medicine: 4.105 = 178923/43598
  - Probability & Statistics: 1.128 = 19034/16872
Factors influencing JIF/Type of paper
  - Reviews vs Research articles
      Nature Reviews Immunology: 39.416 = 4927/125
      Immunity: 24.082 = 7971/331
      Statistical Science: 2.213 = 166/75
      Biometrics: 1.360 = 287/211
  - Theoretical vs Applied
      JRSS-B: 4.222 = 304/72
      JRSS-C: 1.354 = 107/79
Factors influencing JIF/Self Citations
  - Included in IF (IF’ = IF recomputed without self citations)
      JRSS-B: IF = 4.222 = 304/72; IF’ = 4.097 = 295/72
        Self citations = 9/304 = 3.0%; Total: 111/17404 = 0.6%
      Scientometrics: IF = 2.084 = 1236/593; IF’ = 1.278 = 758/593
        Self citations = 478/1236 = 38.7%; Total: 1963/6436 = 30.5%
  - Coercion
      « To ensure that a submitted manuscript meets sufficient interest of the readership it is expected that articles recently published in the Journal of….. are cited to a reasonable extent »
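The effect of self citations on the two journals above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the counts are the ones quoted on the slide.

```python
def impact_factor(cites, citable_items):
    """IF = N / D: cites received in year t divided by citable items of the window."""
    return cites / citable_items


def impact_factor_without_self(cites, self_cites, citable_items):
    """IF' : same ratio after removing the journal's citations to itself."""
    return (cites - self_cites) / citable_items


# (cites N, self cites, citable items D) as quoted on the slide
data = {
    "JRSS-B": (304, 9, 72),
    "Scientometrics": (1236, 478, 593),
}

results = {}
for name, (n, self_n, d) in data.items():
    results[name] = (impact_factor(n, d),
                     impact_factor_without_self(n, self_n, d),
                     self_n / n)
    print(f"{name}: IF={results[name][0]:.3f}  IF'={results[name][1]:.3f}  "
          f"self-citation share={results[name][2]:.1%}")
```

The drop from IF to IF’ is small for JRSS-B but large for Scientometrics, which is the point the slide makes.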
Factors influencing JIF/Time windows
JRSS-B: IF2=(153+151)/(38+34)=304/72=4.222
Distribution of citations received by the journal according to years of publication
IF5=(153+151+142+431+302)/(38+34+32+29+29)=1179/162=7.278
Biometrika: IF2=1.130 IF5=2.016
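The two windows can be compared with a short sketch. It assumes that the yearly counts quoted for JRSS-B are listed from the most recent publication year backwards (the slide does not state the order explicitly); `windowed_if` is a hypothetical helper name.

```python
def windowed_if(cites_by_year, items_by_year, window):
    """Impact factor over the `window` most recent publication years."""
    return sum(cites_by_year[:window]) / sum(items_by_year[:window])


# JRSS-B counts from the slide, assumed ordered most recent year first
jrssb_cites = [153, 151, 142, 431, 302]   # cites received, by year of publication
jrssb_items = [38, 34, 32, 29, 29]        # citable items, by year of publication

if2 = windowed_if(jrssb_cites, jrssb_items, 2)
if5 = windowed_if(jrssb_cites, jrssb_items, 5)
print(f"IF2 = {if2:.3f}, IF5 = {if5:.3f}")
```

For this journal the older papers collect far more citations per item than the two most recent years, so the 5-year figure is much higher than the 2-year one.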
Factors influencing JIF/Time windows
BIOINFORMATICS: IF2=(4200+3757)/(731+649)=7957/1380=5.766
Distribution of citations received by the journal according to years of publication
IF5=(4200+3757+….+8012)/(731+649+….+710)=27013/3515=7.685
PNAS: IF2=9.423 IF5=10.285
Factors influencing JIF/Time windows
CA- A Cancer Journal for Clinicians:
IF2=(3655+2536)/(24+21)=6191/45=137.578
Distribution of citations received by the journal according to years of publication
IF5=(3655+…+1309)/(24+…+18)=14647/101=145.020
Technical issues with JIF
• Field dependent criterion
  - Medicine vs Probability & Statistics
• Type of paper
  - Review vs Research articles
• Self citations
  - Included
• Time window
  - Standard 2 yrs (« accidental choice »), 5 yrs also published
  - Half-life of cited papers much longer (>10 yrs in statistics) (Garfield & Sher, 1963)
• Pareto principle (highly skewed distribution of citations)
  - 90% of the IF of Nature due to 25% of its papers
• Equal weight to all citations
  - Does not depend on the importance of the citing journal
Technical issues with JIF
• Asymmetry of the items valid for the numerator & denominator of JIF
  - Numerator: any type of material
  - Denominator: only citable items; meeting abstracts, editorials, letters, corrections, book reviews, biographical items, theses, technical reports and patents are excluded
  - « Non citable items are not uncitable »: they are cited, counted in the numerator but excluded from the denominator (Liu et al, 2016)
  - Lancet IF reduced by 40% when just counting citations to citable items (Moed et al, 1995)
  - Corrections needed for journals publishing a high proportion of non-citable items, eg letters & editorials
  - Consistency in the source items of N & D in Scopus
• Name Google: misspelling of Googol = 10^100
• Headquarters in Mountain View, CA: Googleplex (Googolplex = 10^Googol)
• Cofounded by Larry Page & Sergey Brin in 1998
• Holding company « Alphabet » offering many products
  - Google Search (search engine)
  - Google Chrome (web browser)
  - Gmail (e-mail)
  - Google Drive (cloud storage)
  - YouTube (video)
  - Google Translate
  - Google Maps (web mapping)
  - AdSense (media advertisements)
  - Android (mobile OS)
  - Google Hangouts (video communication)
  - Google Fiber (broadband internet and cable television access)
  - Nexus (tablets)
  - Google Pixel (smartphone)
  - Google cars (self driving)
  - Waze (GPS based geographical navigation)
Web Search Engines
• GOOGLE 71.1%
• BING 10.6%
• BAIDU 8.7%
• YAHOO 7.5%
[Figure: Scheme of a Web Search Engine (from Langville & Meyer, 2012): User, WWW, Query Module (which pages use the query term?), Ranking Module (ranks relevant pages on scores: content, popularity), Global Ranking of Pages]
Basic Page Rank/Principles
• Query independence
  - PageRank is based on link structure to capture the « importance » of a webpage
  - Pages (j, k, l) pointing to i = inlinks to i & outlinks from j, k, l
  - Pages having high PR (k) are given more weight than j, l
  - Pages (k) linking to many other pages pass on less weight per link than j, l
[Figure: pages j, k, l pointing to page i]
Brin & Page’s 1998 article
Basic Page Rank/Example
Basic Page Rank/Network & Algorithm
[Figure: network of 4 pages] Links (page → pages it points to):
1 → {2, 3, 4}
2 → {3, 4}
3 → {1}
4 → {3, 1}

$$ r_i = \sum_{j \in B_i} \frac{r_j}{d_j} $$
where $B_i$ = subset of pages (nodes) pointing to $i$ and $d_j$ = #outlinks from page $j$

r(1) = r(3)/1 + r(4)/2
r(2) = r(1)/3
r(3) = r(1)/3 + r(2)/2 + r(4)/2
r(4) = r(1)/3 + r(2)/2
Basic Page Rank/Ex/Solutions
r=(12, 4, 9, 6)/31
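As a check of this solution, here is a minimal sketch of the basic iteration on the 4-page network (links 1→{2,3,4}, 2→{3,4}, 3→{1}, 4→{3,1}). The helper name `pagerank_step` is hypothetical; exact fractions are used only to confirm the fixed point.

```python
from fractions import Fraction

# page -> pages it links to (network of the example)
outlinks = {1: [2, 3, 4], 2: [3, 4], 3: [1], 4: [3, 1]}


def pagerank_step(r, outlinks):
    """One sweep of r_i = sum over pages j pointing to i of r_j / d_j."""
    new = {i: 0 for i in outlinks}
    for j, targets in outlinks.items():
        share = r[j] / len(targets)      # page j splits its score among its outlinks
        for i in targets:
            new[i] += share
    return new


# Iterate from the uniform vector (floating point)
r = {i: 0.25 for i in outlinks}
for _ in range(200):
    r = pagerank_step(r, outlinks)

# The stated solution is an exact fixed point of the same step
exact = {1: Fraction(12, 31), 2: Fraction(4, 31), 3: Fraction(9, 31), 4: Fraction(6, 31)}
print(r)
print(pagerank_step(exact, outlinks) == exact)
```

The iteration converges here because this small graph is strongly connected and aperiodic; in general that is not guaranteed without the adjustments introduced later in the talk.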
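Basic Page Rank/Basic theory
Links: 1 → {2, 3, 4}; 2 → {3, 4}; 3 → {1}; 4 → {1, 3}

• Adjacency matrix $\mathbf{A} = \{a_{ij}\}$, out-degrees $\mathbf{d}^{out} = \mathbf{A}\mathbf{1}$, in-degrees $\mathbf{d}^{in\,T} = \mathbf{1}^T\mathbf{A}$
$$ \mathbf{A} = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \end{bmatrix}, \qquad \mathbf{d}^{out} = \mathbf{A}\mathbf{1} = \begin{bmatrix} 3 \\ 2 \\ 1 \\ 2 \end{bmatrix}, \qquad \mathbf{d}^{in\,T} = \mathbf{1}^T\mathbf{A} = \begin{bmatrix} 2 & 1 & 3 & 2 \end{bmatrix} $$

• Transition matrix
$$ \mathbf{P} = \mathbf{D}^{-1}\mathbf{A} = \begin{bmatrix} 0 & 1/3 & 1/3 & 1/3 \\ 0 & 0 & 1/2 & 1/2 \\ 1 & 0 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 \end{bmatrix}, \qquad \mathbf{D} = \mathrm{Diag}(\mathbf{d}^{out}) $$
$$ p_{ij} = \Pr(s_{n+1} = j \mid s_n = i) = a_{ij}/a_{i+} $$

• PageRank algorithm
$$ r_j^{(k+1)} = \sum_{i=1}^{N} p_{ij}\, r_i^{(k)}, \quad j = 1, 2, \ldots, N; \qquad \mathbf{r}^{(k+1)T} = \mathbf{r}^{(k)T}\mathbf{P} $$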
Basic Page Rank/Markov Chain
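Transition matrix: $\mathbf{P} = \mathbf{D}^{-1}\mathbf{A} = \{p_{ij}\}$
$$ \mathbf{r}^{(1)T} = \mathbf{r}^{(0)T}\mathbf{P}, \quad \mathbf{r}^{(2)T} = \mathbf{r}^{(1)T}\mathbf{P} = \mathbf{r}^{(0)T}\mathbf{P}^2, \quad \ldots, \quad \mathbf{r}^{(n)T} = \mathbf{r}^{(0)T}\mathbf{P}^n $$
If $\mathbf{P}$ is irreducible (graph strongly connected) and the chain aperiodic (nodes not visited only in a number of steps multiple of an integer > 1):
• Unique stationary distribution of the DTMC: $\mathbf{r}^{(n)} \to \tilde{\mathbf{r}}$ as $n \to \infty$, for any $\mathbf{r}^{(0)}$
• $[p_{i1}^{(n)}, \ldots, p_{ij}^{(n)}, \ldots, p_{iN}^{(n)}] \to \tilde{\mathbf{r}}^T$ as $n \to \infty$ for any row $i$, where $p_{ij}^{(n)} = [\mathbf{P}^n]_{ij}$
$$ \tilde{r}_j = \sum_{i=1}^{N} p_{ij}\, \tilde{r}_i, \quad j = 1, 2, \ldots, N; \qquad \tilde{\mathbf{r}}^T = \tilde{\mathbf{r}}^T\mathbf{P} $$
• $\tilde{\mathbf{r}}$ = left eigenvector pertaining to the dominant eigenvalue of $\mathbf{P}$ (here 1), such that $\tilde{\mathbf{r}}^T\mathbf{1} = 1$
• $\tilde{r}_j$ = long-term fraction of time the process spends at page (node) $j$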
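The convergence of the rows of $\mathbf{P}^n$ can be illustrated on the 4-page example. The sketch below uses plain Python lists and repeated squaring; `matmul` is a hypothetical helper, and the power 64 is an arbitrary choice that is large enough for this small chain.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Transition matrix of the 4-page example
P = [[0, 1/3, 1/3, 1/3],
     [0, 0, 1/2, 1/2],
     [1, 0, 0, 0],
     [1/2, 0, 1/2, 0]]

Pn = P
for _ in range(6):          # six squarings give P^(2^6) = P^64
    Pn = matmul(Pn, Pn)

stationary = [12/31, 4/31, 9/31, 6/31]
for row in Pn:
    print([round(x, 6) for x in row])
```

Every row of the printed matrix agrees with (12, 4, 9, 6)/31, whatever the starting page.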
Basic Page Rank/Google Matrix
[Figure: network of 8 pages]
• {5}: dangling node: no outlinks (ex: row 5 of $\mathbf{P}$ with all elements 0)
• {6, 7, 8}: trapped into an absorbing subchain
Trick for dangling nodes
• Replace null rows by $\mathbf{1}^T/N$ (equal probabilities)
• If $\mathbf{b} = \{b_i\}_{1 \le i \le N}$ with $b_i = 1$ if $i \in DN$, 0 otherwise, then
$$ \mathbf{P}^* = \mathbf{P} + \mathbf{b}\mathbf{1}^T/N $$
Basic Page Rank/Random Surfer
[Figure: random surfer on the 8-page network, currently at node 2]
$$ \mathbf{G} = \alpha\mathbf{P}^* + (1-\alpha)\,\mathbf{1}\mathbf{1}^T/N $$
1) with probability $\alpha$, the surfer moves on the network defined by $\mathbf{P}^*$, from the current node $i$ (here 2) to another one $j$ (3 or 5), following an edge according to the transition probabilities $p^*_{ij}$;
2) with probability $1-\alpha$, the surfer "teleports" at random to any node $j$.
• $\alpha$: called the damping factor, usually set empirically to 0.85
• More generally, teleportation with transition probabilities $(u_1, \ldots, u_j, \ldots, u_N)$
• $\mathbf{G}$, called the Google matrix, is stochastic, irreducible, aperiodic
• PageRank is the stationary distribution of the corresponding DTMC, defined as $\mathbf{r}^T = \mathbf{r}^T\mathbf{G}$ and calculated using the Power Method
Basic Page Rank/Calculation
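$$ \mathbf{G} = \alpha\mathbf{P}^* + (1-\alpha)\,\mathbf{1}\mathbf{1}^T/N = \alpha\mathbf{P} + \left[\alpha\mathbf{b} + (1-\alpha)\mathbf{1}\right]\mathbf{1}^T/N $$
Power Method (Brin & Page):
$$ \mathbf{r}^{(k+1)T} = \mathbf{r}^{(k)T}\mathbf{G} = \alpha\,\mathbf{r}^{(k)T}\mathbf{P} + \left[\alpha\,\mathbf{r}^{(k)T}\mathbf{b} + (1-\alpha)\right]\mathbf{1}^T/N $$
$\mathbf{P}$: extremely sparse matrix; $\mathbf{G}$ never formed nor stored
Alternative by solving a linear system:
$$ \mathbf{x}^T(\mathbf{I} - \alpha\mathbf{P}) = \mathbf{1}^T/N, \quad \text{with } \mathbf{r}^T = \mathbf{x}^T/\mathbf{x}^T\mathbf{1} $$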
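The power method above can be sketched without ever building the dense Google matrix. The 5-page graph below is a made-up example (page "e" is dangling), not the network of the slides, and the function name, tolerance and iteration cap are illustrative choices.

```python
def pagerank(outlinks, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power method r^T <- alpha r^T P + [alpha r^T b + (1 - alpha)] 1^T / N.

    Only the sparse link structure is used; G itself is never formed.
    """
    nodes = list(outlinks)
    n = len(nodes)
    r = {i: 1.0 / n for i in nodes}
    for _ in range(max_iter):
        # r^T b: total score currently sitting on dangling nodes
        dangling_mass = sum(r[i] for i in nodes if not outlinks[i])
        base = (alpha * dangling_mass + (1 - alpha)) / n
        new = {i: base for i in nodes}
        for j in nodes:
            for i in outlinks[j]:
                new[i] += alpha * r[j] / len(outlinks[j])
        converged = sum(abs(new[i] - r[i]) for i in nodes) < tol
        r = new
        if converged:
            break
    return r


# Hypothetical graph: "e" has no outlinks (dangling node)
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c", "e"], "e": []}
r = pagerank(graph)
for page, score in sorted(r.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

Each sweep costs one pass over the links, which is why the method scales to very large sparse graphs.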
Application to Rating & Ranking Scientific Journals/Cross Citation Matrix
« Fundamental entity which contains the information describing the flow of influence among units » (Pinski & Narin, 1976)
Cason & Lubotsky, 1936, Psychological B; Daniel & Louttit, 1953, Psychology; Kessler, 1964, Physics; Price, 1981, Scientometrics; Todorov & Glänzel, 1988, Review, J of Information Science; Stigler, Stigler & Friedland, 1992, Economics; Stigler, 1994; Varin et al, 2016
• Cij = No of references journal i gives in a given period (T1) to papers published previously by journal j within a given target window (T2), ex T2 = [t-1, t-2]
• In short, Cij: No of cites i gives to j
• T1: usually 1 yr, ex 2015
• T2: target window: 2 yrs (eg 2014, 2013) or 5 yrs (eg 2014, 2013, 2012, 2011, 2010)
• Rows: i citing (issuing references)
• Columns: j cited (receiving citations)
R&R Journals/Cross Citation Matrix/Ex
Cross Citation Matrix for a subset of 12 journals from Varin et al (2016)
Citing in rows; Cited in columns
         AOS  BCS  BKA  CSSC  ENVR  ISR  JABES  JASA  JRSS-B  JSS  STCMP  STMOD
AOS      291   15   38     0     0    3      0   121      76    4      5      0
BCS       29  191   60     2     0    0      7   129      45    0     11      3
BKA       74   31   75     0     1    0      0    78      41    2      4      0
CSSC      19   11   11    29     1    3      1    29      12    4      2      4
ENVR       6    8    8     0    31    2      8    20      14    2      1      3
ISR        0    2    5     0     0   20      0     6       4    0      1      0
JABES      2   30    5     0     3    0     16    20       5    2      0      0
JASA     126   69   93     3    11    5      8   232     101    0     13      2
JRSS-B    38    5   21     0     0    1      0    40      55    2      8      0
JSS        7   13    7     0     2    1      0    13      12   91      1      4
STCMP     34   13   17     1     4    0      1    45      33    0     24      4
STMOD      2    5    9     0     0    1      0    13       8    3      0      7
Cij: References given in 2010 by journals i to papers published in journals j during 2001-2010
AOS: Annals of Statistics, BCS: Biometrics, BKA: Biometrika, CSSC: Communications in Statistics-Simulation & Computation, ENVR: Environmetrics, ISR: International Statistical Review, JABES: Journal of Agricultural, Biological and Environmental Statistics, JASA: Journal of the American Statistical Association, JRSS-B: Journal of the Royal Statistical Society, Series B, JSS: Journal of Statistical Software, STCMP: Statistics & Computing, STMOD: Statistical Modelling
R&R Journals/EigenFactor
Transition matrix $\mathbf{P}$: from citing ($i$) to cited ($j$)
$$ p_{ij} = c_{ij}/c_{i+} $$
$c_{ij}$ = No of references issued by $i$ during 1 yr to papers of $j$ published 5 yrs before
$c_{i+}$ = No of references issued by $i$ during 1 yr to all papers published 5 yrs before
Row $i$ = $[p_{i1}, p_{i2}, \ldots, p_{ij}, \ldots, p_{iN}]$, with $c_{i+} = \sum_{j=1}^{N} c_{ij}$ and $\sum_{j=1}^{N} p_{ij} = 1$
Note: $\mathbf{P}^T$ is called $\mathbf{H}$ in Bergstrom (2007) Eigenfactor: measuring the value and prestige of scholarly journals. College & Research Libraries News, 68, 314-316
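A minimal sketch of this row normalisation, using four journals (AOS, BKA, JASA, JRSS-B) taken from the 12-journal cross-citation table of Varin et al (2016) shown earlier. Note that those counts use a 2001-2010 target window rather than 5 years, and that dropping self citations by default is a choice made here for illustration; `transition_matrix` is a hypothetical helper.

```python
journals = ["AOS", "BKA", "JASA", "JRSS-B"]
# C[i][j]: references given by journal i (row, citing) to journal j (column, cited)
C = [[291, 38, 121, 76],
     [74, 75, 78, 41],
     [126, 93, 232, 101],
     [38, 21, 40, 55]]


def transition_matrix(C, drop_self=True):
    """p_ij = c_ij / c_i+ ; assumes every journal cites at least one other journal."""
    P = []
    for i, row in enumerate(C):
        counts = [0 if (drop_self and i == j) else c for j, c in enumerate(row)]
        total = sum(counts)                      # c_i+
        P.append([c / total for c in counts])
    return P


P = transition_matrix(C)
for name, row in zip(journals, P):
    print(name, [round(p, 3) for p in row])
```

A journal with no outgoing references (a dangling node) would need the special treatment described on the following slide before this division is possible.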
R& R Journals/EF & AI
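$\mathbf{a} = \{a_i\}_{1 \le i \le N}$: $a_i$ = No of articles published by $i$ during the 5 yr time window; $\tilde{a}_i = a_i / \sum_{k=1}^{N} a_k$
Dangling nodes: null rows replaced by $\tilde{\mathbf{a}}^T$
$$ \mathbf{P}^* = \mathbf{P} + \mathbf{b}\tilde{\mathbf{a}}^T, \qquad \mathbf{b} = \{b_i\}_{1 \le i \le N}, \quad b_i = 1 \text{ for } i \in DN,\ 0 \text{ otherwise} $$
Google matrix:
$$ \mathbf{G} = \alpha\mathbf{P}^* + (1-\alpha)\,\mathbf{1}\tilde{\mathbf{a}}^T $$
Let $\tilde{\mathbf{r}}$ be the leading eigenvector, $\tilde{\mathbf{r}}^T = \tilde{\mathbf{r}}^T\mathbf{G}$ ("Weighted PageRank"):
$$ \tilde{r}_j = \alpha \sum_{i=1}^{N} p^*_{ij}\, \tilde{r}_i + (1-\alpha)\, \tilde{a}_j \sum_{i=1}^{N} \tilde{r}_i $$
Then EigenFactor (EF), "total influence of journal $i$", defined as
$$ EF_i = \frac{[\mathbf{P}^T\tilde{\mathbf{r}}]_i}{\sum_{k=1}^{N} [\mathbf{P}^T\tilde{\mathbf{r}}]_k} $$
EF defined not as $\tilde{\mathbf{r}}$ but using $\mathbf{P}$ without self citations, dangling nodes, teleportation
The Article Influence (AI) score:
$$ AI_i = EF_i / \tilde{a}_i $$
AI = per article influence, as JIF; size free score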
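The EF and AI definitions above can be sketched as follows. The citation counts are the AOS, BKA, JASA, JRSS-B block of the Varin et al (2016) table, but the article counts are invented for illustration, so the resulting scores are not real Eigenfactor values; the function name and iteration count are also assumptions.

```python
def eigenfactor(C, articles, alpha=0.85, n_iter=500):
    """Sketch of EF and AI: weighted PageRank, then one step of P without teleportation."""
    n = len(C)
    a_tilde = [a / sum(articles) for a in articles]

    # P: row-normalised citations without self citations (null rows stay null)
    P = []
    for i in range(n):
        row = [0 if i == j else C[i][j] for j in range(n)]
        s = sum(row)
        P.append([c / s for c in row] if s else [0.0] * n)
    dangling = [not any(P[i]) for i in range(n)]

    # Power method on G = alpha P* + (1 - alpha) 1 a~^T, with P* = P + b a~^T
    r = [1.0 / n] * n
    for _ in range(n_iter):
        d_mass = sum(r[i] for i in range(n) if dangling[i])
        r = [alpha * sum(r[i] * P[i][j] for i in range(n))
             + (alpha * d_mass + (1 - alpha)) * a_tilde[j]
             for j in range(n)]

    flow = [sum(r[i] * P[i][j] for i in range(n)) for j in range(n)]   # [P^T r]_j
    ef = [f / sum(flow) for f in flow]
    ai = [e / a for e, a in zip(ef, a_tilde)]
    return ef, ai


journals = ["AOS", "BKA", "JASA", "JRSS-B"]
C = [[291, 38, 121, 76],
     [74, 75, 78, 41],
     [126, 93, 232, 101],
     [38, 21, 40, 55]]
articles = [400, 350, 600, 200]      # hypothetical article counts a_i

ef, ai = eigenfactor(C, articles)
for name, e, a in zip(journals, ef, ai):
    print(f"{name}: EF={e:.3f}  AI={a:.3f}")
```

EF grows with journal size, whereas AI divides by the article share and is therefore comparable to a per-article indicator such as the JIF.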
Journal Abbreviations (from Varin et al, 2016)
R&R Journals/Example
Rating & Ranking of 47 Statistical Journals in 2015 on Impact Factor (IF), EigenFactor (EF) and Article Influence Scores (AI) from citations given in 2015 to items published 2 yrs (IF) or 5 yrs before (EF, AI)
(Source: Journal Citation Reports)
No Journal IF No Journal EF No Journal AI
1 SMMR 4.634 1 AOS 0.04095 1 JRSS-B 7.822
2 JRSS-B 4.222 2 STMED 0.03354 2 JSS 5.283
3 AOS 2.780 3 JASA 0.03342 3 AOS 5.099
4 JSS 2.379 4 JSS 0.03124 4 STSCI 3.757
5 STSCI 2.213 5 CSDA 0.02068 5 JASA 3.641
6 BIOST 2.109 6 JRSS-B 0.01916 6 SMMR 2.877
7 STCMP 1.786 7 BKA 0.01630 7 BKA 2.785
8 JCGS 1.755 8 BCS 0.01548 8 JCGS 2.299
9 JASA 1.725 9 JSPI 0.01460 9 BIOST 2.017
10 JRSS-A 1.702 10 JMA 0.01440 10 STCMP 2.011
11 STMED 1.533 11 STSIN 0.01200 11 STSIN 1.978
12 TECH 1.435 12 SPL 0.01142 12 ISR 1.890
13 BERN 1.372 13 STSCI 0.01000 13 BERN 1.692
14 BCS 1.360 14 BERN 0.00955 14 STATAJ 1.639
15 JRSS-C 1.354 15 STCMP 0.00934 15 TECH 1.604
16 STATAJ 1.292 16 JCGS 0.00915 16 TEST 1.582
17 ISR 1.240 17 BIOST 0.00842 17 BCS 1.548
18 AMS 1.215 18 SMMR 0.00740 18 JRSS-A 1.459
19 TEST 1.207 19 CSTM 0.00554 19 STMED 1.394
20 CSDA 1.179 20 STATAJ 0.00550 20 JRSS-C 1.331
21 ENVR 1.160 21 JSCS 0.00526 21 SJS 1.304
22 BKA 1.130 22 SJS 0.00479 22 JTSA 1.153
23 JTSA 1.000 23 JRSS-A 0.00457 23 JMA 1.072
24 STMOD 0.932 24 JAS 0.00454 24 ENVR 1.009
25 SJS 0.908 25 ENVR 0.00443 25 CSDA 0.939
26 JBS 0.882 26 TECH 0.00439 26 AMS 0.915
27 JMA 0.857 27 JTSA 0.00415 27 AISM 0.836
28 STSIN 0.838 28 JRSS-C 0.00413 28 STMOD 0.830
29 LDA 0.810 29 BIOJ 0.00346 29 LDA 0.812
30 JABES 0.790 30 TEST 0.00335 30 BIOJ 0.791
31 STPAP 0.781 31 AISM 0.00315 31 JSPI 0.770
32 EES 0.769 32 JBS 0.00313 32 CJS 0.753
33 AISM 0.768 33 JNS 0.00302 33 STNEE 0.738
34 JSCS 0.749 34 ISR 0.00292 34 JNS 0.726
35 JSPI 0.727 35 CSSC 0.00260 35 JABES 0.690
36 BIOJ 0.683 36 AMS 0.00248 36 JBS 0.522
37 MTKA 0.595 37 STPAP 0.00245 37 JSCS 0.506
38 STATS 0.530 38 CJS 0.00237 38 STPAP 0.504
39 CMPST 0.520 39 CMPST 0.00221 39 MTKA 0.502
40 SPL 0.506 40 MTKA 0.00205 40 ANZS 0.495
41 JNS 0.446 41 STATS 0.00180 41 SPL 0.493
42 STNEE 0.432 42 LDA 0.00178 42 EES 0.442
43 ANZS 0.431 43 JABES 0.00172 43 CMPST 0.417
44 JAS 0.419 44 STMOD 0.00148 44 STATS 0.358
45 CJS 0.413 45 STNEE 0.00138 45 JAS 0.343
46 CSSC 0.397 46 EES 0.00118 46 CSSC 0.251
47 CSTM 0.300 47 ANZS 0.00096 47 CSTM 0.239
R& R Journals/IF & AI
Despite high correlation between IF & AI,
there are major changes in ranking
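A few rows of the 2015 table are enough to see such changes. The sketch below simply compares the IF rank and the AI rank of a handful of journals read off that table; the selection of journals is arbitrary.

```python
# (rank on IF, rank on AI) among the 47 journals, read from the 2015 table
ranks = {
    "SMMR": (1, 6),
    "JRSS-B": (2, 1),
    "AOS": (3, 3),
    "JSS": (4, 2),
    "STSCI": (5, 4),
    "JASA": (9, 5),
    "JRSS-A": (10, 18),
    "STMED": (11, 19),
    "BKA": (22, 7),
}

# Positive shift: the journal ranks better on AI than on IF
shifts = {name: if_rank - ai_rank for name, (if_rank, ai_rank) in ranks.items()}
for name, shift in sorted(shifts.items(), key=lambda kv: -kv[1]):
    print(f"{name:7s} IF rank {ranks[name][0]:2d} -> AI rank {ranks[name][1]:2d} ({shift:+d})")
```

Biometrika (BKA), for instance, moves from rank 22 on IF to rank 7 on AI.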
Rating & Ranking Journals/Prestige vs Popularity
R& R/Previous works/Pinski-Narin
Pinski G & Narin F (1976): Influence Weight
$$ W_j^{(k+1)} = \sum_{i=1}^{N} \frac{c_{ij}}{s_j}\, W_i^{(k)}, \quad j = 1, 2, \ldots, N; \qquad s_j = c_{j+} = \sum_{k=1}^{N} c_{jk} $$
normalized such that $\sum_{j=1}^{N} s_j W_j^{(k)} / \sum_{j=1}^{N} s_j = 1$
$$ W_j^{(0)} = 1, \qquad W_j^{(1)} = c_{+j}/c_{j+} = \text{cites received/cites given by } j $$
(Arbel & Robert, 2015)
$$ W_j = \sum_{i=1}^{N} \frac{c_{ij}}{s_j} W_i \iff \frac{s_j W_j}{s_+} = \sum_{i=1}^{N} p_{ij}\, \frac{s_i W_i}{s_+} \iff r_j = \sum_{i=1}^{N} p_{ij}\, r_i, \quad \text{with } r_i = \frac{s_i W_i}{s_+},\ \sum_{i=1}^{N} r_i = 1 $$
Equivalent to $\mathbf{r}^T = \mathbf{r}^T\mathbf{P}$ with $W_i = r_i/\tilde{s}_i$
$\tilde{s}_i = s_i/s_+$: proportion of references journal $i$ gives to other journals
R& R/Previous works/Pinski-Narin
Pinski G & Narin F (1976): Influence Weight
Equivalent to $\mathbf{r}^T = \mathbf{r}^T\mathbf{P}$ and $W_i = r_i/\tilde{s}_i$
$\tilde{s}_i = s_i/s_+$: proportion of references journal $i$ gives to other journals
$AI_i = r_i/\tilde{a}_i$, with $\tilde{a}_i = a_i/a_+$: prop. of total papers from journal $i$ published in $t-1, \ldots, t-T$
• Total influence $TI_i = r_i$: "EF" without teleportation ($\alpha = 1$)
• Influence per citation, "Weight influence": $W_i = r_i/\tilde{s}_i$
• Influence per publication: $IPP_i = W_i s_i/a_i \propto r_i/\tilde{a}_i$: AI without teleportation ($\alpha = 1$)
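The Pinski-Narin recursion can be sketched on the same four journals used earlier (AOS, BKA, JASA, JRSS-B, counts from the Varin et al table). Self citations are set to zero here, which is an assumption made for the illustration, and `influence_weights` is a hypothetical name.

```python
def influence_weights(C, n_iter=500):
    """Iterate W_j <- sum_i (c_ij / s_j) W_i, starting from W = 1."""
    n = len(C)
    s = [sum(row) for row in C]          # s_i = c_i+ : references given by journal i
    W = [1.0] * n
    for _ in range(n_iter):
        W = [sum(C[i][j] * W[i] for i in range(n)) / s[j] for j in range(n)]
    return W


journals = ["AOS", "BKA", "JASA", "JRSS-B"]
# Off-diagonal counts only (self citations dropped for this illustration)
C = [[0, 38, 121, 76],
     [74, 0, 78, 41],
     [126, 93, 0, 101],
     [38, 21, 40, 0]]

W1 = influence_weights(C, n_iter=1)      # first step: cites received / cites given
W = influence_weights(C)
for name, w1, w in zip(journals, W1, W):
    print(f"{name}: W(1)={w1:.3f}  W={w:.3f}")
```

The normalisation $\sum_j s_j W_j / \sum_j s_j = 1$ is preserved automatically by the recursion, so no rescaling step is needed in this sketch.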
R& R/Related approaches
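Invariant Method: Palacios-Huerta I & Volij (2004) (Axiomatic approach)
$$ v_j = \sum_{i=1}^{N} \frac{c_{ij}}{c_i}\,\frac{a_i}{a_j}\, v_i, \quad j = 1, 2, \ldots, N $$
with $c_i = c_{i+}$ and $c_i/a_i$ = reference intensity
$$ \mathbf{v}^T = \mathbf{v}^T\mathbf{D}_a\mathbf{D}_c^{-1}\mathbf{C}\mathbf{D}_a^{-1} \iff \mathbf{v}^T\mathbf{D}_a = \mathbf{v}^T\mathbf{D}_a\mathbf{D}_c^{-1}\mathbf{C} $$
where $\mathbf{D}_a = \mathrm{Diag}(a_i)$ and $\mathbf{D}_c = \mathrm{Diag}(c_i)$
$\mathbf{v} \propto$ 'Article Influence' without teleportation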
R& R/Related approaches
Audience Factor: Zitt & Small (2008), Zitt (2010)
Citations weighted by the reciprocal of the average no of references per article from the citing journal ($m_i$) in $t-1, \ldots, t-5$ yrs (to correct for field dependency of IF)
$$ AF_j = \sum_{i=1}^{N} w_i\, \frac{c_{ij}}{a_j}, \qquad w_i = m/m_i, \qquad m_i = c_{i+}/a_i $$
with $m$ the corresponding average over all citing journals
If $v_i$ = cst (ie $\alpha = 0$ in EF): IM $\Rightarrow v_j \propto AF_j$
R& R /Previous works/SCImago
Borja González-Pereira, Vicente P Guerrero-Bote, Félix Moya-Anegón (2010) The SJR indicator. Journal of Informetrics, 4, 379-391
$$ PSJR_j^{(k+1)} = \underbrace{\frac{1-d-e}{N}}_{(1)} + \underbrace{e\,\tilde{a}_j}_{(2)} + \underbrace{d\left[C_F \sum_{i=1}^{N} p_{ij}\, PSJR_i^{(k)} + \tilde{a}_j \sum_{i \in DN} PSJR_i^{(k)}\right]}_{(3)} $$
(1), (2): "teleportation"; (3): influence of journals citing $j$
$d = 0.9$; $e = 0.0999$; $\tilde{a}_i = a_i/a_+$; $C_F = (1 - \mathbf{r}^T\mathbf{b})/(\mathbf{r}^T\mathbf{P}\mathbf{1})$: normalizing constant
$SJR_i = PSJR_i/\tilde{a}_i$: Prestige Size Free Index (by article normalisation as in AI)
Main differences with EF, AI
  - PSJR based completely on Weighted PageRank
  - Damping factor: 0.9 in SJR vs 0.85 in EF
  - Time window: 3 yrs vs 5 yrs in EF, AI
  - Self citations included (max 33%)
  - Using Scopus database (Elsevier) vs Thomson-Reuters
SCImago (2007) SJR - SCImago Journal & Country Rank. Retrieved July 21, 2015, from http://www.scimagojr.com
R&R/Bayesian reinterpretation of the G matrix
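$i$-th row of $\mathbf{G}$ is $\mathbf{g}_i^T = \alpha\,\mathbf{p}_i^T + (1-\alpha)\,\tilde{\mathbf{a}}^T$, with element $j$: $g_{ij} = \alpha\, p_{ij} + (1-\alpha)\, \tilde{a}_j$
1) $i$-th (citing) row of $\mathbf{C}$ is $\mathbf{c}_i^T = (c_{i1}, \ldots, c_{ij}, \ldots, c_{iN})$
$$ \mathbf{c}_i \mid \boldsymbol{\theta}_i \sim \mathcal{M}\left(s_i,\ \boldsymbol{\theta}_i = \{\theta_{ij}\};\ j = 1, \ldots, N\right) $$
where $s_i = \sum_{j=1}^{N} c_{ij}$ = No of citations given by $i$ to all journals (outlinks from $i$)
2) $\boldsymbol{\theta}_i \mid \boldsymbol{\gamma} \sim \mathcal{D}\left(\boldsymbol{\gamma} = \{\gamma_j\};\ j = 1, \ldots, N\right)$ (Dirichlet distribution)
3) $\boldsymbol{\theta}_i \mid \mathbf{c}_i, \boldsymbol{\gamma} \sim \mathcal{D}\left(c_{ij} + \gamma_j;\ j = 1, \ldots, N\right)$
$$ E(\theta_{ij} \mid \mathbf{c}_i, \boldsymbol{\gamma}) = \frac{c_{ij} + \gamma_j}{s_i + \lambda} = \underbrace{\frac{s_i}{s_i + \lambda}}_{\alpha_i}\, \hat{p}_{ij} + \underbrace{\frac{\lambda}{s_i + \lambda}}_{1 - \alpha_i}\, \pi_j, \qquad \hat{p}_{ij} = c_{ij}/s_i,\quad \lambda = \sum_{j=1}^{N} \gamma_j,\quad \pi_j = \gamma_j/\lambda $$
$i$-th row of $\mathbf{G}^{new}$:
$$ \mathbf{g}_i^{*T} = \alpha_i\, \mathbf{p}_i^T + (1 - \alpha_i)\, \boldsymbol{\pi}^T $$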
R&R/Bayesian reinterpretation of the G matrix
1) Predetermined values assigned to $\lambda$
$\lambda$: tantamount to a no of citations given by a journal; similar approach as for historical data
$$ \lambda = k\, s^{(H)}, \quad \text{where } s^{(H)} = \text{harmonic mean of the } s_i;\quad k = 0.15 \text{ to } 0.25 $$
$$ \boldsymbol{\pi} = \{\pi_j\}_{1 \le j \le N}, \qquad \pi_j = c_{+j}/c_{++} = \text{prop of citations to } j \text{ by all journals} $$
$i$-th row of $\mathbf{G}^{new}$:
$$ \mathbf{g}_i^{*T} = \alpha_i\, \mathbf{p}_i^T + (1 - \alpha_i)\, \boldsymbol{\pi}^T $$
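A minimal sketch of this shrinkage with a predetermined $\lambda = k\, s^{(H)}$, again on the four-journal block (AOS, BKA, JASA, JRSS-B) of the Varin et al table with self citations removed. The value k = 0.2 is just one point in the 0.15 to 0.25 range quoted above, and `shrunk_rows` is a hypothetical name.

```python
from statistics import harmonic_mean


def shrunk_rows(C, k=0.2):
    """Rows g_i* = alpha_i p_i + (1 - alpha_i) pi, with alpha_i = s_i / (s_i + lambda)."""
    n = len(C)
    s = [sum(row) for row in C]                       # s_i: citations given by journal i
    total = sum(s)                                    # c_++
    pi = [sum(C[i][j] for i in range(n)) / total for j in range(n)]   # pi_j = c_+j / c_++
    lam = k * harmonic_mean(s)                        # lambda = k * harmonic mean of s_i
    alphas = [s_i / (s_i + lam) for s_i in s]
    # (c_ij + lambda pi_j) / (s_i + lambda) is the same thing as the weighted average
    G = [[(C[i][j] + lam * pi[j]) / (s[i] + lam) for j in range(n)] for i in range(n)]
    return G, alphas, pi


journals = ["AOS", "BKA", "JASA", "JRSS-B"]
C = [[0, 38, 121, 76],
     [74, 0, 78, 41],
     [126, 93, 0, 101],
     [38, 21, 40, 0]]

G, alphas, pi = shrunk_rows(C)
for name, a in zip(journals, alphas):
    print(f"{name}: alpha_i = {a:.3f}")
```

Journals issuing few references get a smaller $\alpha_i$, so their rows are pulled more strongly towards the common profile $\boldsymbol{\pi}$, unlike the single damping factor of the Google matrix.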
R&R/Bayesian reinterpretation of the G matrix
2) Empirical Bayes, eg via BLUP
$$ \hat{\lambda} = (1 - \hat{\rho})/\hat{\rho}, \quad \text{where } \rho = \text{intra-class correlation (ANOVA estimators in Landis & Koch, 1977)} $$
$$ \hat{\pi}_j = \sum_{i=1}^{N} \alpha_i\, p_{ij} \Big/ \sum_{i=1}^{N} \alpha_i \quad \text{: weighted mean of the rows of the transition matrix} $$
$i$-th row of $\mathbf{G}^{new}$:
$$ \mathbf{g}_i^{*T} = \alpha_i\, \mathbf{p}_i^T + (1 - \alpha_i)\, \hat{\boldsymbol{\pi}}^T $$
Limiting cases:
• $\rho \to 0 \Rightarrow \alpha_i \to 0 \Rightarrow g^*_{ij} \to \pi_j = c_{+j}/c_{++} \propto a_j\, IF_j$ for any $i$
• $\rho \to 1$ and/or $s_i$ large $\Rightarrow \alpha_i \to 1$, $\mathbf{G}^* \to \mathbf{P}$ (no teleportation): Pinski-Narin
More generally
$$ \mathbf{g}_i^{*T} = \mathbf{p}_i^T\mathbf{A}_i + \boldsymbol{\pi}^T(\mathbf{I}_N - \mathbf{A}_i), \qquad \mathbf{A}_i = f(s_i, \mathbf{G}_B, \mathbf{G}_W) $$
with $\mathbf{G}_B$, $\mathbf{G}_W$: Between, Within components of var-covar among citing items (Quaas & Van Vleck, 1980; Foulley et al, 1987)
3) Hierarchical Bayes
$$ \boldsymbol{\pi} \sim \mathcal{D}(\mathbf{c}), \qquad \lambda \sim \mathcal{G}(a, b) $$
R&R/P matrix estimated via g-logit model
4) $\mathbf{P}$ estimated by alternative models: g-logit model with random effects
1) $i$-th row of $\mathbf{C}$ is $\mathbf{c}_i^T = (c_{i1}, \ldots, c_{ij}, \ldots, c_{iN})$
$$ \mathbf{c}_i \mid \boldsymbol{\theta}_i \sim \mathcal{M}\left(s_i,\ \boldsymbol{\theta}_i = \{\theta_{ij}\};\ j = 1, \ldots, N\right) $$
where $s_i = \sum_{j=1}^{N} c_{ij}$ = No of citations given by $i$ to all journals (outlinks from $i$)
2)
$$ \theta_{ij} = \frac{\exp(\beta_j + u_{ij})}{1 + \sum_{j=1}^{N-1} \exp(\beta_j + u_{ij})}, \quad j = 1, \ldots, N-1 $$
$\beta_j$: fixed effect
$\mathbf{u}_i = (u_{i1}, \ldots, u_{ij}, \ldots, u_{iJ})^T \sim \mathcal{N}(\mathbf{0}, \mathbf{D})$, with $J = N-1$
$\mathbf{D}$: saturated, diagonal (homogeneous or not)
3) $\mathbf{D} \sim \mathcal{IW}(\cdot,\cdot)$ or $D_{jj} \sim \mathcal{IG}(\cdot,\cdot)$
Citation distribution
Larivière et al (2016): citations received in 2014 by the 573 citable papers of Biology Letters published in 2002-2013
No of citations   No of papers   Cumulative %
 0   113   19.72
 1    93   35.95
 2    97   52.88
 3    85   67.71
 4    62   78.53
 5    45   86.38
 6    28   91.27
 7    16   94.07
 8    10   95.81
 9     7   97.03
10     3   97.56
11     4   98.25
12     2   98.60
13     1   98.78
14     1   98.95
15     0   98.95
16     0
17     1
18     2
19     0
20     0
21     1
22     0
23     1
24     0
25     0
26     0
27     0
28     1
[Figure: Citation distribution]
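The skewness of this distribution shows up when the mean (which is what an impact-factor-type ratio measures) is compared with the median. The sketch below uses the counts of the table above; the variable names are illustrative.

```python
# number of citations -> number of papers (Biology Letters table above)
papers_by_cites = {
    0: 113, 1: 93, 2: 97, 3: 85, 4: 62, 5: 45, 6: 28, 7: 16, 8: 10, 9: 7,
    10: 3, 11: 4, 12: 2, 13: 1, 14: 1, 15: 0, 16: 0, 17: 1, 18: 2, 19: 0,
    20: 0, 21: 1, 22: 0, 23: 1, 24: 0, 25: 0, 26: 0, 27: 0, 28: 1,
}

n_papers = sum(papers_by_cites.values())
n_cites = sum(k * n for k, n in papers_by_cites.items())
mean = n_cites / n_papers

# Median: smallest citation count whose cumulative frequency reaches the middle paper
middle = (n_papers + 1) / 2
cumulative = 0
for k in sorted(papers_by_cites):
    cumulative += papers_by_cites[k]
    if cumulative >= middle:
        median = k
        break

above_mean = sum(n for k, n in papers_by_cites.items() if k > mean) / n_papers
print(f"papers={n_papers}, citations={n_cites}, mean={mean:.2f}, median={median}, "
      f"share of papers above the mean={above_mean:.1%}")
```

The mean sits well above the median, and fewer than half of the papers reach it.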
R&R/Poisson QS loglinear, Bradley-Terry & Stigler models
A) Poisson QS loglinear model: Caussinus (1965), Goodman (2002)
$i$ citing in rows & $j$ cited in columns
1) $C_{ij} \mid \mu_{ij} \sim_{id} \mathcal{P}(\mu_{ij})$
2) $\ln \mu_{ij} = \beta_i^{(1)} + \beta_j^{(2)} + \beta_{ij}^{(12)}$
3) $\beta_{ij}^{(12)} = \beta_{ji}^{(12)}$ for any $i \ne j = 1, 2, \ldots, N$
$N_P = (N-1)(N+2)/2$; $DL_R = (N-1)(N-2)/2$

B) Bradley-Terry model for pairwise comparisons
$$ C_{ij} \mid N_{ij}, \pi_{ij} \sim \mathcal{B}(N_{ij}, \pi_{ij}), \quad \text{where } N_{ij} = C_{ij} + C_{ji}, \quad \pi_{ij} = \frac{\mu_{ij}}{\mu_{ij} + \mu_{ji}} $$
$$ \mathrm{logit}\, \pi_{ij} = \ln\frac{\pi_{ij}}{\pi_{ji}} = \ln\frac{\mu_{ij}}{\mu_{ji}} = \left(\beta_j^{(2)} - \beta_j^{(1)}\right) - \left(\beta_i^{(2)} - \beta_i^{(1)}\right) = \theta_j - \theta_i $$
$\theta_i = \beta_i^{(2)} - \beta_i^{(1)}$: Cited - Citing effect (relative to a reference journal)
Stigler (1994); "Export Score" in Varin et al (2015)
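One way to fit such a Bradley-Terry model is the classical minorisation-maximisation (Zermelo-type) iteration, sketched below on the four-journal block AOS, BKA, JASA, JRSS-B of the Varin et al table. This is a swapped-in fitting technique chosen for the illustration, not necessarily the procedure used by the authors, and `export_scores` is a hypothetical name.

```python
import math


def export_scores(C, n_iter=2000):
    """Bradley-Terry fit by MM iterations; returns centred log-abilities theta_j.

    A 'win' for journal j against i is a citation from i to j (C[i][j]).
    """
    n = len(C)
    received = [sum(C[i][j] for i in range(n) if i != j) for j in range(n)]
    w = [1.0] * n
    for _ in range(n_iter):
        new = [received[j] / sum((C[i][j] + C[j][i]) / (w[i] + w[j])
                                 for i in range(n) if i != j)
               for j in range(n)]
        scale = sum(new)
        w = [x / scale for x in new]      # abilities are defined up to a constant factor
    theta = [math.log(x) for x in w]
    mean = sum(theta) / n
    return [t - mean for t in theta]


journals = ["AOS", "BKA", "JASA", "JRSS-B"]
# C[i][j]: citations from journal i (citing) to journal j (cited); self citations ignored
C = [[0, 38, 121, 76],
     [74, 0, 78, 41],
     [126, 93, 0, 101],
     [38, 21, 40, 0]]

theta = export_scores(C)
for name, t in sorted(zip(journals, theta), key=lambda kv: -kv[1]):
    print(f"{name}: export score = {t:+.3f}")
```

On this subset JRSS-B comes out on top, since it receives about twice as many citations from each of the other three journals as it gives to them.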
R&R/Relationship between BT and AHP-ANP
Analytic Hierarchy and Network Process based on the Pairwise Comparison Matrix (PCM)
$$ \mathbf{A} = [a_{ij}]: \quad a_{ij} = 1/a_{ji} = c_{ij}/c_{ji}, \quad a_{ii} = 1; \qquad \text{AHP: } a_{ij} \approx w_j/w_i $$
Several methods to get $w_i$ (Saaty, 1977)
Transition matrix: $\mathbf{P} = [a_{ij}/a_{i+}]$
Solve $\mathbf{r}^T = \mathbf{r}^T\mathbf{P}$ for $\mathbf{r}$ (Choo & Wedley, 2009)
Remember in BT: $\ln(\mu_{ij}/\mu_{ji}) = \theta_j - \theta_i$
If $a_{ij}$ is replaced by $\mu_{ij}/\mu_{ji}$, then
$$ r_i = \exp(\theta_i) \Big/ \sum_{k=1}^{N} \exp(\theta_k) $$
R&R/Applications to statistical journals
Analysis of the 47x47 Cross Citation matrix used in Varin et al, without self citations
Based on ANOVA, Landis & Koch (1977):
$$ \hat{\rho} = 0.0267, \qquad \hat{\lambda} = 36.381 $$
$\alpha_i$ varying from 0.51 to 0.97; $\bar{\alpha} = 0.841$, $\alpha_{med} = 0.866$
$$ s^{(H)} = 173.82, \qquad k = \hat{\lambda}/s^{(H)} = 0.209 $$
R&R/Scores of the 47 journals
R&R/Applications to statistical journals
Rankings of 47 Statistical Journals according to several criteria
EXS : Export Score; IW: Influence Weight Pinski & Narin
AI : Article Influence, AI-BLUP : same but based on BLUP matrix
IPP : Influence per publication : Pinski & Narin (1976)
SJR: SCImago Journal Rank (per article normalization)
Citations received in 2010 (JCR) by papers published in 2001-2010
(Varin et al, 2016)
R&R/Applications to statistical journals
Rankings of 47 Statistical Journals (continued)
R&R/Applications to statistical journals
Correlations among 6 different criteria of journal ratings
EXS & IW : influence scores per citation
AI, AI-BLUP, IPP, SJR: influence scores per article
R&R/Discussion-Conclusion
Model-based approaches better suited to Journal R&R than raw indicators
-Hypotheses made explicit
-Uncertainty available
Network approaches are attractive due to simplicity & computability
Several extensions made or in progress
-Grouping Lasso (Varin et al, 2015)
-Including Dynamics
-Goodman's Row-Column models (Grah, 2016)
-Stochastic Block Models (Wyse & White, 2016 in Varin et al)
-Clustering with Modularity Classes (Arbel & Robert, 2015, in Varin et al)
R&R/Discussion-Conclusion
Practical side
-Multidimensional rating (Prestige vs Popularity)
-Solutions to avoid manipulation (Archambault & Larivière)
-Algorithm kept secret
-Monitoring journal behaviour (including sanctions)
-Avoid misuse of indicators
-Indicators specific to different levels of evaluation
-Field dependency to take into account
-Poor correlation between JIF and citation rate of individuals
References
Agresti, A. (2013) Categorical Data Analysis, 3rd edn. New York: Wiley
Archambault, E. and Larivière, V. (2009) History of the journal impact factor: contingencies and consequences. Scientometrics, 79, 635–649.
Bergstrom, C. (2007) Eigenfactor: measuring the value and the prestige of scholarly journals. Coll. Res. Lib. News, 68, 314–316.
Brin, S. and Page, L. (1998) The anatomy of a large-scale hypertextual web search engine. Comput. Netwrks ISDN Syst., 30,
107–117.
Caussinus H (1966) Contribution à l’analyse statistique des tableaux de correlation. Annales de la Faculté des Sciences de
Toulouse, 29, 77-182.
Christensen R (1990) Log-Linear Models. Springer-Verlag, 408 pages
Firth, D. (2009) Bradley–Terry models in R. J. Statist. Softwr., 12, 1–12.
Foulley J.L., Gianola D., Im S., 1987. Genetic evaluation for discrete polygenic traits in animal breeding. In Gianola D.,
Hammond K. (eds.). Statistical Methods for Genetics Improvement of Livestock, 1990, 361-409. Springer Verlag,
Heidelberg.
Franceschet, M. (2012) The large-scale structure of journal citation networks. J. Am. Soc. Inform. Sci. Technol., 63, 837–842.
Garfield, E. (1955) Citation indices for Science. Science, 122, 108–111.
Grah S (2016) Ranking and Rating of Scientific Journals. M1 Master Report, Paris Sud Orsay, 54p
Garfield, E. (1972) Citation analysis as a tool in journal evaluation. Science, 178, 471–479.
Goodman L A (2002) Contributions to the statistical analysis of contingency tables: Notes on quasi-symmetry, quasi-
independence, log-linear models, log bilinear models, and correspondence analysis models. Annales de la Faculté des
Sciences de Toulouse, 6ème série, 11,(4), 525-540.
Palacios-Huerta, I. and Volij, O (2004) The measurement of intellectual influence. Econometrica, 72, 963–977
Quaas RL, Van Vleck (1980) Categorical trait sire evaluation by BLUP of future category frequencies. Biometrics, 36, 117-122
Seglen, P. O. (1997) Why the impact factor of journals should not be used for evaluating research. Br. Med. J., 314, 498–502.
Stigler SM (1994) Citations Patterns in the Journals of Statistics and Probability. Statistical Science, 9, 94-108
Varin C, Cattelan M, Firth D (2016) Statistical modelling of citation exchange between statistics journals. JR Statistical Society
A,179,1-63
Waltman L, van Eck NJ (2010) The relation between Eigenfactor, audience factor and influence weight. J of the American Society for Information Science & Technology, 61, 1476-1486.
Zitt, M. and Small, H. (2008) Modifying the journal impact factor by fractional citation weighting: the audience factor. J. Am.
Soc. Inform. Sci. Technol., 59, 1856–1860.