Networks from data bases
V. Batagelj
Two mode networks
Multiplication
Derived networks
Pajek
Big data
Networks from data bases
Vladimir Batagelj
University of Ljubljana
Undicesima conferenza nazionale di statistica, Rome, February 20-21, 2013
V. Batagelj Networks from data bases
Outline
1 Two mode networks
2 Multiplication
3 Derived networks
4 Pajek
Example: Internet Movie Data Base
Die Another Day
Casino Royale
Skyfall
Lee Tamahori
Martin Campbell
Paul Haggis
Sam Mendes
Neal Purvis
Robert Wade
John Logan
Ian Fleming
Pierce Brosnan
Daniel Craig
Judi Dench
Halle Berry
Javier Bardem
Ralph Fiennes
Eva Green
Mads Mikkelsen
On February 17, 2013, IMDB (Internet Movie Data Base) contained 2,262,638 titles and 4,745,392 names.
Other data bases of this kind: Web of Science, Scopus, Zentralblatt MATH, Google Scholar, DBLP, Amazon, etc.
Two mode networks from data bases
A simple data base $B$ is a set of records $B = \{R_k : k \in K\}$, where $K$ is the set of keys. A record has the form $R_k = (k, q_1(k), q_2(k), \ldots, q_r(k))$, where $q_i(k)$ is the description of the property (attribute) $q_i$ for the key $k$.
Suppose that the description $q(k)$ takes values in a finite set $Q$. It can always be transformed into such a set by partitioning the set $Q$ and recoding the values. Then we can assign to the property $q$ a two-mode network $K \times q = (K, Q, L, w)$, where $(k, v) \in L$ iff $v \in q(k)$, and $w(k, v)$ is the weight of the link $(k, v)$; often $w(k, v) = 1$.
Single-valued properties can be represented by a partition.
Examples:
(papers, authors, was written by),
(papers, keywords, is described by),
(parliamentarians, problems, positive vote),
(persons, journals, is reading),
(persons, societies, is member of, years of membership),
(buyers/consumers, goods, bought, quantity), etc.
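The construction of K × q from records can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy records, the property names and the helper `two_mode_network` are hypothetical and do not come from the slides.

```python
# Hypothetical toy data base: key -> record of multi-valued properties.
records = {
    "w1": {"authors": {"Ann", "Bob"}, "keywords": {"network", "core"}},
    "w2": {"authors": {"Bob"}, "keywords": {"network"}},
    "w3": {"authors": {"Ann", "Cid"}, "keywords": {"island"}},
}

def two_mode_network(records, prop):
    """Return the links of K x prop as {(k, v): weight}, with w(k, v) = 1."""
    links = {}
    for key, record in records.items():
        # (k, v) is a link iff v is among the values q(k) of the property.
        for value in record.get(prop, ()):
            links[(key, value)] = 1
    return links

WA = two_mode_network(records, "authors")   # works x authors
WK = two_mode_network(records, "keywords")  # works x keywords
print(sorted(WA))
```

Each property of the data base gives its own two-mode network over the same set of keys, which is what makes the later multiplications possible.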
Methods: degree distributions
In a network (V, L) the degree deg(v) of a vertex v ∈ V is equal to the number of links that have vertex v as their end-vertex. The indegree / outdegree is equal to the number of incoming / outgoing links.
Usually one of the first analyses of a network is to look at its degree distribution(s). Are there isolated nodes (deg(v) = 0)? Which are the nodes with the largest degrees? What is the average degree? What is the shape of the degree distribution?
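A minimal sketch of these first questions, assuming a small undirected network given as a vertex list and a list of links (the data and the helper `degrees` are illustrative, not from the slides):

```python
from collections import Counter

def degrees(vertices, links):
    """Degree of every vertex in an undirected network (V, L)."""
    deg = {v: 0 for v in vertices}
    for u, v in links:
        deg[u] += 1
        deg[v] += 1
    return deg

# Hypothetical toy network.
vertices = ["a", "b", "c", "d", "e"]
links = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]

deg = degrees(vertices, links)
distribution = Counter(deg.values())             # degree -> number of vertices
isolated = [v for v, d in deg.items() if d == 0]
largest = max(deg, key=deg.get)                  # a vertex with the largest degree
average = sum(deg.values()) / len(deg)
print(distribution, isolated, largest, average)
```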
Methods: two-mode cores and 4-rings weights
A subset of vertices $C \subseteq V$ is a $(p, q)$-core in a two-mode network $N = (V_1, V_2; L)$, $V = V_1 \cup V_2$, iff
a. in the induced subnetwork $K = (C_1, C_2; L(C))$, $C_1 = C \cap V_1$, $C_2 = C \cap V_2$, it holds $\forall v \in C_1 : \deg_K(v) \geq p$ and $\forall v \in C_2 : \deg_K(v) \geq q$;
b. $C$ is the maximal subset of $V$ satisfying condition a.
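One straightforward way to obtain a (p, q)-core is to delete repeatedly every vertex that violates condition a until none is left. The sketch below follows that idea; it is not necessarily the algorithm used in Pajek, and the function name and toy data are assumptions made for illustration.

```python
def pq_core(links, p, q):
    """Return (C1, C2) of the (p, q)-core; links is a set of pairs (u, v), u in V1, v in V2."""
    adj1, adj2 = {}, {}
    for u, v in links:
        adj1.setdefault(u, set()).add(v)
        adj2.setdefault(v, set()).add(u)
    changed = True
    while changed:
        changed = False
        # Remove vertices of the first set with fewer than p remaining neighbours.
        for u in [u for u, nb in adj1.items() if len(nb) < p]:
            for v in adj1.pop(u):
                adj2[v].discard(u)
            changed = True
        # Remove vertices of the second set with fewer than q remaining neighbours.
        for v in [v for v, nb in adj2.items() if len(nb) < q]:
            for u in adj2.pop(v):
                adj1[u].discard(v)
            changed = True
    return set(adj1), set(adj2)

# Hypothetical toy network: titles t1..t3 and persons a..d.
links = {("t1", "a"), ("t1", "b"), ("t1", "c"),
         ("t2", "a"), ("t2", "b"), ("t2", "c"),
         ("t3", "c"), ("t3", "d")}
core = pq_core(links, 3, 2)
print(core)
```

Here t3 has only two persons and is removed first; person d then loses its only title and is removed as well.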
A k-ring is a simple closed chain of length k. Using k-rings we can define a weight of edges as
$$w_k(e) = \text{number of different } k\text{-rings containing the edge } e \in E$$
In a two-mode network there are no 3-rings. The densest substructures are complete bipartite subgraphs $K_{p,q}$. They contain many 4-rings. Therefore these weights can be used to identify the dense parts of a network.
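For a simple two-mode network the 4-ring weight of an edge (u, v) can be counted directly: every other neighbour x of v contributes one ring for each neighbour that u and x share besides v. The following sketch implements that count; the function name and the toy data are illustrative assumptions.

```python
def four_ring_weights(links):
    """w4 for every edge (u, v) of a simple two-mode network, u in V1, v in V2."""
    n1, n2 = {}, {}
    for u, v in links:
        n1.setdefault(u, set()).add(v)
        n2.setdefault(v, set()).add(u)
    w4 = {}
    for u, v in links:
        # u - v - x - y - u is a 4-ring for each x != u adjacent to v
        # and each y != v adjacent to both u and x.
        w4[(u, v)] = sum(len(n1[u] & n1[x]) - 1 for x in n2[v] if x != u)
    return w4

# Hypothetical example: complete bipartite K_{2,3} plus a pendant edge (u3, v3).
links = [("u1", "v1"), ("u1", "v2"), ("u1", "v3"),
         ("u2", "v1"), ("u2", "v2"), ("u2", "v3"),
         ("u3", "v3")]
w4 = four_ring_weights(links)
print(w4)
```

Edges inside the complete bipartite part get positive weights, while the pendant edge gets weight 0, which is how the weights separate dense parts from the rest.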
Example: (247,2)-core and (27,22)-core in IMDB
[Figure: node labels of the two cores]

(247,2)-core
Titles: Royal Rumble; Survivor Series
People: Dumas, Amy; Ellison, Lillian; García, Lilián; Guenard, Nidia; Hulette, Elizabeth; Kai, Leilani; Keibler, Stacy; Laurer, Joanie; Martel, Sherri; Martin, Judy (II); McMahon, Stephanie; McMichael, Debra; Mero, Rena; Moore, Carlene (II); Moore, Jacqueline (VI); Moretti, Lisa; Psaltis, Dawn Marie; Robin, Rockin'; Runnels, Terri; Stratus, Trish; Vachon, Angelle; Wilson, Torrie; Wright, Juanita; Young, Mae (I); Adams, Brian (VI); Ahrndt, Jason; Al-Kassi, Adnan; Albano, Lou; Anderson, Arn; André the Giant; Angle, Kurt; Anoai, Arthur; Anoai, Matt; Anoai, Rodney; Anoai, Sam; Anoai, Solofatu; Apollo, Phil; Austin, Steve (IV); Backlund, Bob; Barnes, Roger (II); Bass, Ron (II); Batista, Dave; Benoit, Chris (I); Bigelow, Scott 'Bam Bam'; Bischoff, Eric; Blackman, Steve (I); Blair, Brian (I); Blanchard, Tully; Blood, Richard; Bloom, Matt (I); Bloom, Wayne; Bresciano, Adolph; Brisco, Gerald; Brunzell, Jim; Buchanan, Barry (II); Bundy, King Kong; Calaway, Mark; Candido, Chris; Canterbury, Mark; Cena, John (I); Centopani, Paul; Chavis, Chris; Clarke, Bryan; Clemont, Pierre; Coachman, Jonathan; Coage, Allen; Cole, Michael (V); Connor, A.C.; Constantino, Rico; Copeland, Adam (I); Cornette, James E.; Darsow, Barry; Davis, Danny (III); DeMott, William; DiBiase, Ted; Douglas, Shane; Duggan, Jim (II); Eadie, Bill; Eaton, Mark (II); Enos, Mike (I); Eudy, Sid; Farris, Roy; Fatu, Eddie; Fifita, Uliuli; Finkel, Howard; Flair, Ric; Foley, Mick; Frazier Jr., Nelson; Fujiwara, Harry; Funaki, Sho; Garea, Tony; Gasparino, Peter; Gill, Duane; Goldberg, Bill (I); Gray, George (VI); Guerrero Jr., Chavo; Guerrero, Eddie; Gunn, Billy (II); Guttierrez, Oscar; Hall, Scott (I); Hardy, Jeff (I); Hardy, Matt; Harris, Brian (IX); Harris, Don (VII); Harris, Ron (IV); Hart, Bret; Hart, Jimmy (I); Hart, Owen; Hart, Stu; Hayes, Lord Alfred; Heath, David (I); Hebner, Dave; Hebner, Earl; Heenan, Bobby; Hegstrand, Michael; Helms, Shane; Hennig, Curt; Henry, Mark (I); Hernandez, Ray; Heyman, Paul; Hickenbottom, Michael; Hogan, Hulk; Hollie, Dan; Horn, Bobby; Horowitz, Barry; Houston, Sam; Howard, Jamie; Howard, Robert William; Huffman, Booker; Hughes, Devon; Hyson, Matt; Jackson, Tiger; Jacobs, Glen; James, Brian (II); Jannetty, Marty; Jarrett, Jeff (I); Jericho, Chris; Johnson, Ken (X); Jones, Michael (XVI); Keirn, Steve; Kelly, Kevin (VIII); Killings, Ron; Knight, Dennis (II); Knobs, Brian; Lauer, David (II); Laughlin, Tom (IV); Laurinaitis, Joe; Lawler, Brian (II); Lawler, Jerry; Layfield, John; Leinhardt, Rodney; Leslie, Ed; Lesnar, Brock; Levesque, Paul Michael; Levy, Scott (III); Lockwood, Michael; LoMonaco, Mark; Long, Teddy; Lothario, Jose; Manna, Michael; Marella, Joseph A.; Marella, Robert; Martel, Rick; Martin, Andrew (II); Matthews, Darren (II); McMahon, Shane; McMahon, Vince; Mero, Marc; Miller, Butch; Moody, William (I); Mooney, Sean (I); Morgan, Matt (III); Morley, Sean; Morris, Jim (VII); Muraco, Don; Nash, Kevin (I); Neidhart, Jim; Nord, John; Norris, Tony (I); Nowinski, Chris; Okerlund, Gene; Orton, Randy; Ottman, Fred; Page, Dallas; Palumbo, Chuck (I); Peruzovic, Josip; Pettengill, Todd; Pfohl, Lawrence; Piper, Roddy; Plotcheck, Michael; Poffo, Lanny; Powers, Jim (IV); Prichard, Tom; Race, Harley; Reed, Bruce (II); Reiher, Jim; Reso, Jason; Rhodes, Dusty (I); Rivera, Juan (II); Roberts, Jake (II); Rock, The; Ross, Jim (III); Rotunda, Mike; Rougeau Jr., Jacques; Rougeau, Raymond; Rude, Rick; Runnels, Dustin; Ruth, Glen; Sags, Jerry; Saturn, Perry; Savage, Randy; Scaggs, Charles; Senerca, Pete; Shamrock, Ken; Shinzaki, Kensuke; Simmons, Ron (I); Slaughter, Sgt.; Smith, Davey Boy; Snow, Al; Solis, Mercid; Steiner, Rick (I); Steiner, Scott; Storm, Lance; Szopinski, Terry; Tajiri, Yoshihiro; Tanaka, Pat; Taylor, Scott (IX); Taylor, Terry (IV); Tenta, John; Traylor, Raymond; Tunney, Jack; Vailahi, Sione; Valentine, Greg; Van Dam, Rob; Vaziri, Kazrow; von Erich, Kerry; Walker, P.J.; Waltman, Sean; Ware, David (II); Warrington, Chaz; Warrior; White, Leon; Wickens, Brian; Wight, Paul; Wilson, Al (III); Wright, Charles (II); Zhukov, Boris (I)

(27,22)-core
Titles: Fully Loaded; Invasion; King of the Ring; No Way Out; Royal Rumble; Summerslam; Survivor Series; Wrestlemania 2000; Wrestlemania X-8; Wrestlemania X-Seven; WWE Armageddon; WWE Judgment Day; WWE No Mercy; WWE No Way Out; WWE SmackDown! Vs. Raw; WWE Unforgiven; WWE Vengeance; WWE Wrestlemania X-8; WWE Wrestlemania XX; WWF Backlash; WWF Insurrextion; WWF Judgment Day; WWF No Mercy; WWF No Way Out; WWF Rebellion; WWF Unforgiven; WWF Vengeance; 'Raw Is War'; 'Sunday Night Heat'; 'WWE Velocity'; 'WWF Smackdown!'
People: Dumas, Amy; Keibler, Stacy; McMahon, Stephanie; Stratus, Trish; Angle, Kurt; Anoai, Solofatu; Austin, Steve (IV); Benoit, Chris (I); Bloom, Matt (I); Calaway, Mark; Cole, Michael (V); Copeland, Adam (I); Guerrero, Eddie; Gunn, Billy (II); Hardy, Jeff (I); Hardy, Matt; Hebner, Earl; Heyman, Paul; Huffman, Booker; Hughes, Devon; Jacobs, Glen; Jericho, Chris; Lawler, Jerry; Layfield, John; Levesque, Paul Michael; LoMonaco, Mark; Martin, Andrew (II); Matthews, Darren (II); McMahon, Shane; McMahon, Vince; Reso, Jason; Rock, The; Ross, Jim (III); Senerca, Pete; Simmons, Ron (I); Taylor, Scott (IX); Van Dam, Rob; Wight, Paul
IMDB 2005: n1 = 428440, n2 = 896308, m = 3792390.
Example: Islands for w4 / Charlie Brown and Adult
Be My Valentine, Charlie Brown
Boy Named Charlie Brown
Charlie Brown Celebration
Charlie Brown Christmas
Charlie Brown Thanksgiving
Charlie Brown’s All Stars!
He’s Your Dog, Charlie Brown
Is This Goodbye, Charlie Brown?
It’s a Mystery, Charlie Brown
It’s an Adventure, Charlie Brown
It’s Flashbeagle, Charlie Brown
It’s Magic, Charlie Brown
It’s the Easter Beagle, Charlie Brown
It’s the Great Pumpkin, Charlie Brown
Life Is a Circus, Charlie Brown
Making of ’A Charlie Brown Christmas’
Play It Again, Charlie Brown
Race for Your Life, Charlie Brown
Snoopy Come Home
There’s No Time for Love, Charlie Brown
You Don’t Look 40, Charlie Brown
You’re a Good Sport, Charlie Brown
You’re In Love, Charlie Brown
You’re Not Elected, Charlie Brown
Charlie Brown and Snoopy Show
Altieri, Ann
Dryer, Sally
Mendelson, Karen
Momberger, Hilary
Stratford, Tracy
Brando, Kevin
Hauer, Brent
Kesten, Brad
Melendez, Bill
Ornstein, Geoffrey
Reilly, Earl ’Rocky’
Robbins, Peter (I)
Schoenberg, Jeremy
Shea, Christopher (I)
Shea, Stephen
Boy, T.T.
Byron, Tom
Davis, Mark (V)
Dough, Jon
Drake, Steve (I)
Horner, Mike
Jeremy, Ron
Michaels, Sean
Morgan, Jonathan (I)
North, Peter (I)
Sanders, Alex (I)
Savage, Herschel
Silvera, Joey
Thomas, Paul (I)
Voyeur, Vince
Wallice, Marc
West, Randy (I)
Sparsity and Dunbar’s number
Networks obtained from data bases are usually large, with tens of thousands or millions of nodes. Large networks are usually sparse: they have a small average degree.
In one-mode networks describing relations among people this can be related to Dunbar's number, with a value around 150. See Wikipedia: Dunbar's number.
In general, an initiator of a link who wants to keep the link has to invest in it a certain amount of the finite total "energy" he or she has.
Multiplication of networks
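To a simple two-mode network $N = (\mathcal{I}, \mathcal{J}, E, w)$, where $\mathcal{I}$ and $\mathcal{J}$ are sets of vertices, $E$ is a set of edges linking $\mathcal{I}$ and $\mathcal{J}$, and $w : E \to \mathbb{R}$ (or some other semiring) is a weight, we can assign a network matrix $W = [w_{i,j}]$ with elements $w_{i,j} = w(i, j)$ for $(i, j) \in E$ and $w_{i,j} = 0$ otherwise.
Given a pair of compatible networks $N_A = (\mathcal{I}, \mathcal{K}, E_A, w_A)$ and $N_B = (\mathcal{K}, \mathcal{J}, E_B, w_B)$ with corresponding matrices $A_{\mathcal{I} \times \mathcal{K}}$ and $B_{\mathcal{K} \times \mathcal{J}}$, we call the product of the networks $N_A$ and $N_B$ the network $N_C = (\mathcal{I}, \mathcal{J}, E_C, w_C)$, where $E_C = \{(i, j) : i \in \mathcal{I}, j \in \mathcal{J}, c_{i,j} \neq 0\}$ and $w_C(i, j) = c_{i,j}$ for $(i, j) \in E_C$. The product matrix $C = [c_{i,j}]_{\mathcal{I} \times \mathcal{J}} = A * B$ is defined in the standard way:
$$c_{i,j} = \sum_{k \in \mathcal{K}} a_{i,k} \cdot b_{k,j}$$
In the case when $\mathcal{I} = \mathcal{K} = \mathcal{J}$ we are dealing with ordinary one-mode networks (with square matrices).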
Multiplication of networks
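[Figure: the matrices $A$ ($\mathcal{I} \times \mathcal{K}$) and $B$ ($\mathcal{K} \times \mathcal{J}$); the entry $a_{i,k}$ links $i$ to $k$ and the entry $b_{k,j}$ links $k$ to $j$.]
$$c_{i,j} = \sum_{k \in \mathcal{K}} a_{i,k} \cdot b_{k,j}$$
If all weights in the networks $N_A$ and $N_B$ are equal to 1, the value of $c_{i,j}$ counts the number of ways we can go from $i \in \mathcal{I}$ to $j \in \mathcal{J}$ passing through $\mathcal{K}$.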
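A small worked instance may make the counting interpretation concrete. The matrices are illustrative assumptions, not data from the slides: take one vertex $i_1$ in the first set, three middle vertices $k_1, k_2, k_3$, two vertices $j_1, j_2$ in the last set, and all weights equal to 1.

$$A = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad C = A * B = \begin{bmatrix} 2 & 1 \end{bmatrix}$$

Here $c_{i_1,j_1} = 1 \cdot 1 + 1 \cdot 1 + 0 \cdot 0 = 2$, matching the two ways $i_1 \to k_1 \to j_1$ and $i_1 \to k_2 \to j_1$, while $c_{i_1,j_2} = 1 \cdot 0 + 1 \cdot 1 + 0 \cdot 1 = 1$, matching the single way $i_1 \to k_2 \to j_2$.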
Multiplication of networks
The standard matrix multiplication has the complexity $O(|\mathcal{I}| \cdot |\mathcal{K}| \cdot |\mathcal{J}|)$; it is too slow to be used for large networks.
For sparse large networks we can multiply much faster by considering only nonzero elements.
In general the multiplication of large sparse networks is a 'dangerous' operation, since the result can 'explode': it need not be sparse.
If, for the sparse networks $N_A$ and $N_B$, there are in $\mathcal{K}$ only a few vertices with large degree, and none of them has large degree in both networks, then the resulting product network $N_C$ is also sparse.
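A minimal sketch of multiplication over nonzero elements only, assuming the matrices are stored as dictionaries of rows (a representation chosen here for illustration; it is not claimed to be Pajek's internal format). The second example shows the 'explosion': a single middle vertex with degree 3 in both networks turns 6 links into 9.

```python
def sparse_multiply(A, B):
    """C = A * B for sparse matrices stored as {row: {column: value}}."""
    C = {}
    for i, row in A.items():
        for k, a in row.items():
            # Only the nonzero entries b[k][j] of row k of B are visited.
            for j, b in B.get(k, {}).items():
                C.setdefault(i, {})
                C[i][j] = C[i].get(j, 0) + a * b
    # Drop entries that cancelled to zero so that links correspond to c[i][j] != 0.
    result = {}
    for i, row in C.items():
        nonzero = {j: c for j, c in row.items() if c != 0}
        if nonzero:
            result[i] = nonzero
    return result

# Counting ways: i1 reaches j1 through k1 and through k2.
paths = sparse_multiply({"i1": {"k1": 1, "k2": 1}},
                        {"k1": {"j1": 1}, "k2": {"j1": 1}})

# Hypothetical hub with large degree in both networks.
A = {"i1": {"hub": 1}, "i2": {"hub": 1}, "i3": {"hub": 1}}
B = {"hub": {"j1": 1, "j2": 1, "j3": 1}}
C = sparse_multiply(A, B)
links_in_product = sum(len(row) for row in C.values())
print(paths, links_in_product)
```

The work done is the sum over middle vertices k of (degree of k in A) × (degree of k in B), which is exactly the quantity that stays small under the sparsity condition stated above.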
Derived networks
From a bibliographical data base we get the two-mode networks WA = Works × Authors and WK = Works × Keywords. Since they have a common set Works, the networks $WA^T$ and $WK$ are compatible, and multiplying them we obtain a derived network
$$AK = WA^T * WK$$
The entry $ak_{it}$ = number of times author $i$ used keyword $t$ in his/her works.
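Because all weights are 1, the product can be accumulated work by work without forming any matrix. The sketch below does this for a hypothetical toy data base; the names `works` and `author_keyword` are assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy data: work -> (authors, keywords).
works = {
    "w1": (["Ann", "Bob"], ["network", "core"]),
    "w2": (["Bob"], ["network"]),
    "w3": (["Ann"], ["network"]),
}

def author_keyword(works):
    """Entries of AK = WA^T * WK: ak[(i, t)] = number of works of author i using keyword t."""
    ak = Counter()
    for authors, keywords in works.values():
        for i in authors:
            for t in keywords:
                ak[(i, t)] += 1
    return ak

AK = author_keyword(works)
print(AK[("Bob", "network")], AK[("Ann", "core")])
```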
The dataset of EU projects on simulation (January 2006) contains data about research groups. We obtain the networks P = Groups × Projects, C = Groups × Countries, and U = Groups × Institutions. Sizes: |Groups| = 8869, |Projects| = 933, |Institutions| = 3438, |Countries| = 60.
In the derived network W = Projects × Institutions = $P^T * U$ we determine link islands for $w_4$.
Analysis of Projects × Institutions
502909
IST-2000-30082
G4RD-CT-2000-00395
IST-2000-29207
502842
IST-2000-28177
G4RD-CT-2002-00795
G4RD-CT-2000-00178
BRPR987001
G4RD-CT-2002-00836
502896
502889
502917
G4RD-CT-2001-00403
G4MA-CT-2002-00022
BRST985352
506257
28283
506503
EVG3-CT-2002-80012
501084
29817
ENK6-CT-2002-30023
IST-2000-30158
511758
7210-PR/142
25525
7215-PP/034
T3.5/99
JOE3980089
SMT4982223
7210-PR/163
7210-PR/233
7215-PP/031
T3.2/99
IST-1999-56418
IST-1999-57451
HPSE-CT-2002-00108
IST-2001-35358
ENK5-CT-2000-00335
JOR3980200
QLK6-CT-2002-02292
7210-PR/095
HPSE-CT-2002-00143
A.S.M. S.A.
AGRO-SAT CONSULTING
AIRBUS DEUTSCHLAND
AIRBUS FRANCE SAS
AIRBUS UK LIMITED
ALBERTSEN & HOLM AS
ALENIA AERONAUTICA SPA
ARMINES
ASM - DIMATEC INGENIERIA
BAE SYSTEMS
BARCO NV
BARTENBACH
BAYER. ROTES KREUZ
BBL
BICC GENERAL CABLE
BRITISH STEEL
BROD THOMASSON
BUILDING RESEARCH
BUURSKOV
CATALYSE SARL
CENTRE DE RECH. METALLURG.
CENTRE DE ROBOTIQUE
CENTRE FOR EUROP. ECONOMIC
CSTB
C. R. FIAT S.C.P.A.
CHALMERS TEKNISKA HOEGSKOLA
CHIPIDEA - MICROELECTRONICA, S.A.
CINAR LTD.
COLOPLAST A/S
CRE GROUP LTD.
DAIMLER CHRYSLER AG
DASSAULT AVIATION
DATASYS S.R.O.
DE ZENTRUM FUER LUFT UND RAUMFAHRT E.V.
DFA DE FERNSEHNACHRICHTEN AGENTUR
DISENO DE SISTEMAS EN SILICIO
DPME ROBOTICS AB
EA TECH. LTD
EADS DE
EDAG ENGINEERING + DESIGN
ENEL.IT
ENERGITEKNIK HEATEX AB
ENERGY RESEARCH CENTRE NL
ESI SOFTWARE SA
EUROCOPTER S.
FFT ESPANA TECH. DE AUTOMOCION,
FONDAZIONE ENI - ENRICO MATTEI
FRAUENHOFER INST. FUER MATERIALFLUSS UND LOGISTIK
FRAUENHOFER INST. FUER PRODUKTIONSTECH. UND AUTOMATISIERUNG
FRIMEKO INT. AB
GATE5 AG
GUNNESTORPS SMIDE & MEKANISKA AB
HELP SERVICE REMOTE SENSING
IFEN GES. FUER SATELLITENNAVIGATION
ILEVO AB
INDUSTRIAS ROYO
INGENIORHOJSKOLEN HELSINGOR TEKNIKUM
INOX PNEUMATIC AS
INST. CARTOGRAFIC DE CATALUNYA
INST. DE RECHERCHES DE LA SIDERURGIE FR
INST. FUER TEXTIL UND VERFAHRENSTECH. DENKENDORF
INST. NAT. DE RECHERCHE SUR LES TRANSPORTS ET LEUR SÉCURITÉ
INST. SUPERIOR TECNICO
JERNKONTORET
KBC MANUFAKTUR, KOECHLIN, BAUMGARTNER UND CIE. AG
KOMMANDITGES. HAMBURG 1 FERNSEHEN BETEILIGUNGS & CO
LANDIS & GYR - EUROPE AG
LESPROJEKT SLUZBY S.R.O.
LH AGRO EAST S.R.O.
LKSOFTWARE
LMS UMWELTSYS.E, DIPL. ING. DR. HERBERT BACK
MECALOG SARL
MEFOS, FOUNDATION FOR METALLURGICAL RESEARCH
MJM GROUP, A.S.
MSO CONCEPT INNOVATION + SOFTWARE
MTU AERO ENGINES
NAT. TEC. UNIV. OF ATHENS
NL ORG. FOR APPLIED SCIENTIFIC RESEARCH - TNO
OESTERREICHISCHER BERGRETTUNGSDIENST
OFFICE NAT. D'ETUDES ET DE REC. AEROSPATIALES
OK GAMES DI ALESSANDRO CARTA
ORAD HI TEC SYS. POLAND
OSAUHING EETRIUKSUS
POLYMAGE SARL
PROLEXIA
PSI FUR PRODUKTE UND SYS.E DER INFORMATIONSTECH.
RESEARCH INST. OF THE FINNISH ECONOMY
ROSENHEIMER GLASTECH.
RUDOLF BRAUNS AND CO. KG
SHERPA ENGINEERING SARL
SNECMA MOTEURS SA
SPORTART
SSAB TUNNPLÅT
STICHTING NATIONAAL LUCHT
SUPERELECTRIC DICARLO PAGLIALUNGA & C. SAS
SVETS & TILLBEHOR AB
TECHNOFARMING S.R.L.
TESSITURA LUIGI SANTI SPA
THE AARHUS SCHOOL OF BUSINESS
THYSSENKRUPP STAHL A.G.
TPS TERMISKA PROCESSER AB
TQT SRL
TRUMPF-BLUSEN-KLEIDER WALTER GIRNER UND CO. KG
UAB LKSOFT BALTIC
UNIV. DE ZARAGOZA
UNIV. DER BUNDESWEHR MUENCHEN
UNIV. PANTHEON-ASSAS - PARIS II
UNIV. OF ABERDEEN
UNIV. OF MACEDONIA
VOEST-ALPINE STAHL
VOLKSWAGEN AG
WISDOM TELE VISION
WYKES ENGINEERING COMPANY
YAHOO! DE
ZAMISEL D.O.O
Collaboration networks
Let WA be the works × authors two-mode network; $wa_{pi} \in \{0, 1\}$ describes the authorship of author $i$ of work $p$.
$$\sum_{i \in A} wa_{pi} = \deg(p) = \text{number of authors of work } p$$
Let N be its normalized version, $\forall p \in W : \sum_{i \in A} n_{pi} = 1$, obtained from WA by $n_{pi} = wa_{pi} / \deg(p)$, or by some other rule determining the author's contribution.
The first collaboration network $Co = WA^T * WA$:
$$co_{ij} = \sum_{p \in W} wa_{pi} \, wa_{pj} = \sum_{p \in N(i) \cap N(j)} 1$$
$co_{ij}$ = the number of works that authors $i$ and $j$ wrote together.
Problem: The Co network is composed of complete graphs on the sets of works' authors. Works with many authors produce large complete subgraphs.
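The stated problem can be seen in a short sketch: each work adds a complete graph (with loops) on its authors, so a work with k authors contributes k² entries. The toy data and the function name are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: work -> list of its authors.
work_authors = {
    "w1": ["Ann", "Bob"],
    "w2": ["Ann", "Bob", "Cid"],
    "w3": ["Ann"],
}

def first_collaboration(work_authors):
    """Entries of Co = WA^T * WA: co[(i, j)] = number of works written by both i and j."""
    co = Counter()
    for authors in work_authors.values():
        # A work with k authors touches all k * k ordered pairs of its authors.
        for i, j in product(authors, repeat=2):
            co[(i, j)] += 1
    return co

Co = first_collaboration(work_authors)
print(Co[("Ann", "Bob")], Co[("Ann", "Ann")], len(Co))
```

The diagonal entry of an author equals the number of his or her works, and the number of nonzero entries grows quadratically with the size of the largest author lists.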
Cores of orders 10–21 in Computational Geometry
L.J.Guibas
M.Sharir
L.P.Chew
M.Flickner
M.J.vanKreveld
D.G.Kirkpatrick
W.J.Lenhart
S.P.Fekete
F.Hurtado
B.Chazelle
D.White
K.R.Romanik
N.M.Amato
T.D.Blacker
J.S.Snoeyink
T.C.Shermer
D.Z.Chen
D.P.Dobkin
H.Alt
F.P.Preparata
J.Erickson
J.E.Hershberger
C-K.Yap
M.Whitely
J-D.Boissonnat
S.J.Fortune
R.L.S.Drysdale
J.Harer
D.M.Avis
O.Schwarzkopf
J.S.B.Mitchell
D.Bremner
H.A.El-Gindy
D.Steele
B.Dom
J-R.Sack
M.H.Overmars
V.Sacristan
O.Aichholzer
R.Pollack
D.H.Rappaport
S.H.Whitesides
D.Eppstein
E.D.Demaine
M.T.Goodrich
D.M.Mount
S-W.Cheng
D.L.Souvaine
S.A.Mitchell
D.Petkovic
P.Yanker
M.W.Bern
P.K.Agarwal
I.G.Tollis
T.J.Tautges
H.Edelsbrunner
T.L.Edwards
H.Imai
E.M.Arkin
R.Wenger
S.E.Benzley
P.Plassmann
M.T.deBerg
D.Halperin
T.C.Biedl
W.J.Bohnhoff
J.R.Hipp
P.Belleville
C.Grimm
G.T.Toussaint
M.Yvinec
H.Meijer
Te.Asano
S.S.Skiena
M.Teillaud
H.S.Sawhney
D.Zorin
A.Lubiw
S.Suri
D.T.Lee
R.R.Lober
K.Kedem
E.Welzl
G.Liotta
J.Pach
P.K.Bose
J.C.Clements
S.R.Kosaraju
J.Weeks
D.Letscher
G.Lerman
J.Czyzowicz
A.Aggarwal
H.Everett
B.Zhu
T.K.Dey
E.Trimble
N.Amenta
G.D.Sjaardema
R.Tamassia
M.Gorkani
B.Aronov
S.Lazard
T.Roos
G.T.Wilfong
M.L.Demaine
J-M.Robert
T.J.Wilson
S.M.Robbins
R.Seidel
N.Katoh
G.Rote
J.Urrutia
J.S.Vitter
I.Streinu
L.Lopez-Buriek
C.K.Johnson
F.Aurenhammer
S.Parker
J.Matousek
E.Sedgwick
J.O’Rourke
O.Devillers
J.Ashley
J.Hafner
C.Zelle
W.R.Oakes
W.Niblack
K.Mehlhorn
M.E.Houle
J.Hass
A.Hicks
Q.Huang
pS -core at level 46 of Computational Geometry
L.Guibas
M.Sharir
M.vanKreveld
B.Chazelle
J.Snoeyink
A.Garg
D.Dobkin
F.Preparata
J.Hershberger
C.Yap
J.Boissonnat
O.Schwarzkopf
J.Mitchell
M.Overmars
P.Gupta
R.Pollack
D.Eppstein
M.Goodrich
M.Bern
P.Agarwal
I.Tollis
H.Edelsbrunner
E.Arkin
R.Janardan
M.deBerg
D.Halperin
L.Vismara
M.Smid
G.Toussaint
M.Yvinec
M.Teillaud
S.Suri
R.Klein
E.Welzl
G.Liotta
J.Pach
P.Bose
J.Schwerdt
J.Majhi
J.Czyzowicz
R.Tamassia
B.Aronov
R.Seidel
J.Urrutia
J.Vitter
J.Matousek
C.Icking
J.O’Rourke
O.Devillers
G.diBattista
Second collaboration network
The second collaboration network $Cn = WA^T * N$:
$$cn_{ij} = \sum_{p \in W} wa_{pi} \, n_{pj} = \sum_{p \in N(i) \cap N(j)} n_{pj}$$
$cn_{ij}$ = contribution of author $j$ to the works that (s)he wrote together with the author $i$.
It holds
$$\sum_{i \in A} \sum_{j \in A} wa_{pi} \, n_{pj} = \deg(p) \quad \text{and} \quad \sum_{j \in A} cn_{ij} = \deg(i)$$
$cn_{ii} = \sum_{p \in N(i)} n_{pi}$ is the contribution of author $i$ to his/her works.
Self-sufficiency: $S_i = \dfrac{cn_{ii}}{\deg(i)}$
Collaborativeness (co-authorship index): $K_i = 1 - S_i$
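These quantities can be checked on a toy example. The sketch assumes the simple normalization $n_{pi} = wa_{pi} / \deg(p)$ and uses exact fractions; the data and the function name are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy data: work -> list of its authors.
work_authors = {
    "w1": ["Ann", "Bob"],
    "w2": ["Ann", "Bob", "Cid"],
    "w3": ["Ann"],
}

def second_collaboration(work_authors):
    """Entries of Cn = WA^T * N with n_pj = 1 / deg(p) for every author j of work p."""
    cn = defaultdict(Fraction)
    for authors in work_authors.values():
        share = Fraction(1, len(authors))
        for i in authors:
            for j in authors:
                cn[(i, j)] += share
    return cn

Cn = second_collaboration(work_authors)
authors = sorted({a for names in work_authors.values() for a in names})
deg = {a: sum(a in names for names in work_authors.values()) for a in authors}

row_sum_ann = sum(Cn[("Ann", j)] for j in authors)   # should equal deg(Ann)
self_sufficiency = Cn[("Ann", "Ann")] / deg["Ann"]   # S_i
collaborativeness = 1 - self_sufficiency             # K_i
print(row_sum_ann, self_sufficiency, collaborativeness)
```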
The "best" authors in Statistics

 #   name         contrib     pap  self      collab
 1.  Burt R       83.716667   96   0.872049  0.127951
 2.  Newman M     59.533333   87   0.684291  0.315709
 3.  Doreian P    59.070408   75   0.787605  0.212395
 4.  Bonacich P   45.416667   59   0.769774  0.230226
 5.  Marsden P    41.000000   50   0.820000  0.180000
 6.  White H      39.986111   51   0.784041  0.215959
 7.  Wellman B    38.754762   57   0.679908  0.320092
 8.  Friedkin N   36.333333   40   0.908333  0.091667
 9.  Leydesdo L   34.533333   47   0.734752  0.265248
10.  Borgatti S   30.469048   57   0.534545  0.465455
11.  Freeman L    30.250000   36   0.840278  0.159722
12.  Everett M    27.450000   45   0.610000  0.390000
13.  Litwin H     26.166667   32   0.817708  0.182292
14.  Snijders T   23.920408   42   0.569534  0.430466
15.  Skvoretz J   23.691667   39   0.607479  0.392521
16.  Breiger R    23.520408   30   0.784014  0.215986
17.  Krackhar D   22.031519   35   0.629472  0.370528
18.  Valente T    21.616667   44   0.491288  0.508712
19.  Barabasi A   18.755159   42   0.446551  0.553449
20.  Mizruchi M   18.333333   25   0.733333  0.266667
21.  Carley K     17.616667   35   0.503333  0.496667
22.  Cohen C      17.111111   32   0.534722  0.465278
23.  Moody J      16.916667   22   0.768939  0.231061
24.  Rothenbe R   16.492063   40   0.412302  0.587698
25.  Pattison P   16.483333   34   0.484804  0.515196
26.  Batagelj V   16.353741   29   0.563922  0.436078
27.  Lazega E     16.000000   20   0.800000  0.200000
28.  Latkin C     15.896032   49   0.324409  0.675591
29.  Wasserma S   15.803741   33   0.478901  0.521099
30.  Berkman L    15.767857   36   0.437996  0.562004
Third collaboration network
The third collaboration network $Ct = N^T * N$.
$ct_{ij}$ = the total contribution of the collaboration of authors $i$ and $j$ to works.
It holds $ct_{ij} = ct_{ji}$,
$$\sum_{i \in A} \sum_{j \in A} ct_{ij} = |W| \quad \text{and} \quad \sum_{i \in A} \sum_{j \in A} n_{pi} \, n_{pj} = 1,$$
that is, the total contribution of a complete subgraph corresponding to the authors of a work is 1.
$$\sum_{j \in A} ct_{ij} = \sum_{p \in W} n_{pi}$$
is the total contribution of author $i$ to works from $W$.
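The stated identities can be verified on a toy example. As before, the sketch assumes $n_{pi} = 1 / \deg(p)$ for each author of work $p$; the data and the function name are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy data: work -> list of its authors.
work_authors = {
    "w1": ["Ann", "Bob"],
    "w2": ["Ann", "Bob", "Cid"],
    "w3": ["Ann"],
}

def third_collaboration(work_authors):
    """Entries of Ct = N^T * N with n_pi = 1 / deg(p) for every author i of work p."""
    ct = defaultdict(Fraction)
    for authors in work_authors.values():
        share = Fraction(1, len(authors))
        for i in authors:
            for j in authors:
                ct[(i, j)] += share * share   # n_pi * n_pj
    return ct

Ct = third_collaboration(work_authors)
authors = sorted({a for names in work_authors.values() for a in names})

total = sum(Ct.values())                             # should equal |W|
row_sum_ann = sum(Ct[("Ann", j)] for j in authors)   # total contribution of Ann
print(total, row_sum_ann)
```

Unlike Co, every work adds exactly 1 to the total weight regardless of how many authors it has, so works with long author lists no longer dominate.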
Components in SN5 cut at level 0.5
Network SN5 (2008): for "social network*" + most frequent references + around 100 social networkers; |W| = 193376, |C| = 7950, |A| = 75930, |J| = 14651, |K| = 29267
Borgatti_S
Cross_R
Jackson_M
Sparrowe_R
Carley_K
Galaskie_J
Wasserma_S
Holland_P
Leinhard_S
Newman_M
Knowlton_A
Schneide_J
Barabasi_A
Lin_N
Feld_S
Suitor_J
Wellman_B
Laumann_E
Lind_P Herrmann_H
Albert_R
Gronlund_A
Holme_P
Watts_D
Johnson_C
Braha_D
Jolly_A
Bernard_H
Latkin_C
Marsden_P
Neaigus_A
Rothenbe_R
Weisner_C
Anderson_C
Jeong_H
Hawkins_J
Berkman_L
Fraser_M
Miller_M
Breiger_R
Krackhar_D
Kilduff_M
Seeman_T
Yang_H
Nowak_M
Sundquis_J
Girvan_M
Lambiott_R
Stauffer_D Leydesdo_L
Vandenbe_P
Chen_H
Potterat_J
Park_J
Mandell_W
Sherman_S
Bell_D
Atkinson_J
Bonacich_P Grabowsk_A
Batagelj_V
Newton_J
Faust_K
Ohtsuki_H
Weisbuch_G
Acock_A
Hampton_K
Doreian_P
Hummon_N
Keeling_M
Moore_C
Willer_D
Grundy_E
Ennett_S
Bauman_K
Farmer_T
Feiring_C
Xie_H
Litwin_H
Degenne_A
Flap_H
Skyrms_B
Bar-Yam_Y
Davey-Ro_M
Lewis_M Ausloos_M
Mccarty_C
Pattison_P
Morris_M
Eames_K
Ghani_A
Hansson_L
Bjorkman_T
Cohen_C
Volker_B
Rodkin_P
Song_M
Miskel_C
Kretzsch_M
Wylie_J
Robins_G
Skvoretz_J
Garnett_G
Zenou_Y
Janssen_M
Killwort_P
Masuda_N
Jager_W
Bowling_A
Pillemer_K
Demeneze_M
Sneppen_K
Krause_N
Chou_K
Pinquart_M
Gastner_M
Pattie_C
Balkundi_P
Chi_I
Shelley_G
Woodard_K
Ostergre_P
Kogovsek_T
Carter_W
Everett_M
Ferligoj_A
Mrvar_A
Fararo_T
Hurlbert_J
Muth_S
Solomon_P
Fingerma_K
Birditt_K
Doak_S Assimako_D
Kosinski_R
Wallace_D
Sokolovs_J
Hanson_B
Bienenst_E
Rosvall_M
Cairns_B
Wallace_R
Hua_W
Foster_B
Calvo-Ar_A
Matzger_H
Shiovitz_S
Steinhau_H
Boyd_J
Ensel_W
Boyack_K
Xu_J
Sorensen_S
Vespigna_A
Banks_D
Rennemar_M
Kimura_M
Saito_K
Seidman_S
Liden_R
Parker_A
Metzke_C
Farquhar_M
Konno_N
Franzen_A Hangartn_D
Johnston_R
Lindstro_D
Munoz-Fr_E
Holmes_D
Hagberg_B
Landau_R
Barer_B
Johansso_S
Fiala_J Paulusma_D
Tang_J
Pastor-S_R
Iacobucc_D
Teresi_J
Pemantle_R
Shaw_B
Horner_R
Stark_F
Hopkins_N
Draine_J
Stolle_R
Browne_P
Lebeaux_M
Leinhard_
Brusco_M Steinley_D
Klavans_R
Borlund_P
Authors’ citations network
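[Figure: the product $WA^T * Ci * WA$ as a chain: author $i$ is linked to work $s$ (entry $wa_{s,i}$), work $s$ cites work $t$ (entry $ci_{s,t}$), and work $t$ is linked to author $j$ (entry $wa_{t,j}$).]
$Ca = WA^T * Ci * WA$ is a network of citations between authors. The weight $w(i, j)$ counts the number of times a work authored by $i$ cites a work authored by $j$.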
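With all weights equal to 1, the triple product can be accumulated citation by citation. The sketch below uses hypothetical toy data; `work_authors`, `citations` and `author_citations` are names introduced only for this example.

```python
from collections import Counter

# Hypothetical toy data: work -> authors, and citations as (citing work, cited work).
work_authors = {
    "w1": ["Ann"],
    "w2": ["Bob", "Cid"],
    "w3": ["Ann", "Bob"],
}
citations = [("w2", "w1"), ("w3", "w1"), ("w3", "w2")]

def author_citations(work_authors, citations):
    """Entries of Ca = WA^T * Ci * WA: ca[(i, j)] = number of citations from works of i to works of j."""
    ca = Counter()
    for s, t in citations:
        for i in work_authors[s]:        # authors of the citing work s
            for j in work_authors[t]:    # authors of the cited work t
                ca[(i, j)] += 1
    return ca

Ca = author_citations(work_authors, citations)
print(Ca[("Bob", "Ann")], Ca[("Ann", "Ann")])
```

The diagonal entries record self-citations, here the citation from w3 (co-authored by Ann) to w1 (by Ann).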
Islands in SN5 authors citation network
Network SN5 (2008): for "social network*" + most frequent references + around 100 social networkers; |W| = 193376, |C| = 7950, |A| = 75930, |J| = 14651, |K| = 29267
UNKNOWN
BORGATTI_S
ROGERS_E
CARLEY_K
GALASKIE_J
GULATI_R
BURT_R
FREEMAN_L
WASSERMA_S
DOROGOVT_S
HOLLAND_P
LEINHARD_S
NEWMAN_M
KNOWLTON_A
VLAHOV_D
BARABASI_A
BARTHELE_M
COLEMAN_J
LIN_N
ROSS_N
JENKINS_R
LAUMANN_E
ALBERT_R
AMARAL_L
BOCCALET_S
GRONLUND_A
HOLME_P
WATTS_D
BRASS_D
MERTON_R
THOMPSON_J
WHITE_D
CELENTAN_D
CURTIS_R
DESJARLA_D
FRIEDMAN_S
GRANOVET_M
LATKIN_C
MARSDEN_P
NEAIGUS_A
ROTHENBE_R
VALENTE_T
KELLY_J
WEISNER_C
ANDERSON_C
JEONG_H
SNIJDERS_T
MILLER_M
KASKUTAS_L
BREIGER_R
KRACKHAR_D
WHITE_H
BROWN_G
KILDUFF_M
COSTA_L
MOLLOY_M
IBARRA_H
ADLER_P
GIRVAN_M
KLOVDAHL_A
POTTERAT_J
BRUGHA_T
MACCARTH_B
MAGLIANO_L
WING_J
COHEN_A
PARK_J
MANDELL_W
DEROSA_C
BONACICH_P
OSTROM_E
GRABOWSK_A
STROGATZ_S
BATAGELJ_V
FAUST_K
DISHION_T
DOREIAN_P
HUMMON_N
MOORE_C
WILLER_D
ATRAN_S
CAIRNS_R
GEST_S
VANDUIJN_M
CRICK_N
ESPELAGE_D
FARMER_T
KINDERMA_T
LEUNG_M
XIE_H
MIZRUCHI_M
HENDERSO_S
COOK_K
HUNT_J
BOORMAN_S
PATTISON_P
MORRIS_M
MORENO_Y
SCHWARTZ_N
FRIEDKIN_N
LAZEGA_E
MAGNUSSO_D
LAI_G
CARPENTE_S
PEARL_R
VANACKER_R
RODKIN_P
PELLEGRI_A
COIE_J
HYMEL_S
AMIRKHAN_Y
KRETZSCH_M
BEBBINGT_P
ROBINS_G
SKVORETZ_J
JANSSEN_M
BREWIN_C
MASUDA_N
KIM_D
OLSSON_P
FOLKE_C
HAHN_T
ADGER_W
BERKES_F
GUNDERSO_L
HOLLING_C
SCHEFFER_M
WESTLEY_F
BALKUNDI_P
FARRELL_M
STRAUSS_D
EVERETT_M
FERLIGOJ_A
FARARO_T
HURLBERT_J
MUTH_S
SOLOMON_P
SEIKKULA_J
FIENBERG_S
HIGGINS_C
GAMBOA_G
BIENENST_E
ESTELL_D
CAIRNS_B
DARROW_W
WOODHOUS_D
GERGEN_K
MATZGER_H
DAPPORTO_L
PALAGI_E
JEANNE_R
RAU_P
REEVE_H
ROSELER_P
STARKS_P
STRASSMA_J
TURILLAZ_S
WESTEBER_M
WALKER_B
MELTZER_H
FIORILLO_A
MALANGON_C
MAJ_M
MARKOVSK_B
MAHADEVA_R
SCHILLIN_C
MAY_P
MUSSAT_M
CORP_E
DELALAUR_L
DEPOMPER_M
DEYRIS_E
FRANTZ_P
LEBEAU_E
LEMOIGNE_M
LEMOY_A
LEVOT_P
MAILLARD_J
MARECHAL_M
KILIC_C
AYDIN_I
TASKINTU_N
OZCURUME_G
KURT_G
EREN_E
LALE_T
OZEL_S
ZILELI_L
BASOGLU_M
MCGORRY_P
LEWIS_G
CADWALLA_T
AALTONEN_J
ALAKARE_B
ALANEN_Y
ANDERSEN_T
ANDERSON_H
FADDEN_G
SELVINIP
SHOTTER_J
DELUCCHI_K
FEDORA_P
HELD_T
LESAGE_A
IACOBUCC_D
MEDIN_D
GOFORTH_J
CLEMMER_J
SABORNIE_E
ABEL_E
LYNCH_E
MARASCO_C
GUARNERI_M
GOSSAGE_J
WHITE-CO_M
GOODHART_K
DECOTEAU_S
TRUJILLO_P
KALBERG_W
VILJOEN_D
HOYME_H
NIELSEN_R
NECKERMA_H
MORGAN_Z
VAPNARSK_V
EK_E
COLEY_J
TIMURA_C
BARAN_M
LESBAUPI_I
ARRUDA_M
BENJAMIN_C
BIONDI_A
BOFF_C
GONCALVE_R
MATTOSO_J
PINAUD_J
STEDILE_J
TRINDADE_H
D’AMIA_G
AMATI_C
ANNONI_A
ARRIGONI_P
ASPARI_D
BECCARIA_G
BELGIOJO_A
BELTRAMI_L
BIANCONI_C
CASSIRAM_A
CATTANEO_C
CHIZZOLI_G
DALLAJ_A
FRANCHET_G
GATTIPER_M
GIULINI_G
GOLDOLI_E
GOZZOLI_M
GUILINI_G
HONEGGER_A
KANNES_G
LATUADA_S
LUCCHELL_G
MERIGGI_M
MEZZANOT_G
MEZZANOT_P
MONTALTO_R
PAPAGNA_P
PATETTA_L
PIZZAGAL_F
REGGIORI_F
RICCI_G
ROMUSSI_C
ROSSI_M
SANDRI_M
SCOTTI_A
VACANI_C
VALLI_F
VERCELLO_V
ZOTTI_S
BUCHANAN_L
HOLLOWEL_J
GARIEPY_J
BROADBEL_L
MAVROVOU_M
BURGARD_A
FAMILI_I
VANDIEN_S
PFAENDTN_J
KLINKE_D
SUMATHI_R
ESNA Pajek
Pajek, a program for analysis and visualization of large networks, is freely available for noncommercial use at its web site:
http://pajek.imfm.si/
An introduction to social network analysis with Pajek is available in the book ESNA (de Nooy, Mrvar, Batagelj 2005). The second, extended edition appeared in September 2011.
ESNA in Japanese was published by Tokyo Denki University Press in 2010; the Chinese edition appeared in November 2012.
Pajek 2.* → Pajek 3.*
References
Batagelj, V.: Social Network Analysis, Large-Scale. In: R.A. Meyers, ed., Encyclopedia of Complexity and Systems Science, Springer, 2009: 8245-8265.
Batagelj, V., Cerinšek, M.: On bibliographic networks. Scientometrics (2013). DOI: 10.1007/s11192-012-0940-1.
Batagelj, V., Mrvar, A.: Analysis of Kinship Relations With Pajek. Social Science Computer Review 26(2), 224-246, 2008.
The work was supported in part by the ARRS, Slovenia, grant P1-0294, as well as by grant N1-0011 within the EUROCORES Programme EUROGIGA (project GReGAS) of the European Science Foundation.
http://pajek.imfm.si/lib/exe/fetch.php?media=pub:cns11.pdf