Sunbelt XXIV, Portorož, 2004: Pajek Workshop (Vladimir Batagelj, Andrej Mrvar, Wouter de Nooy)


Pajek Workshop

Vladimir Batagelj, Andrej Mrvar, Wouter de Nooy


Today’s Program
• Introduction to Pajek and social network analysis
• Analysing large networks with Pajek and fine-tuning layouts
• Discussion and questions


PART 1

Exploratory Network Analysis with Pajek

(Published at Cambridge University Press, October 2004)

W. de Nooy, A. Mrvar, V. Batagelj



Overview
• Network data
• Vertex attributes and properties
• Cohesive subgroups:
  – in simple networks
  – in signed networks
  – in valued networks
• Brokerage:
  – centrality
  – structural holes
  – brokerage roles
• Ranking:
  – prestige
  – acyclic networks
• Blockmodeling
• Networks and time
  – repeated measurement
  – diffusion
  – genealogies, citations
• Network analysis and statistics
• Building your own


Network data
• Opening a network in Pajek
• Drawing a network in Pajek
  – Energizing the layout
  – Selecting display options
  – Exporting the sociogram
• Pajek network data
  – Structure
  – Store & export from Access
• Example: World trade relations
  – Imports_manufactures.net
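The slide only names the structure of Pajek network data. As a rough illustration, the sketch below reads the basic plain-text .net layout: a *Vertices section followed by *Arcs (directed lines) and/or *Edges (undirected lines). The function name and the three-country sample are hypothetical; coordinates, colours, and other optional fields that real files may carry are simply ignored here.

```python
import shlex

def parse_pajek_net(text):
    """Parse a basic Pajek .net string into (labels, arcs, edges)."""
    labels = {}            # vertex number -> label
    arcs, edges = [], []   # (source, target, value) triples
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Section headers such as "*Vertices 3", "*Arcs", "*Edges".
            section = line.split()[0].lower()
            continue
        fields = shlex.split(line)  # keeps quoted labels together
        if section == "*vertices":
            number = int(fields[0])
            labels[number] = fields[1] if len(fields) > 1 else str(number)
        elif section in ("*arcs", "*edges"):
            u, v = int(fields[0]), int(fields[1])
            value = float(fields[2]) if len(fields) > 2 else 1.0
            (arcs if section == "*arcs" else edges).append((u, v, value))
    return labels, arcs, edges

# Hypothetical sample, not taken from Imports_manufactures.net.
SAMPLE = """*Vertices 3
1 "Country A"
2 "Country B"
3 "Country C"
*Arcs
1 2 5
2 3 1.5
*Edges
1 3
"""

labels, arcs, edges = parse_pajek_net(SAMPLE)
print(labels)
print(arcs, edges)
```

A line without an explicit value is given the value 1.0 in this sketch, which is a simplifying choice rather than a statement about Pajek's own defaults.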


Vertex attributes and structural properties

• Types of data objects
  – Partitions: discrete properties
  – Clusters: 1 class from a partition
  – Vectors: continuous (numeric) properties
  – Hierarchies: nested classification
  – Permutations: reordering (renumbering)
• Visualizing partitions and vectors
• Menu structure
• Pajek project file


Cohesive subgroups in simple networks
• Connectivity
• Example: Attiro.paj
• Measures:
  – Components: weak and strong
  – k-cores
  – Cliques, complete subnetworks
• Analytic strategy
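To make the k-core measure concrete: a k-core is a maximal subnetwork in which every vertex has at least k neighbours inside that subnetwork. The sketch below assigns each vertex the highest k for which it belongs to a k-core by repeatedly peeling off the vertex of lowest remaining degree. It assumes an undirected simple network stored as a dictionary of neighbour sets; the function name and the toy network are illustrative, and this is not Pajek's implementation.

```python
def core_numbers(adjacency):
    """Return {vertex: k}, where k is the highest k-core containing the vertex.

    adjacency maps each vertex to the set of its neighbours
    (undirected network, no loops).
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the vertex with the lowest degree among those still present.
        v = min(remaining, key=lambda x: degree[x])
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adjacency[v]:
            if w in remaining:
                degree[w] -= 1
    return core

# Toy network: a triangle a-b-c, a pendant vertex d attached to c, an isolate e.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
    "e": set(),
}
core = core_numbers(toy)
two_core = sorted(v for v, k in core.items() if k >= 2)
print(core, two_core)
```

Dropping the vertices with the lowest core numbers and then looking at the components of what is left is one simple way to isolate dense subgroups.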


[Flowchart: decision tree for finding cohesive subgroups. The boxes and Pajek commands are listed below; the yes/no arrows connecting them are not reproduced.]

1. Is the network directed? (Info>Network>General)

Steps marked "a":
2a. Find components (Net>Components>Weak). Do components identify subgroups?
3a. Find k-cores (Net>Partitions>Core>Input). Do k-cores contain subgroups?
4a. Remove vertices of the lowest k-cores (Operations>Extract from Network>Partition), then find components (see 2a).
5a. Find overlapping complete subnetworks: select and execute Nets>Find Fragment (1 in 2)>Options>Extract Subnetwork, then Nets>Find Fragment (1 in 2)>Find. Do subnetworks identify subgroups?

Steps marked "b":
2b. Find weak components (Net>Components>Weak). Do components identify subgroups?
3b. Find strong components (Net>Components>Strong). Do components identify subgroups?
4b. Find overlapping complete subnetworks: select and execute Nets>Find Fragment (1 in 2)>Options>Extract Subnetwork, then Nets>Find Fragment (1 in 2)>Find. Do subnetworks identify subgroups?
5b. Symmetrize the network (Net>Transform>Arcs->Edges>All).

Outcomes:
Finish: subgroups are classes in the components partition.
Finish: subgroups not found.


Cohesive subgroups in signed networks

• Balanced clusters
• Example: Sampson.paj
• Using line values & signs in layout
• Optimization approach
  – Set parameters
  – Search optimal solution
  – Repeat many times
• Stepping through partitions
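The optimization approach listed above (set parameters, search for an optimal solution, repeat many times) can be sketched as a random-restart local search. The version below is a minimal illustration under stated assumptions, not Pajek's procedure: it counts every negative tie inside a cluster and every positive tie between clusters as one error, without the error weights a real tool may offer, and all names and the toy data are hypothetical.

```python
import random

def balance_errors(ties, cluster):
    """Count ties that contradict the partition: negative ties inside a
    cluster plus positive ties between clusters."""
    errors = 0
    for (u, v), sign in ties.items():
        same = cluster[u] == cluster[v]
        if (sign < 0 and same) or (sign > 0 and not same):
            errors += 1
    return errors

def local_search(ties, vertices, k, rng):
    """One run: random start, then move single vertices while errors drop."""
    cluster = {v: rng.randrange(k) for v in vertices}
    best = balance_errors(ties, cluster)
    improved = True
    while improved:
        improved = False
        for v in vertices:
            keep = cluster[v]          # best cluster found so far for v
            for c in range(k):
                if c == keep:
                    continue
                cluster[v] = c
                score = balance_errors(ties, cluster)
                if score < best:
                    best, keep, improved = score, c, True
            cluster[v] = keep
    return best, cluster

def optimize(ties, vertices, k=2, repetitions=50, seed=0):
    """Repeat the local search from many random starts; keep the best run."""
    rng = random.Random(seed)
    runs = [local_search(ties, vertices, k, rng) for _ in range(repetitions)]
    return min(runs, key=lambda run: run[0])

# Toy signed network: two camps, positive ties within, negative ties between.
people = ["a1", "a2", "a3", "b1", "b2", "b3"]
ties = {(u, v): (1 if u[0] == v[0] else -1)
        for i, u in enumerate(people) for v in people[i + 1:]}
errors, clusters = optimize(ties, people, k=2, repetitions=10)
print(errors, clusters)
```

Because a single run can stop in a local optimum, repeating from many random starts and keeping the lowest error count is what makes the result trustworthy on less tidy data.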


Cohesive subgroups in valued networks
• Cohesion by strong or multiple ties
• Example: interlocking directorates in Scottish banking (circa 1900), Scotland.paj
• Transform 2-mode into 1-mode network
• Measure:
  – m-core (valued core)
• SVG output
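The 2-mode to 1-mode transformation can be illustrated with a small sketch: two firms become linked when they share at least one director, and the line value counts the shared directors. The data and names below are invented for illustration and are not taken from Scotland.paj; the final threshold filter only shows the idea of cohesion by multiple ties and is not claimed to reproduce Pajek's valued-core computation.

```python
from collections import defaultdict
from itertools import combinations

def project_to_one_mode(affiliations):
    """affiliations maps each director to the set of firms on whose board
    they sit. Returns {(firm_a, firm_b): shared directors}, firm_a < firm_b."""
    shared = defaultdict(int)
    for firms in affiliations.values():
        for a, b in combinations(sorted(firms), 2):
            shared[(a, b)] += 1
    return dict(shared)

# Hypothetical boards.
boards = {
    "director 1": {"Bank A", "Rail B"},
    "director 2": {"Bank A", "Rail B", "Insurer C"},
    "director 3": {"Insurer C"},
}
lines = project_to_one_mode(boards)

# Keep only lines that represent at least m shared directors.
m = 2
strong_lines = {pair: n for pair, n in lines.items() if n >= m}
print(lines, strong_lines)
```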


Centrality
• Centrality and centralization
• Undirected networks (Knoke & Burt, 1983)
• Example: Strike.paj
  – Degree
  – Closeness
  – Betweenness
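As a small illustration of two of the measures listed, the sketch below computes degree and closeness centrality for an undirected network. It assumes a connected network stored as neighbour sets, uses the common definition of closeness as the number of other vertices divided by the sum of shortest-path distances to them, and leaves betweenness out for brevity. The star network is a made-up example, not Strike.paj.

```python
from collections import deque

def degree_centrality(adjacency):
    """Number of neighbours of each vertex."""
    return {v: len(nbrs) for v, nbrs in adjacency.items()}

def closeness_centrality(adjacency):
    """Reached vertices divided by the sum of distances to them (BFS)."""
    closeness = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adjacency[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total = sum(dist.values())
        closeness[source] = (len(dist) - 1) / total if total else 0.0
    return closeness

# A star: the hub is adjacent to every other vertex.
star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
degree = degree_centrality(star)
closeness = closeness_centrality(star)
print(degree, closeness)
```

In the star the hub reaches everyone in one step, so it scores highest on both measures, which is the intuition behind calling such a network highly centralized.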


Brokerage
• The flow of information
• Example: Strike.paj
• Overall network structure:
  – Bridges
  – Cut-vertices or articulation points
  – Bi-components
• Investigating the ego-network:
  – Structural holes
  – Brokerage roles


5 Brokerage roles

[Figure: five triads, each with a broker v between vertices u and w, labelled coordinator, itinerant broker, liaison, gatekeeper, and representative]
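The five roles can be told apart by the group memberships of the three vertices on a path u -> v -> w, where v is the broker. The sketch below follows the usual Gould and Fernandez style definitions; that the slide's figure uses exactly these definitions is an assumption, as is the requirement that u has no direct arc to w. Function names and the example are hypothetical.

```python
from collections import Counter

def brokerage_role(group, u, v, w):
    """Role of broker v on the path u -> v -> w, given vertex -> group."""
    gu, gv, gw = group[u], group[v], group[w]
    if gu == gv == gw:
        return "coordinator"        # all three in one group
    if gu == gw:
        return "itinerant broker"   # u and w share a group, v is an outsider
    if gu == gv:
        return "representative"     # v passes on from its own group
    if gv == gw:
        return "gatekeeper"         # v passes on into its own group
    return "liaison"                # three different groups

def count_roles(arcs, group):
    """Count, per vertex, the brokerage roles it plays in a directed network."""
    arcs = set(arcs)
    counts = {v: Counter() for v in group}
    for (u, v) in arcs:
        for (v2, w) in arcs:
            if v2 != v or len({u, v, w}) < 3:
                continue
            if (u, w) in arcs:      # u reaches w directly: no brokerage
                continue
            counts[v][brokerage_role(group, u, v, w)] += 1
    return counts

group = {"ann": 1, "bob": 1, "cat": 1, "dan": 2}
arcs = [("ann", "bob"), ("bob", "cat"), ("bob", "dan")]
roles = count_roles(arcs, group)
print(roles["bob"])
```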


Prestige
• Asymmetric choices
• Example: SanJuanSur2.paj
• Measures:
  – Popularity: indegree
  – Input domain: direct and indirect nominations
  – Proximity prestige: size of domain divided by the average distance within the domain
• Structural and social prestige
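A worked sketch of proximity prestige as described above: for each vertex, find its input domain (all vertices that can reach it along arcs), then divide the size of that domain by the average distance from its members. One assumption is added here that the slide does not state: the domain size is expressed as a proportion of the other n - 1 vertices, so that scores fall between 0 and 1. Names and the toy network are illustrative.

```python
from collections import deque

def proximity_prestige(arcs, vertices):
    """Proportion of other vertices in the input domain, divided by the
    mean distance from those vertices to the target."""
    preds = {v: set() for v in vertices}
    for u, v in arcs:
        preds[v].add(u)
    n = len(vertices)
    prestige = {}
    for target in vertices:
        # Breadth-first search against the direction of the arcs.
        dist = {target: 0}
        queue = deque([target])
        while queue:
            x = queue.popleft()
            for p in preds[x]:
                if p not in dist:
                    dist[p] = dist[x] + 1
                    queue.append(p)
        domain = [d for v, d in dist.items() if v != target]
        if not domain:
            prestige[target] = 0.0   # nobody nominates, directly or indirectly
            continue
        share = len(domain) / (n - 1)
        mean_distance = sum(domain) / len(domain)
        prestige[target] = share / mean_distance
    return prestige

# Toy nominations: a -> b -> c and d -> c.
vertices = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("b", "c"), ("d", "c")]
prestige = proximity_prestige(arcs, vertices)
print(prestige)
```

In the toy network c is reached by all three others at distances 1, 1, and 2, giving 1 / (4/3) = 0.75, while b is reached only by a, giving (1/3) / 1.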


Ranks: acyclic networks
• Discrete ranks or levels
• Example: student_government.paj
• Local network structure:
  – Triadic analysis and the triad census
• Overall network structure:
  – Strong components and ranks
  – Symmetric-acyclic decomposition


Balance-theoretic models

Model | Ties within a cluster | Ties between ranks | Permitted triads
Balance | symmetric ties within a cluster, no ties between clusters; max 2 clusters | none | 102, 300
Clusterability | idem, no restriction on the number of clusters | idem | + 003
Ranked Clusters | idem | asymmetric ties from each vertex to all vertices on higher ranks | + 021D, 021U, 030T, 120D, 120U
Transitivity | idem | null ties may occur between ranks | + 012
Hierarchical Clusters | asymmetric ties within a cluster allowed provided that they are acyclic | idem | + 120C, 210
no balance-theoretic model (‘forbidden’) | | | 021C, 111D, 111U, 030C, 201
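Because each model permits the triads of the models before it plus a few more, the permitted-triads column can be read cumulatively. The sketch below encodes that column and reports the first model whose permitted set covers every triad type observed in a census. It is a strict, illustrative check with hypothetical names and invented counts; real analyses usually compare observed and expected frequencies rather than demanding a perfect fit.

```python
# Triad types added by each model, in the order given in the table.
PERMITTED = [
    ("Balance", {"102", "300"}),
    ("Clusterability", {"003"}),
    ("Ranked Clusters", {"021D", "021U", "030T", "120D", "120U"}),
    ("Transitivity", {"012"}),
    ("Hierarchical Clusters", {"120C", "210"}),
]
FORBIDDEN = {"021C", "111D", "111U", "030C", "201"}

def first_fitting_model(census):
    """Return the first model permitting all observed triad types, or None.

    census maps a triad type such as "030T" to its observed count.
    """
    observed = {t for t, count in census.items() if count > 0}
    allowed = set()
    for model, extra in PERMITTED:
        allowed |= extra
        if observed <= allowed:
            return model
    return None

def forbidden_triads(census):
    """Observed triad types that no balance-theoretic model permits."""
    return {t for t, count in census.items() if count > 0} & FORBIDDEN

# Invented census counts.
model_a = first_fitting_model({"102": 10, "300": 4, "003": 7})
model_b = first_fitting_model({"012": 3, "102": 1})
model_c = first_fitting_model({"300": 2, "201": 1})
print(model_a, model_b, model_c, forbidden_triads({"300": 2, "201": 1}))
```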


Triad types and models

1 - 003

2 - 012

3 - 102

4 - 021D

5 - 021U

9 - 030T

12 - 120D

13 - 120U

14 - 120C

15 - 210

16 - 300

[Figure legend: Balance, Clusterability, Ranked Clusters, Transitivity, Hierarchical Clusters]


Blockmodeling
• Matrix and permutation for visualization
• Blockmodel
  – Partition of vertices into classes (positions)
  – Image matrix of relations among blocks
• Types of blockmodels
  – Cohesive subgroups
  – Center-periphery structure
  – Ranks
• Types of equivalence:
  – Structural equivalence: hierarchical clustering
  – Regular equivalence


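Cohesive subgroups

[Figure: network matrix with vertices ordered by subgroup. Vertex labels: Domingo, Carlos, Alejandro, Eduardo, Frank, Hal, Karl, Bob, Ike, Gill, Lanny, Mike, John, Xavier, Utrecht, Norm, Russ, Quint, Wendle, Ozzie, Ted, Sam, Vern, Paul]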


Image matrix

Class | Spanish | English – young | English – old
Spanish | Complete | Empty | Empty
English – young | Empty | Complete | Empty
English – old | Empty | Empty | Complete
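An image matrix like the one above can be derived from a network and a partition by inspecting each block, that is, the cells between the members of one class and the members of another. The sketch below labels a block "Complete" when every possible tie is present, "Empty" when none is, and otherwise reports the fraction of ties present. The names and the four-vertex example are hypothetical; a real blockmodel would tolerate some errors instead of requiring perfect blocks.

```python
def image_matrix(arcs, classes):
    """classes maps vertex -> class label.
    Returns {(row_class, column_class): block description}."""
    arcs = set(arcs)
    members = {}
    for v, c in classes.items():
        members.setdefault(c, []).append(v)
    image = {}
    for r, rows in members.items():
        for c, cols in members.items():
            # All cells of the block, skipping the diagonal (no loops).
            cells = [(u, v) for u in rows for v in cols if u != v]
            present = sum(1 for cell in cells if cell in arcs)
            if present == 0:
                image[(r, c)] = "Empty"
            elif present == len(cells):
                image[(r, c)] = "Complete"
            else:
                image[(r, c)] = f"{present}/{len(cells)}"
    return image

# Two hypothetical subgroups with ties only inside each subgroup.
classes = {"p1": "group 1", "p2": "group 1", "q1": "group 2", "q2": "group 2"}
arcs = [("p1", "p2"), ("p2", "p1"), ("q1", "q2"), ("q2", "q1")]
image = image_matrix(arcs, classes)
print(image)
```

Complete blocks on the diagonal and empty blocks elsewhere is the pattern of the cohesive-subgroups image matrix in the table.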


Blockmodel types

[Figure: image matrices for three blockmodel types: Cohesion, Center-periphery, Ranking]


Regular equivalence and errors

[Figure: matrix with rows and columns pminister, minister1 to minister7, and advisor1 to advisor3, grouped into the classes pminister, ministers, and advisors; cells marked with X]


Networks and time
• Longitudinal network: a network measured at different time points
  – Example: Sampson.paj
• Diffusion: vertex property changing over time, e.g., adoption
  – Example: ModMath.paj
• Descent: a relation spanning time
  – Genealogies: descent by birth; structural relinking
  – Citations: descent of ideas; main path analysis
  – Example: Gondola_Petrus.ged, centrality_literature.paj


Genealogies
• Data format: GEDCOM 5.5 standard
  www.gendex.com/gedcom55/55gcint.htm
• Software:
  – Genealogical Information Manager: www.mindspring.com/~dblaine/gimhome.html
  – Personal Ancestral File: www.familysearch.org


Networks and statistics
• Statistical relations among properties of vertices: partitions and vectors
• Example: social and structural prestige (Ch. 9)
• In Pajek: discrete (Cramer’s V, Rajski, rank correlation) and continuous (Pearson correlation, regression)
• Pajek to R: see afternoon session
• Pajek to other statistics software: paste numbers from partition or vector into statistics software datasheet


Building your own
• Macro: sequences of commands performed on selected data objects
• Example: exposure in a diffusion network
• Macro commands:
  – Record
  – Add message: add comment
  – Play


Relations among chapters

Ch.1 - Looking for social structure
Ch.2 - Attributes and relations
Ch.3 - Cohesive subgroups
Ch.4 - Sentiments and friendship
Ch.5 - Affiliations
Ch.6 - Center and periphery
Ch.7 - Brokers and bridges
Ch.8 - Diffusion
Ch.9 - Prestige
Ch.10 - Ranking
Ch.11 - Genealogies and citations
Ch.12 - Blockmodels