1 Workshop on Basic Network Methods John Skvoretz Department of Sociology [email protected] June 2011 Aims Introduce COM researchers to the basic concepts of social network analysis Orient participants to the SNA packages UCINET and NetDraw Describe methods relevant to the medical research community

Workshop on Basic Network Methods


Page 1: Workshop on Basic Network Methods

1

Workshop on Basic Network Methods

John Skvoretz
Department of Sociology
[email protected]

June 2011

Aims

• Introduce COM researchers to the basic concepts of social network analysis

• Orient participants to the SNA packages UCINET and NetDraw

• Describe methods relevant to the medical research community

Page 2: Workshop on Basic Network Methods

2

Schedule

• The network perspective

• Navigating UCINET and NetDraw

• Data collection and entry

• Basic concepts of network analysis with examples using UCINET and NetDraw

• Workbook exercises, advanced topics, applications in the medical research community, other software packages for network analysis

Warning …

Page 3: Workshop on Basic Network Methods

3

Examples

Friendships among researchers

• Academics interested in interdisciplinary research (Freeman and Freeman 1979)

– Friendship

– Discipline

– Citations

1 = sociology
2 = anthropology
3 = mathematics/statistics
4 = other

Page 4: Workshop on Basic Network Methods

4

Contact between CTSA institutions

• Clinical & Translational Science Award institutions (Skvoretz 2009)

– Coattendance of institutional representatives at key function committee meetings of the CTSA Consortium

– Cohort (color)

– Tie strength (thickness of edge)

Contact between CTSA institutions

• Strong ties only

[Figure: sociograms of contact between CTSA institutions; node labels run from 1C01 to 3C14]

Page 5: Workshop on Basic Network Methods

5

Friendship in high school

Page 6: Workshop on Basic Network Methods

6

The network perspective

Page 7: Workshop on Basic Network Methods

7

The network perspective

• Mainstream social science analysis (MSSA) vs. social network analysis (SNA)

• Theoretical and methodological principles of network analysis

MSSA vs SNA

• Mainstream social science analysis

– Focuses on case outcomes as a function of case attributes

• Predict remission of chemical dependency patients as a function of patient attributes including care received

• Predict adoption of evidence-based practices by hospitals as a function of their values and orientation

– Data are organized in cases by variables format

– Identify one column as the outcome to be explained by values in the other columns (attributes)

Page 8: Workshop on Basic Network Methods

8

MSSA vs SNA

• Social network analysis

– Shifts from attributes of cases to ties/relations between cases as explanatory factors

– Pairs/dyads of cases, not single cases, are the units of analysis

– Pairs/dyads interconnect to form networks

– Case outcomes as a function of

• the overall pattern of connection in the network

• a case’s “position” in the overall pattern

SNA’s shift to relations between cases

• Theoretical consequences

– Structure matters – how groups are connected through networks makes a difference

– Position matters – how location in the pattern of connection determines a node’s opportunities and constraints

– Indirect connections matter – how your direct ties link you indirectly to strangers can have major impact

Page 9: Workshop on Basic Network Methods

9

SNA’s shift to relations between cases

• Methodological consequences

– Data must be collected on ties as well as individual cases

– Each additional case = 1 more case but 2N more ordered pairs of cases (N more unordered pairs)!
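The growth in pairs can be checked with a few lines of arithmetic. This is only an illustrative sketch; the function names are made up for this example and are not part of UCINET.

```python
def pair_counts(n):
    """Return (unordered pairs, ordered pairs) among n cases."""
    unordered = n * (n - 1) // 2
    ordered = n * (n - 1)
    return unordered, ordered


def added_pairs(n):
    """Pairs gained when one more case joins an existing set of n cases."""
    before_unordered, before_ordered = pair_counts(n)
    after_unordered, after_ordered = pair_counts(n + 1)
    return after_unordered - before_unordered, after_ordered - before_ordered


# Going from 10 to 11 cases adds 10 unordered pairs and 20 ordered pairs.
extra_unordered, extra_ordered = added_pairs(10)
```

Ordered pairs are the relevant count when ties are directed (i chooses j is distinct from j chooses i).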

Relations between cases: measurement

From: Borgatti, S.P., A. Mehra, D.J. Brass & G. Labianca. 2009. “Network Analysis in the Social Sciences.” Science 323:892-5.

Page 10: Workshop on Basic Network Methods

10

Properties of interest

• Group cohesion

• Density

• Path lengths

• Clustering and subgrouping

• Homophily – background attributes and clustering

• Assortative mixing

Properties of interest

• Importance in overall pattern of connection

• Centrality

• Activity

• Distance to others

• Value as an intermediary – “bridgeness”

• Connected to well-connected others

Page 11: Workshop on Basic Network Methods

11

Properties of interest

• The local neighborhood

• Strength of ties

• Reciprocated relations

• Closure vs structural holes – local clustering (or not)

• Acquaintance overlap

• Diversity of associates

Hypotheses tested in the literature

• Organizations respond better to crisis when friendships cross department boundaries.

• Degree centrality in a price fixing conspiracy network increases the likelihood of a guilty verdict.

• Firms with open collaboration networks (many structural holes) are less innovative.

• Adolescents with very large or very small friendship networks experience more depressive symptoms depending on their gender and the closure of their networks.

Page 12: Workshop on Basic Network Methods

12

Hypotheses tested in the literature

• Supply-chain managers with open networks of discussion partners had better ideas to improve supply chain management.

• Knowledge of the work of a colleague without direct contact

– depends on the number of paths of length 2 to that colleague.

– almost never occurs if the shortest path is length 3 regardless of the number of such paths.

Hypotheses tested in the literature

• Interethnic marriages occur at much lower than chance levels but more frequently in more ethnically diverse populations.

• Individual adoption of a health behavior spreads farther and faster in clustered lattice networks than in corresponding random networks.

Page 13: Workshop on Basic Network Methods

13

Navigating UCINET and NetDraw

UCINET shortcuts
UCINET menus
NetDraw shortcuts
NetDraw menus

UCINET Main Screen: Shortcuts

Launch NetDraw

Launch Matrix Algebra

Display a dataset

Launch Notepad

Import data via Spreadsheet

Set default folder

Default folder

Spreadsheet Editor

Page 14: Workshop on Basic Network Methods

14

UCINET Main Screen: Menus

Read/Write Data

Look at data

Manipulate datasets

Reshape data

UCINET Main Screen: Menus

Collapse network based on some information

Transform network by changing cell values based on some mathematical operation

Transform adjacency matrix into other types

Page 15: Workshop on Basic Network Methods

15

UCINET Main Screen: Menus

Factor analysis type analytical tools

Utilities for calculating correlations and summarizing distributions

Utilities for viewing partitions and associations between variables

UCINET Main Screen: Menus

Finding subgroups and analyzing paths and density

Analyzing centrality and positional structures

Misc analyses of network structure

Two mode analyses

Page 16: Workshop on Basic Network Methods

16

NetDraw Main Screen: Shortcuts

Set node shape by attribute value

Draw using various algorithms

Open attribute file

Set node color by attribute value

Open file

Open network file

NetDraw Main Screen: Menus

Drawing layout choices

Utilities to shape a drawing

Page 17: Workshop on Basic Network Methods

17

NetDraw Main Screen: Menus

Subgroup analysis creates partition membership attributes

Calculates centrality and other node-level properties and creates associated nodal attributes

NetDraw Main Screen: Menus

Change properties of drawing elements

Page 18: Workshop on Basic Network Methods

18

My first data set ...

Page 19: Workshop on Basic Network Methods

19

Page 20: Workshop on Basic Network Methods

20

Page 21: Workshop on Basic Network Methods

21

Page 22: Workshop on Basic Network Methods

22

4 = close friend (fiend?)
3 = friend
2 = person I’ve met
1 = person I’ve heard of, but not met
0 = person unknown

Page 23: Workshop on Basic Network Methods

23

Page 24: Workshop on Basic Network Methods

24

Page 25: Workshop on Basic Network Methods

25

Page 26: Workshop on Basic Network Methods

26

Page 27: Workshop on Basic Network Methods

27

Page 28: Workshop on Basic Network Methods

28

Page 29: Workshop on Basic Network Methods

29

Page 30: Workshop on Basic Network Methods

30

Page 31: Workshop on Basic Network Methods

31

Page 32: Workshop on Basic Network Methods

32

1 = sociology
2 = anthropology
3 = mathematics/statistics
4 = other

Page 33: Workshop on Basic Network Methods

33

1 = sociology
2 = anthropology
3 = mathematics/statistics
4 = other

Page 34: Workshop on Basic Network Methods

34

Data collection and entry

Page 35: Workshop on Basic Network Methods

35

Data collection

• Types of network data

• Complete – all ties linking elements of a closed population

• Ego network – set of ties surrounding a sampled unit

• One mode vs two mode

• One mode – ties are between nodes that are the same type of entity (person to person, organization to organization)

• Two mode – ties are between nodes of two different types of entities (person to event, RCT to disease)

Data collection

• Methods

• survey & questionnaires (focus of data quality studies)

• archives, especially recently electronic records

• observation

• diaries

• experiments

Page 36: Workshop on Basic Network Methods

36

Data collection

• Basic issues

• Measure existing ties (behaviorist) or ties as perceived by actors in them (cognitive) – the dependent variable may matter (diffusion vs influence)

• Temporality – measure episodic contacts or routinized recurrent interactions – static bias, dynamic research must define when ties start, change, end

• Accuracy and reliability – precise description of ties composing a network (accuracy is the main concern) or indicators of conceptual variables (validity and reliability are main concerns)

Data collection

• Design considerations

• complete network required if compound or indirect linkages are important

• ego network ok if focus is on actor outcomes

• boundary specification – very important because omission is a big problem

• realist uses perception of actors, nominalist uses definition by observer (researcher)

• membership criteria of organization, social tie tracing in snowball sample, participation in set of events

• for ego networks, defined by name generator

Page 37: Workshop on Basic Network Methods

37

Data collection

• Design considerations

• Sampling

• not relevant for complete network studies

• random for ego networks means generalizations about egos can be made but not about dyads

• from a network – usable only to estimate some properties (like density or contact between subgroups)

Data collection

• Data sources

• Surveys and questionnaires self report

• unaided recall

• complete roster

• dichotomous indicators vs intensity judgments

• name generators

• name interpreters

Page 38: Workshop on Basic Network Methods

38

Data collection

• Data sources

• surveys and questionnaires self report

• contacts with types of people (do you know a plumber?)

• ties between organizations from an informant

• archives – interlocking directorates, citations, trade, electronic records – big question is how such indirect indicators correspond to more direct indicators of interaction

• experiments – the small world

Data collection

• Lessons learned

• do not constrain number of alters reported

• roster provides more complete coverage than recall

• recall gets stronger ties

• respondents cannot give useful detail on specific episodes or timing of interaction but are good at the general picture

• name interpreter data good for observable attributes, poor for attitudes/unobservables

• data on broad features of ties (duration, frequency) are good

Page 39: Workshop on Basic Network Methods

39

Data collection

• Examples of instruments

• General Social Survey (GSS)

• American National Election Studies (ANES)

• UC Davis CTSA Community Engagement Survey (UCD)

• SelectSurvey (SS)

• Web of Science search (WoS)

• CTSA Key Function Committee minutes (KFC)

Getting data into UCINET

• Options

• DL files – text files, various formats available, easily created by any text editor

• VNA files – text files, native format for NetDraw, enables combining both tie data and attribute data of nodes in a single file

• Data in native formats of other network software (Pajek)

• Raw data

• Excel files

Page 40: Workshop on Basic Network Methods

40

Common DL formats

dl n=20 format=fullmatrix
data:
0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0

Common DL formats

dl n=28
data:
0 0 1 6 0 0 0 2 1 2 0 0 0 6 1 0 0 1 0 4 0 0 2 0 0 6 0 2
1 0 2 1 1 0 0 0 1 0 0 1 1 0 2 0 0 3 1 0 0 0 1 0 2 1 0 3
0 0 0 12 0 0 0 6 2 4 1 2 1 3 6 6 1 5 3 1 0 6 0 5 2 0 0 1
0 0 1 0 3 1 0 2 0 2 0 0 4 0 0 7 0 1 6 0 3 1 0 1 1 1 0 2
0 0 1 0 0 4 0 6 2 2 2 0 0 0 6 2 0 7 1 0 1 3 2 1 0 7 7 2
0 0 0 0 0 0 2 7 1 5 8 3 1 1 0 1 1 2 1 0 2 1 3 0 0 3 4 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 2 9 2 3 3 0 0 1 4 1 1 1 1 1 0 0 5 4 5
0 0 0 0 0 0 0 0 0 1 0 2 2 0 0 2 0 2 0 0 0 1 0 1 1 0 0 1
0 0 0 0 0 0 0 0 0 0 2 1 2 0 2 0 1 3 1 0 0 0 4 1 0 0 1 2
0 0 0 0 0 1 0 0 0 0 0 1 1 0 2 2 0 3 3 0 1 1 1 1 0 1 4 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 6 0 1 1 1 2 4 0 0 3 1 0 2
0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 1 3 1 0 2 3 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 1 1 0 1 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 5 0 2 0 1 4 6 1
0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 2 1 0 1 0 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 2 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 4 0 0 0 0 1 2 5 2 1 1
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 4 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 4
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 0 7 2
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0

Page 41: Workshop on Basic Network Methods

41

Common DL formats

DL
N=17
FORMAT = FULLMATRIX DIAGONAL PRESENT
ROW LABELS:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17
COLUMN LABELS:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17
DATA:
0 1 1 0 0 0 1 1 0 1 0 0 0 0 1 0 0
1 0 1 0 0 0 0 1 0 1 1 0 0 0 1 0 0
0 1 0 0 0 0 1 1 0 1 0 0 0 1 0 1 0
0 0 1 0 0 0 0 1 0 1 1 0 0 1 1 0 0
0 0 1 0 0 0 1 0 0 1 0 1 0 0 1 1 0
0 1 0 0 1 0 1 0 0 1 1 0 0 0 0 1 0
…

Common DL formats

dl nr=38 nc=5
row labels embedded
col labels embedded
data:
E20070129 E20070411 E20071113 E20080208 E20081108
Columbia 1 1 1 5 2
Duke 1 0 0 1 1
Mayo 1 0 1 1 1
OHSU 2 1 1 2 1
Rockefeller 1 1 1 1 1
UCDavis 0 1 1 1 1
UCSF 1 1 1 1 1
UPenn 1 1 0 1 0
Pittsburgh 1 1 2 2 1
Rochester 1 1 0 2 1
UTHSC 1 1 1 1 1
Yale 1 1 0 2 2
CaseWestern 0 0 2 1 0
Emory 0 0 2 3 2
JohnsHopkins 0 0 1 3 2
Chicago 0 0 1 1 1
Iowa 0 0 1 1 2
UMichigan 0 0 1 0 0
Dallas 0 0 0 3 1
UWashington 0 0 2 3 1
UW-Madison 0 0 1 2 2
Vanderbilt 0 0 1 1 0
...
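A two-mode (institution by meeting) table like the one above can be turned into one-mode coattendance ties by counting the meetings a pair of institutions both attended. The sketch below is not UCINET's own routine. It uses four rows from the table, assumes that any cell value above zero means "attended", and uses hypothetical names (`attendance`, `coattendance`).

```python
from itertools import combinations

# Four rows from the institution-by-meeting table (one column per meeting).
attendance = {
    "Columbia": [1, 1, 1, 5, 2],
    "Duke": [1, 0, 0, 1, 1],
    "Mayo": [1, 0, 1, 1, 1],
    "CaseWestern": [0, 0, 2, 1, 0],
}


def coattendance(rows):
    """Count, for each pair of row nodes, the events both attended."""
    ties = {}
    for a, b in combinations(sorted(rows), 2):
        # Assumption: a cell value above zero counts as attendance.
        shared = sum(1 for x, y in zip(rows[a], rows[b]) if x > 0 and y > 0)
        ties[(a, b)] = shared
    return ties


ties = coattendance(attendance)
```

The resulting counts are valued ties; dichotomizing them at a threshold is one way to obtain a "strong ties only" network.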

Page 42: Workshop on Basic Network Methods

42

Common DL formats

dl n = 43 format = edgelist1
labels:
JohnSkvoretz,SteveBorgatti,DaliaColon,JamesCavendish,AnnaMarieKoehler-Shepley,HarisMemic,ElisaBellotti,TracyBurkett,IlanTalmud,AndySnider,JayA'Hern,KatrienCleemput,GuidoConaldi,DavidLazer,Pooya????,ElizabethVaquera,ElsaOntiveros,Anne-MarieNiekamp,TonyImhof,RebeccaThys,ToreOpsahl,GeertjanVries,LukasZenk,AdrienneKinsella,NadineKegen,JuliaBrennecke,BrianaHall,JaneFountain,LuisLoredo,FilipAgneessens,BruceCochrane,GretchenKoehler,JonathanSkvoretz,JosRitter,BobbyBrame,JulieVinup,LeaEllwardt,JosieMcLeod,ThomasFriemel,JenniferStortz,MistySkvoretz,IreneTroy,BenjaminElbirt
labels embedded:
data:
JohnSkvoretz,SteveBorgatti
JohnSkvoretz,DaliaColon
JohnSkvoretz,JamesCavendish
JohnSkvoretz,AnnaMarieKoehler-Shepley
JohnSkvoretz,HarisMemic
JohnSkvoretz,TracyBurkett
JohnSkvoretz,IlanTalmud
…
SteveBorgatti,ElisaBellotti
SteveBorgatti,IlanTalmud
SteveBorgatti,DavidLazer
SteveBorgatti,BenjaminElbirt
JamesCavendish,ElizabethVaquera
BrianaHall,AnnaMarieKoehler-Shepley
GretchenKoehler,AnnaMarieKoehler-Shepley
JosRitter,AnnaMarieKoehler-Shepley
KatrienCleemput,HarisMemic
…

Common DL formats

dl n = 73, format = nodelist
labels:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73
data:
1 14 15 21 54 55
2 21 22
3 9 15
4 5 18 19 43
5 19 43
6 13 20 22
7 17
8 14 17
9 12 20 21 22 51
11 19 50 52 53
12 20 21 22
13 17 20 21 22
14 21 22
15 20
16 18 41 43
17 7 8
18 11 16 19
19 4 11 16 18 27
20 6 12 21 22 38
21 22 51 54 55
22 20 21 38 51
23 40 43 50 52 53 60 62 65 68
…
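To see what a nodelist encodes, the data lines can be read as "ego followed by the nodes ego is tied to" and expanded into a 0/1 matrix. The following is a rough sketch of that reading, not UCINET's importer; `parse_nodelist` and `to_sociomatrix` are names invented here, and the three sample lines are taken from the nodelist above.

```python
def parse_nodelist(lines):
    """Map each ego (first number on a line) to the list of its alters."""
    adjacency = {}
    for line in lines:
        fields = [int(token) for token in line.split()]
        if fields:
            adjacency[fields[0]] = fields[1:]
    return adjacency


def to_sociomatrix(adjacency, n):
    """Build an n x n 0/1 matrix; node k occupies row and column k - 1."""
    x = [[0] * n for _ in range(n)]
    for ego, alters in adjacency.items():
        for alter in alters:
            x[ego - 1][alter - 1] = 1
    return x


sample = ["7 17", "8 14 17", "17 7 8"]
adj = parse_nodelist(sample)
x = to_sociomatrix(adj, 20)  # 20 is enough to hold the labels in the sample
```

For large, sparse networks the nodelist is far more compact than the full matrix, which is why it is a convenient entry format.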

Page 43: Workshop on Basic Network Methods

43

VNA files

• Can combine in one file both nodal attribute data and network relational data

• Values of attributes can be text, unlike UCINET in which values must be numbers – when UCINET imports a vna file it converts text values to numerical values

• Example 1

• Example 2

Excel files

• Copy and paste into spreadsheet utility

• Example

Page 44: Workshop on Basic Network Methods

44

Advice

• For small networks or small attribute files, use spreadsheet utility or Excel

• If keeping text as values of attributes is important, create a vna file with just node data in it – to use it in UCINET analysis, export from NetDraw

• If network is large, create edgelist or nodelist DL file for the network

Examples

• Medical innovation study

• Codebook

• Data

• UCINET file(networks), file(attributes)

• RCT study

• Data

• Workshop study

• Raw data

Page 45: Workshop on Basic Network Methods

45

Basic concepts: graphs and matrices

Basics

• Graph – a set of items called vertices/nodes with undirected connections between them, called edges

• Digraph – a set of vertices/nodes with directed connections between them, called arcs

• More complicated types

– More than one type of connection

– More than one type of vertex (two mode networks)

– Edges/arcs may carry weights

– Multiple timepoints

Page 46: Workshop on Basic Network Methods

46

Basics

• Sociogram – a graph or digraph representing the ties among individuals in a population

• Sociomatrix – a square table or matrix representing the location of ties between individuals in a population

• One row per person, one column per person, rows ordered from top to bottom and columns from left to right in the same order

• Entry in the ith row and the jth column, xij, is the tie indicator for the i,j pair – in the simple case 1 means a tie is present and 0 means it is absent

Basics

sociogram

Page 47: Workshop on Basic Network Methods

47

Basics

sociomatrix

Basics

sociogram

[Figure: sociogram of CTSA institutions; node labels run from 1C01 to 3C14]

Page 48: Workshop on Basic Network Methods

48

Basics

sociomatrix

Basics

• Adjacency – node i is adjacent to node j if xij = 1

• Dyad – a pair of nodes and the possible ties among them

• In graphs:

• In digraphs:

• Triad – a triple of nodes and the possible ties among them

• In digraphs, there are 16 different triad types

Page 49: Workshop on Basic Network Methods

49

Basics

• Path – a sequence of adjacent nodes in which all nodes and edges are distinct

• path length is the number of edges or arcs (semipaths ignore the direction of arcs)

• Geodesic – the shortest path between two nodes

• Component – a maximal set of vertices such that a path exists between any two nodes in the set

• a graph with more than one component is disconnected

• isolates are components

Basics

• Cutpoint – a node that if removed would increase the number of components

• Bridge – a tie that if removed would increase the number of components

• Local bridge – a tie that if removed would increase the length of the shortest path connecting the two nodes to at least length three
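The definitions of component, cutpoint and bridge translate directly into a brute-force check: remove the node or tie and see whether the number of components goes up. The sketch below does exactly that for an undirected graph. It is meant to illustrate the definitions on a small made-up graph, not to be efficient; the function names are this example's own.

```python
from collections import deque


def components(nodes, edges):
    """Return the connected components of an undirected graph as sets."""
    neighbors = {v: set() for v in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, parts = set(), []
    for start in nodes:
        if start in seen:
            continue
        part, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbors[v] - part:
                part.add(w)
                queue.append(w)
        seen |= part
        parts.append(part)
    return parts


def bridges(nodes, edges):
    """Ties whose removal increases the number of components."""
    base = len(components(nodes, edges))
    return [e for e in edges
            if len(components(nodes, [f for f in edges if f != e])) > base]


def cutpoints(nodes, edges):
    """Nodes whose removal increases the number of components."""
    base = len(components(nodes, edges))
    found = []
    for v in nodes:
        rest = [u for u in nodes if u != v]
        kept = [e for e in edges if v not in e]
        if len(components(rest, kept)) > base:
            found.append(v)
    return found


# Made-up example: a triangle a-b-c with a tail c-d-e.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
```

In this example the two tail ties are bridges and the nodes `c` and `d` are cutpoints, while no tie inside the triangle is a bridge.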

Page 50: Workshop on Basic Network Methods

50

Basics

Bridge

Cutpoint

Local Bridge

Basic concepts: simple network-level properties

Page 51: Workshop on Basic Network Methods

51

Network-level properties

• Density (Δ) – ratio of number of edges (arcs) present to the maximum number possible

• Max edges = $N(N-1)/2$

• Max arcs = $N(N-1)$

• Often used as a measure of cohesion

• Average number of ties per person = $\Delta(N-1)$

• In large populations, density must be small, otherwise average number of ties is huge

Network-level properties

• Reachability or connectivity– proportion of pairs connected by a path of finite length

• Fragmentation – proportion of pairs not connected by a path of finite length

• Average geodesic – average length of shortest path among connected pairs

Page 52: Workshop on Basic Network Methods

52

Network-level properties

Max edges = 32x31/2 = 496

Edges present = 43

Density = 43/496 = 0.087

Ave edges per node = 2.69

Reachability = 0.61

Fragmentation = 0.39

Ave geodesic = 3.04
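The density and average-ties figures for this 32-node, 43-edge network can be reproduced from the definitions. A minimal sketch follows; the function name `density` is chosen here for illustration.

```python
def density(n, ties, directed=False):
    """Ties present divided by the maximum number possible."""
    max_ties = n * (n - 1) if directed else n * (n - 1) // 2
    return ties / max_ties


n_nodes, n_edges = 32, 43
delta = density(n_nodes, n_edges)
average_ties = delta * (n_nodes - 1)  # same as 2 * n_edges / n_nodes
```

Rounded, `delta` is 0.087 and `average_ties` is 2.69, matching the values reported above.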

Network-level properties

Page 53: Workshop on Basic Network Methods

53

Network-level properties

Routine counts arcs as ties and each edge equals 2 arcs, therefore 2 ties.

Network-level properties

Page 54: Workshop on Basic Network Methods

54

Network-level properties

Network-level properties

Page 55: Workshop on Basic Network Methods

55

Network-level properties

Basic concepts: node-level properties

Page 56: Workshop on Basic Network Methods

56

Node-level properties

• Key attribute of the “position” of a node in the overall pattern of connection – its importance or centrality

• Degree
• Closeness
• Betweenness
• EV centrality

• The “position” of a node in its local neighborhood

• Closure/clustering among contacts

Node-level properties: centrality

Page 57: Workshop on Basic Network Methods

57

Node-level properties: centrality

• Who is more important, more central?

• Participants in a summer methods camp by gender (color), role (shape) – at least one chooses the other as being in his/her top 3 “had most contact with”

Node-level properties: centrality

• Important means being involved, being active

• Number of direct ties or degree – degree centrality

• More active nodes are more important nodes

raw: $C_D(i) = d_i = x_{i+} = \sum_{j=1}^{N} x_{ij}$

normed: $C'_D(i) = \dfrac{d_i}{N-1}$

• Correlates with opportunity to directly influence and be influenced, visibility, exposure to network flows
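Raw degree centrality is a row sum of the sociomatrix, and the normed version divides by N − 1. The short sketch below shows this on a made-up four-node star; `degree_centrality` is a name used only for this example.

```python
def degree_centrality(x):
    """Return (raw, normed) degree centrality from a 0/1 sociomatrix."""
    n = len(x)
    raw = [sum(row) for row in x]
    normed = [d / (n - 1) for d in raw]
    return raw, normed


# Made-up star: node 0 is tied to nodes 1, 2 and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
raw_degree, normed_degree = degree_centrality(star)
```

In a digraph, row sums give outdegree; column sums would give indegree.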

Page 58: Workshop on Basic Network Methods

58

Node-level properties: centrality

• Important means being close to others, having short paths to them if not direct connections – closeness centrality

• More important nodes have short paths to many others

raw: $C_{FC}(i) = \dfrac{1}{\sum_{j=1}^{N} g_{ij}}$, where $g_{ij}$ = geodesic distance from $i$ to $j$

normed: $C'_{FC}(i) = (N-1)\,C_{FC}(i)$

• Correlates with ability to reach all others (as sender), be reached by all others (as receiver)
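Closeness can be computed by running a breadth-first search from a node to obtain its geodesic distances and then inverting their sum. The sketch below assumes an undirected, connected graph stored as a dictionary of neighbor lists; the function names and the three-node example are illustrative only.

```python
from collections import deque


def geodesic_distances(neighbors, source):
    """Breadth-first search: shortest path length from source to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(neighbors, i):
    """Return (raw, normed) closeness of node i."""
    dist = geodesic_distances(neighbors, i)
    if len(dist) < len(neighbors):
        raise ValueError("graph is not connected; the distance sum is undefined")
    farness = sum(dist.values())
    raw = 1 / farness
    normed = (len(neighbors) - 1) * raw
    return raw, normed


# Made-up path 1 - 2 - 3: the middle node is closest to everyone.
path = {1: [2], 2: [1, 3], 3: [2]}
middle = closeness(path, 2)
end = closeness(path, 1)
```

Disconnected networks need a convention for unreachable pairs, which is why the sketch refuses them rather than guessing one.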

Node-level properties: centrality

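• Important means being on the shortest paths between pairs of others – betweenness centrality

• More important nodes are on the shortest paths between many pairs of others

raw: $C_B(i) = \sum_{j<k;\ j,k \neq i} \dfrac{g_{jk}(i)}{g_{jk}}$, where $g_{jk}$ is the number of geodesics between $j$ and $k$ and $g_{jk}(i)$ is the number of those that pass through $i$

normed: $C'_B(i) = \dfrac{2\,C_B(i)}{(N-1)(N-2)}$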

• Correlates with opportunity to broker relations of others, control flows, have cosmopolitan viewpoint and access to diverse data
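Betweenness can be computed directly from its definition: for every pair j, k, count the geodesics between them and the share that pass through i. The sketch below does this for an undirected graph with a breadth-first search from each node. It is a plain, unoptimized illustration with names chosen for this example; production software typically uses a faster algorithm.

```python
from collections import deque
from itertools import combinations


def bfs_counts(neighbors, source):
    """Return (distance, number of geodesics) from source to each reachable node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
    return dist, sigma


def betweenness(neighbors):
    """Return (raw, normed) betweenness for every node of an undirected graph."""
    info = {v: bfs_counts(neighbors, v) for v in neighbors}
    n = len(neighbors)
    raw = {v: 0.0 for v in neighbors}
    for j, k in combinations(neighbors, 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j:
            continue  # j and k are not connected
        for i in neighbors:
            if i in (j, k) or i not in dist_j:
                continue
            dist_i, sigma_i = info[i]
            # i lies on a j-k geodesic when the two legs add up to the j-k distance.
            if k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                raw[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    normed = {v: 2 * raw[v] / ((n - 1) * (n - 2)) for v in raw}
    return raw, normed


# Made-up examples: a star (center "c") and a four-node cycle.
star = {"c": ["a", "b", "d"], "a": ["c"], "b": ["c"], "d": ["c"]}
cycle = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
star_raw, star_normed = betweenness(star)
cycle_raw, cycle_normed = betweenness(cycle)
```

The star's center sits on every geodesic between the other nodes and gets the maximum normed score of 1; in the cycle each node carries half of one pair's geodesics.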

Page 59: Workshop on Basic Network Methods

59

Node-level properties: centrality

• Important means being connected to others who are important – EV (eigenvector) centrality

• Important nodes are connected to other important nodes – it is not whom you know but whom those you know know

• Correlates with opportunity for indirect influence, “behind the scenes” power
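One common way to obtain eigenvector centrality is power iteration: repeatedly replace each node's score with the sum of its neighbors' scores and rescale. The sketch below assumes a connected, undirected 0/1 sociomatrix and rescales so the largest score is 1. It adds each node's own score at every step, which leaves the leading eigenvector unchanged but avoids oscillation on bipartite graphs such as stars. Other software may scale the scores differently.

```python
def eigenvector_centrality(x, iterations=200, tol=1e-10):
    """Power iteration on a symmetric 0/1 matrix; scores scaled to a maximum of 1."""
    n = len(x)
    score = [1.0] * n
    for _ in range(iterations):
        # New score = own score + sum of neighbours' scores (iteration on A + I).
        new = [score[i] + sum(x[i][j] * score[j] for j in range(n))
               for i in range(n)]
        top = max(new)
        new = [value / top for value in new]
        if max(abs(a - b) for a, b in zip(new, score)) < tol:
            return new
        score = new
    return score


# Made-up star: node 0 is tied to nodes 1, 2 and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
ev = eigenvector_centrality(star)
```

For this star the center scores 1 and each leaf about 0.577, that is, 1/√3 of the center's score.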

Node-level properties: centrality

• Degree
• Steve
• Michael
• Holly
• Pam
• Pauline

Page 60: Workshop on Basic Network Methods

60

Node-level properties: centrality

• Closeness
• Michael
• Gery
• Holly
• John
• Russ
• Pauline

Node-level properties: centrality

• Betweenness
• Gery
• Michael
• Pauline
• John
• Holly

Page 61: Workshop on Basic Network Methods

61

Node-level properties: centrality

• EV
• Holly
• Michael
• Harry
• Don
• Pam

Node-level properties: centrality

Page 62: Workshop on Basic Network Methods

62

Node-level properties: centrality

Node-level properties: centrality

Page 63: Workshop on Basic Network Methods

63

Node-level properties: centrality

Node-level properties: centrality

• Measures are usually positively correlated so inconsistent profiles are especially interesting

            Degree   Closeness   Betw’ness   EV
Degree      1.000    0.629       0.626       0.625
Closeness            1.000       0.839       0.630
Betw’ness                        1.000       0.289
EV                                           1.000

Page 64: Workshop on Basic Network Methods

64

Node-level properties: clustering

Node-level properties: clustering

• Opportunity vs constraint

• Bridging vs bonding social capital

• Social control and social support

Closure Closed Open

Structural Holes Few Many

Page 65: Workshop on Basic Network Methods

65

Node-level properties: clustering

• Clustering coefficient – density of ties among ego’s alters

– Effective size

– Efficiency

– Constraint

Clustering Coeff   1.000   0.267
Effective Size     3.000   4.667
Efficiency         0.600   0.778
Constraint         0.360

Clustering Coeff   0.900   0.100
Effective Size     1.333   4.600
Efficiency         0.267   0.920
Constraint         0.642   0.300
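For a binary, undirected ego network, three of these measures follow from two counts: n, the number of alters, and t, the number of ties among them. The sketch below uses the clustering coefficient as the density among alters and the simplified effective-size formula n − 2t/n for binary undirected data; efficiency is effective size divided by n. Constraint is omitted because it needs the full weighting scheme. The names and the tiny example are illustrative, and the numbers are not meant to reproduce the figures shown on the slide.

```python
from itertools import combinations


def ego_measures(neighbors, ego):
    """Return (clustering, effective size, efficiency) for ego's network.

    neighbors maps each node to the set of nodes it is tied to (undirected).
    """
    alters = set(neighbors[ego])
    n = len(alters)
    if n < 2:
        raise ValueError("need at least two alters")
    # t = number of ties among ego's alters
    t = sum(1 for a, b in combinations(alters, 2) if b in neighbors[a])
    clustering = t / (n * (n - 1) / 2)
    effective_size = n - 2 * t / n
    efficiency = effective_size / n
    return clustering, effective_size, efficiency


# Made-up ego networks with three alters each.
open_net = {"ego": {"a", "b", "c"}, "a": {"ego"}, "b": {"ego"}, "c": {"ego"}}
closed_net = {
    "ego": {"a", "b", "c"},
    "a": {"ego", "b", "c"},
    "b": {"ego", "a", "c"},
    "c": {"ego", "a", "b"},
}
open_result = ego_measures(open_net, "ego")
closed_result = ego_measures(closed_net, "ego")
```

With no ties among alters every contact is non-redundant (effective size 3, efficiency 1); when all alters know each other the three contacts are worth only one.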

Node-level properties: clustering

• Clustering coefficient
• Brazey
• Lee
• Bill
• …
• Pat

Page 66: Workshop on Basic Network Methods

66

Node-level properties: clustering

Node-level properties: clustering

Page 67: Workshop on Basic Network Methods

67

Node-level properties: clustering

• Aggregation to a network-level property – definitional to small worldness

– High average clustering – much higher than in random networks of similar size and density

– Short path lengths – on the order of lengths typical of random networks of similar size and density

Basic concepts: cohesive subgroup identification

Page 68: Workshop on Basic Network Methods

68

Cohesive subgroup identification

• Look for subgroups that “hang together”

– Important emergent phenomena such as “communities of practice”

– Interesting relationships to node attributes and characteristics (gender, scientific field)

– Effect on capacity for collective action by the group

– Locus of important social processes (influence, trust, social support)

Cohesive subgroup identification

• Of many methods proposed, consider two

– Direct connections are crucial – the Luce-Perry clique

• Emphasizes how nodes in a subgroup are directly connected to each other

– Ties or links with high “betweenness” are crucial – the boundaries between Girvan-Newman communities

• Emphasizes how nodes in a subgroup are indirectly connected to nodes in other subgroups

Page 69: Workshop on Basic Network Methods

69

Cohesive subgroup identification

• A clique is a maximal complete subgraph of three or more nodes

• All nodes are adjacent to one another and no other node is adjacent to all in the subgraph

• Stand alone connected dyads not considered cliques

• Very strict definition of cohesion

• Nodes may belong to more than one clique – cliques are not necessarily mutually exclusive subgroups, but may overlap
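The clique definition above can be illustrated with the classic Bron-Kerbosch recursion, which enumerates every maximal complete subgraph. The sketch keeps only those with three or more nodes, as the definition requires. It is a basic, unoptimized version with names and a small example graph chosen for illustration; it is not taken from UCINET.

```python
def maximal_cliques(neighbors, min_size=3):
    """Enumerate maximal complete subgraphs with at least min_size nodes.

    neighbors maps each node to the set of nodes adjacent to it (undirected).
    """
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            # Nothing can be added: current is maximal.
            if len(current) >= min_size:
                cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & neighbors[v], excluded & neighbors[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(neighbors), set())
    return sorted(cliques)


# Made-up graph: two overlapping triangles plus a pendant dyad 4-5.
graph = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {2, 3, 5},
    5: {4},
}
found = maximal_cliques(graph)
```

Nodes 2 and 3 belong to both cliques, which shows the overlap mentioned above, and the dyad 4-5 is dropped because it has fewer than three nodes.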

Cohesive subgroup identification

• Ten cliques – three of size 4; seven of size 3

Page 70: Workshop on Basic Network Methods

70

Cohesive subgroup identification

Cohesive subgroup identification

Page 71: Workshop on Basic Network Methods

71

Cohesive subgroup identification

Cohesive subgroup identification

• Communities are subgraphs connected to other communities by high betweenness edges

• A high betweenness edge is on many short paths between pairs of nodes

• To identify communities, successively delete the edge with the highest betweenness score

• Recalculate scores, delete the edge with the highest score, and continue until the target number of communities is achieved

• Yields non-overlapping mutually exclusive subgroups
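The edge-removal procedure described above can be sketched as follows for an undirected, unweighted graph. Edge betweenness is accumulated with a breadth-first search from every node, so each edge's score is counted once from each direction; that doubling does not affect which edge ranks highest. The function names and the example graph are assumptions made for illustration, and ties between equally high edges are broken arbitrarily.

```python
from collections import deque


def edge_betweenness(neighbors):
    """Score each edge by the share of geodesics that run along it."""
    score = {}
    for s in neighbors:
        dist, sigma, order = {s: 0}, {s: 1.0}, []
        preds = {v: [] for v in neighbors}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0.0
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):  # farthest nodes first
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + share
                delta[v] += share
    return score


def components(neighbors):
    """Connected components as a list of sets."""
    seen, parts = set(), []
    for start in neighbors:
        if start in seen:
            continue
        part, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in neighbors[v]:
                if w not in part:
                    part.add(w)
                    stack.append(w)
        seen |= part
        parts.append(part)
    return parts


def girvan_newman(neighbors, target):
    """Delete highest-betweenness edges until `target` communities exist."""
    graph = {v: set(ws) for v, ws in neighbors.items()}  # work on a copy
    parts = components(graph)
    while len(parts) < target:
        score = edge_betweenness(graph)
        if not score:
            break  # no edges left to remove
        a, b = max(score, key=score.get)
        graph[a].discard(b)
        graph[b].discard(a)
        parts = components(graph)
    return parts


# Made-up graph: two triangles joined by the single tie 3-4.
two_triangles = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
communities = girvan_newman(two_triangles, 2)
```

Here the tie 3-4 carries all geodesics between the two triangles, so it is removed first and the two triangles emerge as the communities.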

Page 72: Workshop on Basic Network Methods

72

Cohesive subgroup identification

• Communities with 2, 3, and 4 clusters

Cohesive subgroup identification

Page 73: Workshop on Basic Network Methods

73

Cohesive subgroup identification

Cohesive subgroup identification

• Cliques

• Sometimes too many and too overlapping – analysis of overlap useful

• Very few in sparse graphs yet there may be regions of greater density

• No interesting substructure possible in a clique

• Communities

• Time consuming to compute for large graphs

• Identified groups may not be especially well connected

Page 74: Workshop on Basic Network Methods

74

Basic concepts: position identification

Position identification

• Look for sets of nodes/persons who are connected to others in very similar/identical ways, regardless of their direct or indirect ties to one another

– Positions defined by such sets of persons

– Positions are emergent from the structure of relations

– Persons in the same position are “structurally equivalent”

– Important b/c positions occupied rather than subgroups belonged to can affect outcomes experienced by individuals

Page 75: Workshop on Basic Network Methods

75

Position identification

• Three flavors of equivalence

– Regular – nodes are regularly equivalent if they are equally tied to equivalent others

– Automorphic – nodes are automorphically equivalent if the only thing distinguishing them are their labels

– Structural – nodes are structurally equivalent if they have ties to exactly the same set of others

Position identification

• “Thinking” graph
– Automorphic
– Structural

Page 76: Workshop on Basic Network Methods

76

Position identification

• Automorphic is very computationally intensive – useless on large graphs

• “Perfect” structural equivalence seldom found

• Calculate a measure of equivalence (Euclidean distance between choices given and choices received)

$d_{ij} = \sqrt{\sum_{k \neq i,j} \left[ (x_{ik} - x_{jk})^2 + (x_{ki} - x_{kj})^2 \right]}$

• Define a cutoff score below which pairs of nodes are considered s.e.

• Use cluster analysis to place nodes in s.e. clusters
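The Euclidean distance measure of structural equivalence compares two nodes' rows (choices given) and columns (choices received), skipping their ties to each other and to themselves. A minimal sketch on a made-up four-node digraph follows; `se_distance` is a name used only here.

```python
from math import sqrt


def se_distance(x, i, j):
    """Euclidean distance between the tie profiles of nodes i and j."""
    total = 0
    for k in range(len(x)):
        if k in (i, j):
            continue
        total += (x[i][k] - x[j][k]) ** 2 + (x[k][i] - x[k][j]) ** 2
    return sqrt(total)


# Made-up digraph: nodes 0 and 1 both choose node 2 and are both chosen by node 3.
x = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
d01 = se_distance(x, 0, 1)
d02 = se_distance(x, 0, 2)
```

Nodes 0 and 1 have identical profiles, so their distance is 0 and they are perfectly structurally equivalent; larger distances mean less similar positions.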

Position identification

• Four position solution for CTSA centers
– Cut = 94.5
– Max = 214.5
– Min = 9.8

Page 77: Workshop on Basic Network Methods

77

Position identification

Position identification

Page 78: Workshop on Basic Network Methods

78

Position identification

Basic concepts: compositional effects

Page 79: Workshop on Basic Network Methods

79

Compositional effects

• Exogenous attributes of nodes and network ties

– How are the ties between nodes related to exogenous attributes of the nodes?

• Discipline in the case of researchers

• Award cohort in the case of medical centers

– Homophily of ties – the extent to which ties are between nodes of similar background at greater than chance levels

– Contact diversity of person – the extent to which a person’s ties are to others of diverse background

Compositional effects

• Friendships and discipline

1 2 3 4

1 40 3 5 5

2 3 8 2 2

3 5 2 2 1

4 5 2 1 0

Page 80: Workshop on Basic Network Methods

80

Compositional effects

• But persons vary

Node   Discipline   N Friends   Homophilous
1      1            4           0.500
2      2            7           0.429
3      4            2           0.000
6      1            1           1.000
42     1            3           1.000
43     2            1           0.000
44     4            3           1.000

Compositional effects

• Measures for binary data

– E-I index varies from -1 (total homophily) to +1 (total heterophily)

$\dfrac{N_{External} - N_{Internal}}{N_{Total}}$

– Proportion homophilous

– Heterogeneity of associates

$1 - \sum_k q_k(i)^2$, where $q_k(i)$ = the proportion of $i$’s associates in the $k$th category of an attribute
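All three measures can be computed for one person from the category of ego and the categories of ego's associates. The sketch below uses a made-up ego in category 1 with four associates; the function names are illustrative.

```python
from collections import Counter


def ei_index(ego_category, alter_categories):
    """(external - internal) / total ties, from -1 (homophily) to +1 (heterophily)."""
    internal = sum(1 for c in alter_categories if c == ego_category)
    external = len(alter_categories) - internal
    return (external - internal) / len(alter_categories)


def proportion_homophilous(ego_category, alter_categories):
    """Share of ego's associates in ego's own category."""
    same = sum(1 for c in alter_categories if c == ego_category)
    return same / len(alter_categories)


def heterogeneity(alter_categories):
    """1 minus the sum of squared category proportions among associates."""
    n = len(alter_categories)
    counts = Counter(alter_categories)
    return 1 - sum((count / n) ** 2 for count in counts.values())


# Made-up example: ego in discipline 1 with friends in disciplines 1, 1, 2 and 3.
friends = [1, 1, 2, 3]
ei = ei_index(1, friends)
share = proportion_homophilous(1, friends)
diversity = heterogeneity(friends)
```

With two internal and two external ties the E-I index is 0 and the proportion homophilous is 0.5. Heterogeneity is 0.625; unlike the other two it does not depend on ego's own category.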

Page 81: Workshop on Basic Network Methods

81

Compositional effects

Compositional effects

Page 82: Workshop on Basic Network Methods

82

Compositional effects

Compositional effects

Page 83: Workshop on Basic Network Methods

83

Compositional effects

Compositional effects

Page 84: Workshop on Basic Network Methods

84

Summary

• The network perspective

• Navigating UCINET and NetDraw

• Data collection and data entry

• Basic concepts

• Graphs and matrices
• Simple network-level properties
• Node-level properties: centrality
• Node-level properties: clustering
• Cohesive subgroup identification
• Position identification
• Compositional effects

Workbook exercises, advanced topics, applications in the medical research community, other software packages for network analysis