ICPSR - Complex Systems Models in the Social Sciences - Lab Session 9 - Professor Daniel Martin Katz


Professor Daniel Martin Katz

Introduction to Computing for Complex Systems

(Lab Session 9)

Social Networks & the Tools of Analysis

Pajek: It’s Not Everything, But It’s a Good Start

•  Today we will begin to use Pajek (pronounced Pah-yek).

•  Pajek means spider in Slovenian.

•  It is designed to read fairly large networks.

•  Pajek allows you to:

–  Read and visualize network data

–  Edit and create networks

–  Run various node-level statistics

–  Run various graph-level statistics

More Info About Pajek

Vladimir Batagelj and Andrej Mrvar, Pajek: Program for Analysis and Visualization of Large Networks. Reference Manual, Version 1.27. Ljubljana, 2010.

http://vlado.fmf.uni-lj.si/pub/networks/pajek/doc/pajekman.pdf

Wouter de Nooy, Andrej Mrvar, and Vladimir Batagelj, Exploratory Social Network Analysis with Pajek. Cambridge University Press, 2005.

For a detailed description of Pajek’s menu bar options:

http://vlado.fmf.uni-lj.si/pub/networks/pajek/sunbelt.97/pajekman.htm

Creating Networks

•  Pajek can read your network data files.

•  Pajek can also edit networks and create random graphs (which can serve as a null case).

•  Please open Pajek on your machine.

Random Network Generation

•  Create an Erdős-Rényi random graph:

•  Net > Random Network > Erdos-Renyi > Undirected > General > …

•  How many vertices: 100

•  Average degree of vertices: 5
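To see what these two settings mean outside of Pajek, here is a minimal Python sketch. It assumes the G(n, p) form of the Erdős-Rényi model, with the edge probability p chosen so that the expected degree is 5. The function name erdos_renyi and the seed are illustrative, and Pajek's own generator may work differently.

```python
import itertools
import random


def erdos_renyi(n, avg_degree, seed=None):
    """Return an undirected G(n, p) graph as {vertex: set_of_neighbours}."""
    rng = random.Random(seed)
    # Each of the n*(n-1)/2 possible edges is included independently with
    # probability p, chosen so that the expected degree equals avg_degree.
    p = avg_degree / (n - 1)
    adj = {v: set() for v in range(1, n + 1)}  # vertices numbered from 1
    for u, v in itertools.combinations(adj, 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


graph = erdos_renyi(100, 5, seed=42)
mean_degree = sum(len(nbrs) for nbrs in graph.values()) / len(graph)
print(f"vertices: {len(graph)}, mean degree: {mean_degree:.2f}")
```

The realized mean degree will be close to 5 but rarely exactly 5, because each edge is drawn at random.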

Random Network Generation

•  The screen should now show a Report window that summarizes what Pajek has done so far.

Exploring Pajek’s Menu Options

•  Pajek keeps the networks you have used during your session in the Networks drop-down menu.

•  Partitions keep discrete categorical attributes of nodes (such as degree, party ID, etc.).

–  Partition for Republicans = 1

–  Partition for Democrats = 2

–  Partition for Independents = 0

•  Vectors keep continuous node attributes (such as centrality).

•  The Permutations, Cluster, and Hierarchy drop-down menus keep different types of clustering attributes.

Graph Visualization

•  Let’s visualize our random graph.

•  In the top menu, select Draw and then Draw again in the drop-down menu: Draw > Draw.

Energizing the Network

•  Go to Layout in the top menu of the visualization window and select an energizing algorithm:

•  Layout > Energy > Kamada-Kawai > Free

Energized Random Network

•  Go to Layout again:

•  Layout > Energy > Fruchterman > 2D

•  Layout > Energy > Fruchterman > 3D

Rotation of the Network

(1) Go to Layout again:

–  Layout > Energy > Fruchterman > 3D

(2) Spin > Spin Around

(3) Select the number of degrees to rotate (try 1080°)

The Options Menu

•  Let’s explore the Options submenu.

•  We can turn on the node labels, numbers, vector values, etc.

Same visualization, now with node labels

Node labels could be names, firms, etc.

Daniel Katz & Derek K. Stafford, Hustle and Flow: A Social Network Analysis of the American Federal Judiciary, 71 Ohio State L. J. 457 (2010)

The Options Menu

•  Let’s explore the Options submenu.

•  We can change the size of the vertices.

Old and New Node Sizes

The Options Menu

Change node and edge colors:

–  Set Nodes = Blue

–  Set Edges = Yellow

Blue and Yellow

The Options Menu

Change the background color and the font color for vertex labels:

–  Set Background = Black

–  Set Font Color = Grey

Blue and Yellow: Take 2

How Do I Make This Image Crisper?

(1) Export to .SVG

(2) Download Inkscape for post-production

http://www.inkscape.org/download/

Inkscape has lots of functions

Before Inkscape

Note: Some nodes will move, because separate realizations of the visualization algorithm lead to slightly different results.

After Inkscape

Daniel Katz, Joshua Gubler, Jon Zelner, Michael Bommarito, Eric Provins & Eitan Ingall, Reproduction of Hierarchy? A Social Network Analysis of the American Law Professoriate, 61 Journal of Legal Education 1 (2011)

Okay, Let’s Generate Some Graph-Level Stats

•  Using our random graph, we can measure the clustering coefficient of the resulting network.

•  Net > Vector > Clustering Coefficients > CC1

Double-click inside the Vectors menu to get clustering coefficients for individual nodes. Here is what it will look like (values may differ).
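For intuition about what a per-node clustering coefficient measures, here is a small Python sketch. It uses the common definition: the share of a node's neighbor pairs that are themselves connected. Treating nodes of degree 0 or 1 as having coefficient 0 is an assumption of this sketch, and Pajek may report such nodes differently. The toy network is made up.

```python
import itertools


def local_clustering(adj):
    """Map each vertex to the share of its neighbour pairs that are linked."""
    result = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            result[v] = 0.0  # convention assumed here for degree 0 or 1
            continue
        links = sum(1 for a, b in itertools.combinations(nbrs, 2) if b in adj[a])
        result[v] = 2 * links / (k * (k - 1))
    return result


# Made-up example: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
cc = local_clustering(toy)
print(cc)
print("network average:", sum(cc.values()) / len(cc))
```

Averaging the per-node values, as in the last line, is one common way to get a single graph-level number.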

Empirical Network Data

•  We have now learned how to create random graphs and visualize networks.

•  It is now time to work with a real empirical data set.

Corporate Interlocks in Scotland Dataset

•  The Scotland.net file contains a “two-mode network with 244 vertices (136 multiple directors and 108 companies), 356 edges (directorate), no arcs, no loops.”

•  http://vlado.fmf.uni-lj.si/pub/networks/data/esna/scotland.htm

Here is the two-mode network (companies & directors)

Based upon my colors: red = companies; grey = individuals

Pajek Project Files (.paj files)

•  A .paj file saves the different network component files in one full Pajek project file.

•  You can open a .paj file by going to File > Pajek Project File > Read.

•  You can also save a .paj file by going to File > Pajek Project File > Save.

Opening a .paj file: use the .paj file if you have one, as it often has more information.

.net Files

•  Often you will only have a .net file at your disposal.

•  Thus, before we do anything with the file, let’s look at what a .net file looks like.

•  Open a text editor such as WordPad or Notepad.

•  Then open the Scotland.net file from within that text editor.

•  Analyzing the first two lines:

–  Number of vertices: 244

–  Vertex x, y, z coordinates (optional): 0.0000 0.0000 0.5000

•  Note: The number of vertices listed at the top must match the number of nodes in the vertices section!

•  Later in the file is the edge list.

•  Analyzing the first three lines:

–  *Arcs represents directed edges

–  *Edges represents undirected edges

–  Meaning: The North British Railway (node #1) is connected to the Earl of Mansfield (node #109), etc.

•  Additional information after the first two numbers in *Arcs/*Edges can signify attributes of the arc/edge, such as its weight or color.

Additional Notes about .net Files

•  Either *Arcs or *Edges (or both) come immediately after the *Vertices section, without a blank line in between (do not hit ‘Enter’ an extra time).

•  If you have arcs, the *Arcs section always comes before the *Edges section, although you do not need to include an *Arcs section if you do not have any arcs.

•  .net files can be tricky … Do not use tabs, only spaces!
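The layout described above can be read with a few lines of Python. The sketch below is a minimal reader, not a full implementation of the format. It assumes every vertex line has a number followed by a quoted label, and that the third number on an arc or edge line is a weight. The function name read_net and the three-vertex sample are made up for illustration and are not taken from Scotland.net.

```python
import shlex


def read_net(text):
    """Parse the text of a small Pajek .net file into (labels, arcs, edges)."""
    labels, arcs, edges = {}, [], []
    declared, section = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            parts = line.split()
            section = parts[0].lower()  # "*vertices", "*arcs" or "*edges"
            if section == "*vertices":
                declared = int(parts[1])  # vertex count from the header line
            continue
        fields = shlex.split(line)  # keeps a quoted label such as "a b" together
        if section == "*vertices":
            labels[int(fields[0])] = fields[1]
        elif section in ("*arcs", "*edges"):
            weight = float(fields[2]) if len(fields) > 2 else 1.0
            (arcs if section == "*arcs" else edges).append(
                (int(fields[0]), int(fields[1]), weight)
            )
    if declared != len(labels):
        raise ValueError("header vertex count does not match the vertex lines")
    return labels, arcs, edges


# Made-up sample in the layout described above (spaces only, no tabs).
sample = '''*Vertices 3
1 "company A" 0.1000 0.2000 0.5000
2 "director B" 0.3000 0.4000 0.5000
3 "director C" 0.5000 0.6000 0.5000
*Edges
1 2 1
1 3 1
'''
labels, arcs, edges = read_net(sample)
print(labels)
print(edges)
```

The final check mirrors the note above: the count in the *Vertices header must match the number of vertex lines.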

Other Drop-Down Menus (.clu, .vec, .per, .cls, .hie files)

•  You may have noticed that the Scotland.zip website also mentions .paj, .vec, and .clu files. These files are created in the other menu options, which all work in a similar manner to the Networks menu option.

•  The Partitions menu saves to .clu files, the Vectors menu saves to .vec files, the Permutations menu saves to .per files, the Clusters menu saves to .cls files, and the Hierarchies menu saves to .hie files.

Now Please Load the .paj File for Scottish Board Interlocks

Now, Close Pajek and Reopen It

Your Screen Should Look Roughly Like This

Okay, Let’s Generate Some Graph-Level Stats

•  Now that we have seen what a .net file looks like, we can use Pajek to extract graph-level data from the network.

•  Degree distribution: Net > Partitions > Degree > All

Graph-Level Stats

•  We can take a look at the data by double-clicking the drop-down menu where it is listed.

•  Let’s double-click on “All Degree partition of N1 (244)”.

•  The Pajek window shows the node numbers, the degrees of the vertices, and the names of the vertices.

Analyzing Data in Outside Statistical Software

•  We can also save the partition data as a .clu file and open it inside statistical software.

•  Either click the floppy disk icon under the Partitions button, or go to File > Partition > Save.

•  After you give it a title, save the .clu file.

•  We can now open the .clu file in statistical software.

•  I will use Excel.

•  Here the degree of each node is listed, from which the degree distribution follows:

–  The first entry shows how many vertices are in the data.

–  The second entry is the number of connections the first node has.

–  The third entry is the number of connections the second node has, etc.
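The same file can be read with a short script instead of Excel. The sketch below assumes the layout just described: a first line giving the number of vertices (assumed here to look like "*Vertices 5"), followed by one integer per node. The five-node sample is made up and is not the Scotland data.

```python
from collections import Counter


def read_clu(text):
    """Return the per-vertex values from the text of a Pajek .clu file."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    declared = int(lines[0].split()[1])  # header line, e.g. "*Vertices 5"
    values = [int(ln) for ln in lines[1:]]
    if len(values) != declared:
        raise ValueError("header count does not match the number of entries")
    return values


# Made-up five-vertex degree partition.
sample_clu = "*Vertices 5\n2\n1\n3\n1\n1\n"
degrees = read_clu(sample_clu)
distribution = Counter(degrees)  # degree -> number of nodes with that degree
for degree in sorted(distribution):
    print(degree, distribution[degree])
print("mean degree:", sum(degrees) / len(degrees))
```

Counting how many nodes share each degree turns the per-node list into the degree distribution.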

Average Shortest Path

•  Net > Paths between 2 vertices > Distribution of Distances > From All Vertices

•  Then look at the Report window.

Average distance among reachable pairs: 5.60675
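A figure like "average distance among reachable pairs" can be computed by running a breadth-first search from every node and averaging the distances found. The Python sketch below does this for an unweighted, undirected network and skips unreachable pairs. The function names and the toy network are illustrative, and the sketch is not claimed to be Pajek's exact procedure.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def average_distance(adj):
    """Mean shortest-path length over pairs of distinct, mutually reachable vertices."""
    total = pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")


# Made-up example: a path 1-2-3-4 plus an isolated vertex 5.
toy = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}, 5: set()}
print(average_distance(toy))
```

The isolated vertex does not affect the result, because only reachable pairs are counted.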

Node-Level Data: Closeness Centrality

•  We can also use Pajek to calculate node-level statistics, such as various centrality measures.

•  Let’s calculate closeness centrality.

Closeness Centrality

•  Net > Vector > Centrality > Closeness > All

•  Again, we can either get the individual node data by double-clicking the drop-down menu next to Vectors or save the data to a .vec file.

•  We can also get the average and standard deviation by going to Info > Vector. (Leave blank the two windows that pop up before Pajek reports the data.)
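As a rough guide to what the closeness numbers mean, here is a Python sketch of the textbook definition: a node's closeness is (n - 1) divided by the sum of its distances to all other nodes. The sketch assumes a connected, undirected network and does not try to reproduce how Pajek treats unreachable nodes. The star network is made up.

```python
from collections import deque


def closeness(adj):
    """Closeness = (n - 1) / (sum of distances to all other vertices)."""
    n = len(adj)
    scores = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) != n:
            raise ValueError("this sketch assumes a connected network")
        scores[s] = (n - 1) / sum(dist.values()) if n > 1 else 0.0
    return scores


# Made-up star: vertex 1 in the centre, vertices 2-4 as leaves.
star = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
scores = closeness(star)
print(scores)
```

The centre of the star is one step from everyone and gets the maximum score of 1, while each leaf is two steps from the other leaves and scores lower.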

Betweenness Centrality

•  Let’s calculate betweenness centrality next.

•  Net > Vector > Centrality > Betweenness

•  Again, we can either get the individual node data by double-clicking the drop-down menu next to Vectors or save the data to a .vec file.

•  We can also get the average and standard deviation by going to Info > Vector.

Hubs & Authorities

•  Net > Vector > Important Vertices > 1-Mode: Hubs-Authorities

•  Let’s assume 10% hubs or authorities.

•  Put 24 (roughly 10% of the 244 vertices) in the two windows that pop up after selecting Hubs & Authorities.

•  Under the Vectors drop-down menu there will be information for both hubs and authorities.

•  You can double-click them or save them as .vec files.

•  We can also get the average and standard deviation for both the hub and authority measures by going to Info > Vector.
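The idea behind hub and authority scores can be sketched in Python as a repeated mutual update: good hubs point to good authorities, and good authorities are pointed to by good hubs. The version below works on a list of directed arcs. The function name, the fixed number of iterations, and the toy data are assumptions of this sketch, not a description of Pajek's implementation. For an undirected network, where each edge counts in both directions, the two scores coincide.

```python
def hubs_and_authorities(vertices, arcs, iterations=50):
    """Return (hub, authority) scores from repeated mutual reinforcement."""
    hub = {v: 1.0 for v in vertices}
    auth = {v: 1.0 for v in vertices}
    for _ in range(iterations):
        # A vertex is a good authority if good hubs point to it ...
        auth = {v: 0.0 for v in vertices}
        for source, target in arcs:
            auth[target] += hub[source]
        # ... and a good hub if it points to good authorities.
        hub = {v: 0.0 for v in vertices}
        for source, target in arcs:
            hub[source] += auth[target]
        # Rescale both score vectors to unit length so values stay bounded.
        for scores in (hub, auth):
            norm = sum(x * x for x in scores.values()) ** 0.5 or 1.0
            for v in scores:
                scores[v] /= norm
    return hub, auth


# Made-up directed example: vertices 1 and 2 both point to vertex 3.
hub, auth = hubs_and_authorities([1, 2, 3], [(1, 3), (2, 3)])
print(hub)
print(auth)
```

In this toy case vertex 3 is the only authority, and vertices 1 and 2 are equally good hubs.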

Pajek’s Built-in Export Data Tool

•  Remember: Pajek allows users to export data directly to R.

Creating a Partition

•  Many times a network will contain natural groups that fall into partitions (such as Dems vs. GOP).

•  We will once again create a random graph and then produce a random partition in order to see how Pajek visualizes partition data.

Random Network Generation

•  Create an Erdős-Rényi random graph:

•  Net > Random Network > Erdos-Renyi > Undirected > General > …

•  How many vertices: 100

•  Average degree of vertices: 5

Creating a Random Partition

•  Now we will create a random partition by going to Partition > Create Random Partition > 1-Mode.

•  Pajek will prompt us to set the dimension of the partition. Enter 100, the number of vertices in our network.

•  After that we can set how many partitions will be in the network. Select two.
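What a random partition amounts to can be sketched in a few lines of Python: each of the 100 vertices is assigned to one of two classes at random. Numbering the classes 1 and 2, drawing them with equal probability, and the "*Vertices 100" header line are assumptions of this sketch. Pajek's own numbering and sampling may differ.

```python
import random


def random_partition(dimension, clusters, seed=None):
    """Assign each of `dimension` vertices to one of `clusters` classes at random."""
    rng = random.Random(seed)
    return [rng.randint(1, clusters) for _ in range(dimension)]


partition = random_partition(100, 2, seed=7)

# Lay the result out one value per line under a header, as in a .clu file.
clu_text = "*Vertices 100\n" + "\n".join(str(c) for c in partition) + "\n"
print(clu_text.splitlines()[:5])
```

The dimension must equal the number of vertices in the network, which is why 100 is entered above.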

Editing Partition Data

•  Click on the edit button next to the Partitions drop-down menu or go to File > Partition > Edit.

•  Here you can edit your partition data. You can then save the edits to a .clu file.

Drawing a Random Partition

•  Go to Draw > Draw-Partition to visualize the partitions in the network.

Network (now with a random partition)

Network (energized, with larger node sizes)

Partitions Are Useful

•  It is possible to use outside data to segment nodes into partitions.

•  For example, party ID (see below) or another variable such as race, income, gender, etc.

•  These partitions can be saved in the .paj file.

–  Partition for Republicans = 1

–  Partition for Democrats = 2

–  Partition for Independents = 0

An Example of a Partition: The American Federal Judiciary

–  Supreme Court

–  Federal Circuit Courts (13 regional and specialty courts)

–  Federal District Courts (roughly 90 regional courts)

Lots of uses for partitions, including distinguishing between nodes

Daniel Katz & Derek K. Stafford, Hustle and Flow: A Social Network Analysis of the American Federal Judiciary, 71 Ohio State L. J. 457 (2010)

Two Mode -> One Mode

•  It is possible to use Pajek to convert a two-mode network into a one-mode network.

•  Remember our example of two-mode versus one-mode networks:

–  Two mode = movies & actors

–  One mode = actor-to-actor projection of the network
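The movies and actors example can be sketched in Python. The code below links two actors whenever they share at least one movie and counts the shared movies as an edge weight. Using that count as the weight is one common choice and an assumption here. The function name and the data are made up.

```python
from collections import defaultdict
from itertools import combinations


def project(memberships):
    """Project {movie: set_of_actors} onto actors; weight = number of shared movies."""
    weights = defaultdict(int)
    for actors in memberships.values():
        for a, b in combinations(sorted(actors), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Made-up two-mode data: movies on one side, actors on the other.
movies = {
    "Movie X": {"Ann", "Bob", "Cy"},
    "Movie Y": {"Ann", "Bob"},
    "Movie Z": {"Cy", "Dee"},
}
one_mode = project(movies)
print(one_mode)
```

The same idea applies to the Scotland data: projecting onto companies would link two companies whenever they share a director.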

Wrap Up

•  Pajek has a number of additional features that may be relevant to your specific empirical inquiry.

•  Consult these sources (as well as others) to learn more:

Wrap Up

•  Pajek is not the best tool for very large graphs or more sophisticated forms of analysis.

•  Use igraph in Python or R.
