1 Dimension matching in Facebook and LinkedIn networks Anthony Bonato Ryerson University Toronto, Canada ICMCE 2015


Dimension matching in Facebook and LinkedIn networks

Anthony Bonato

Ryerson University

Toronto, Canada

ICMCE 2015


I am very happy to be back in India.

நான் இந்தியா இருக்கும் மிகவும் சந்தோஷமாக இருக்கிறேன்.


Friendship networks

• a network of on- and off-line friends forms a large web of interconnected links


6 degrees of separation

• (Stanley Milgram, 67): famous chain letter experiment


6 Degrees in Facebook?

• 1.55 billion users

• (Backstrom et al., 2012)

– 4 degrees of separation in Facebook

– when considering another person in the world, a friend of your friend knows a friend of their friend, on average

• similar results for Twitter and other OSNs


Complex networks in the era of Big Data

• web graph, social networks, biological networks, internet networks, …


What is a complex network?

• no precise definition

• however, there is general consensus on the following observed properties

1. large scale
2. evolving over time
3. power law degree distributions
4. small world properties


Examples of complex networks

• technological/informational: web graph, router graph, AS graph, call graph, e-mail graph

• social: on-line social networks (Facebook, Twitter, LinkedIn,…), collaboration graphs, co-actor graph

• biological networks: protein interaction networks, gene regulatory networks, food networks


Properties of complex networks

1. Large scale: relative to order and size

• web graph: order > trillion

– in some sense infinite: number of strings entered into Google

• Facebook: > 1 billion nodes; Twitter: > 307 million nodes

– much denser (i.e. higher average degree) than the web graph

• protein interaction networks: order in thousands


Properties of complex networks

2. Evolving: networks change over time

• web graph: billions of nodes and links appear and disappear each day

• Facebook: grew to 1 billion users

– denser than the web graph

• protein interaction networks: order in the thousands

– evolves much more slowly


Properties of Complex Networks

3. Power law degree distribution

• for a graph G of order n and i a positive integer, let $N_{i,n}$ denote the number of nodes of degree i in G

• we say that G follows a power law degree distribution if for some range of i and some b > 2,

$N_{i,n} \approx i^{-b} n$

• b is called the exponent of the power law
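A minimal sketch of how the exponent b might be estimated from degree counts, assuming a plain least-squares fit of log N_i against log i. The function names and the synthetic counts are illustrative assumptions, not part of the talk; more robust estimators (for example maximum likelihood) are often preferred in practice.

```python
import math
from collections import Counter


def degree_counts(degrees):
    """Return {i: N_i}, the number of nodes of each positive degree i."""
    return Counter(d for d in degrees if d > 0)


def fit_power_law_exponent(counts):
    """Least-squares slope of log N_i against log i, returned as b = -slope."""
    xs = [math.log(i) for i in counts]
    ys = [math.log(c) for c in counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic counts built to follow N_i = n * i^(-b) exactly (hypothetical data).
n, b = 10_000, 2.5
synthetic = {i: n * i ** (-b) for i in range(1, 50)}
estimate = fit_power_law_exponent(synthetic)
```

On the synthetic counts the fit recovers the exponent used to generate them; on real degree sequences the tail is noisy, so the range of i used for the fit matters.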


Power laws in OSNs


Graph parameters

• average distance:

$L(G) = \binom{n}{2}^{-1} \sum_{u,v \in V(G)} d(u,v)$

• clustering coefficient:

$C(G) = n^{-1} \sum_{x \in V(G)} c(x)$, where $c(x) = |E(N(x))| \binom{\deg(x)}{2}^{-1}$
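The two parameters can be computed directly on a small graph. The sketch below assumes an undirected, connected graph stored as a dictionary of neighbour sets, and takes c(x) = 0 for nodes of degree less than 2; both conventions and the toy graph are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def average_distance(adj):
    """L(G): mean of d(u, v) over unordered pairs, assuming G is connected."""
    nodes = list(adj)
    total = 0
    for source in nodes:
        # Breadth-first search gives shortest-path distances from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    n = len(nodes)
    # Each unordered pair was counted twice (once from each endpoint).
    return (total / 2) / (n * (n - 1) / 2)


def clustering_coefficient(adj):
    """C(G): average over x of |E(N(x))| / C(deg(x), 2); c(x) = 0 if deg(x) < 2."""
    total = 0.0
    for x, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

For the toy graph, the six pairwise distances sum to 8, so L(G) = 4/3, and the local coefficients are 1, 1, 1/3 and 0, so C(G) = 7/12.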


Properties of Complex Networks

4. Small world property

• introduced by Watts & Strogatz in 1998:

– low distances

• diam(G) = O(log n)

• L(G) = O(loglog n)

– higher clustering coefficient than random graph with same expected degree


Sample data: Flickr, YouTube, LiveJournal, Orkut

• (Mislove et al,07): short average distances and high clustering coefficients


Other properties of complex networks

• many complex networks (including on-line social networks) obey two additional laws:

• Densification power law (Leskovec, Kleinberg, Faloutsos,05):

– networks are becoming more dense over time; i.e. average degree is increasing

$|E(G_t)| \approx |V(G_t)|^a$

where 1 < a ≤ 2: densification exponent
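As an illustration of the densification power law, the exponent a can be read off as the slope of log |E(G_t)| against log |V(G_t)| across snapshots. The sketch below assumes a simple least-squares fit; the function name and the snapshot values are hypothetical and were generated from a known exponent rather than taken from any real network.

```python
import math


def densification_exponent(snapshots):
    """Least-squares slope of log|E| against log|V| over (|V|, |E|) snapshots."""
    xs = [math.log(v) for v, _ in snapshots]
    ys = [math.log(e) for _, e in snapshots]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical snapshots generated from |E| = |V|^1.3, not real network data.
snapshots = [(v, v ** 1.3) for v in (1_000, 10_000, 100_000, 1_000_000)]
a = densification_exponent(snapshots)
```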


• Decreasing distances (Leskovec, Kleinberg, Faloutsos,05):

– distances (diameter and/or average distances) decrease with time

(Kumar et al,06):


Other properties

• Connected component structure: emergence of components; giant components

• Spectral properties: adjacency matrix and Laplacian matrices, spectral gap, eigenvalue distribution

• Small community phenomenon: most nodes belong to small communities (i.e. subgraphs with more internal than external links)


Blau space

• OSNs live in social space or Blau space:

– each user identified with a point in a multi-dimensional space

– coordinates correspond to socio-demographic variables/attributes

• homophily principle: the flow of information between users is a declining function of distance in Blau space


Underlying geometry

Feature space thesis: every complex network is naturally associated with an underlying feature space.

For example:

– web graph: topic space

– OSNs: Blau space

– PPIs: biochemical space


Dimensionality

• Question: What is the dimension of the Blau space of OSNs?

• what is a credible mathematical formula for the dimension of an OSN?


Six dimensions of separation


Why model complex networks?

• uncover and explain the generative mechanisms underlying complex networks

• predict the future

• nice mathematical challenges

• models can uncover the hidden reality of networks


“All models are wrong, but some are useful.” – G.E.P. Box


Random geometric graphs

• n nodes are randomly placed in the unit square

• each node has a constant sphere of influence, radius r

• nodes are joined if their Euclidean distance is at most r

• G(n,r), r = r(n)
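The construction of G(n,r) can be sketched in a few lines. This is a minimal illustration that compares all pairs directly (quadratic time), which is fine for small n; the function name, the seed parameter and the example values of n and r are assumptions.

```python
import math
import random


def random_geometric_graph(n, r, seed=None):
    """Place n points uniformly in the unit square; join pairs at distance <= r."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= r
    ]
    return points, edges


points, edges = random_geometric_graph(n=200, r=0.1, seed=1)
```

For larger n, a grid of cells of side r would avoid checking every pair, at the cost of a little bookkeeping.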


Some properties of G(n,r)

Theorem (Penrose,97) Let μ = n exp(-πr²n).

1. If μ = o(1), then asymptotically almost surely (a.a.s.) G is connected.
2. If μ = Θ(1), then a.a.s. G has a component of order Θ(n).
3. If μ → ∞, then a.a.s. G is disconnected.

• many other properties studied of G(n,r): chromatic number, clique number, Hamiltonicity, random walks, …
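To get a feel for the quantity μ in the theorem, the sketch below evaluates it and solves μ = 1 for r, which gives r = sqrt(ln n / (πn)). The helper names and the choice n = 10,000 are assumptions for illustration; the theorem itself is asymptotic, so finite-n values are only indicative.

```python
import math


def mu(n, r):
    """mu = n * exp(-pi * r^2 * n), the quantity in the theorem as stated."""
    return n * math.exp(-math.pi * r * r * n)


def radius_for_mu(n, target_mu=1.0):
    """Radius r at which mu equals target_mu, from solving the formula for r."""
    return math.sqrt(math.log(n / target_mu) / (math.pi * n))


n = 10_000
r_star = radius_for_mu(n)  # radius where mu = 1
```

Doubling this radius drives μ down to n^(-3), which is the connected regime, while halving it raises μ to n^(3/4), which is the disconnected regime.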


Spatially Preferred Attachment (SPA) model

(Aiello, Bonato, Cooper, Janssen, Prałat,08), (Cooper, Frieze, Prałat,12)

• volume of sphere of influence proportional to in-degree

• nodes are added and spheres of influence shrink over time

• a.a.s. leads to power law graphs, low directed diameter, and small separators


Ranking models

(Fortunato, Flammini, Menczer,06), (Łuczak, Prałat, 06), (Janssen, Prałat,09)

• parameter: α in (0,1)

• each node is ranked 1,2, …, n by some function r

– 1 is best, n is worst

• at each time-step, one new node is born, one node chosen at random dies (and ranking is updated)

• link probability $r^{-\alpha}$

• many ranking schemes a.a.s. lead to power law graphs: random initial ranking, degree, age, etc.


Geometric model for OSNs

• we consider a geometric model of OSNs, where

– nodes are in m-dimensional Euclidean space

– volume of spheres of influence variable: a function of ranking of nodes


Geometric Protean (GEO-P) Model

(Bonato, Janssen, Prałat, 12)

• parameters: α, β in (0,1), α+β < 1; positive integer m

• nodes live in an m-dimensional hypercube

• each node is ranked 1,2, …, n by some function r

– 1 is best, n is worst

– we use random initial ranking

• at each time-step, one new node v is born, one node chosen at random dies (and ranking is updated)

• each existing node u has a region of influence with volume $r^{-\alpha} n^{-\beta}$

• add edge uv if v is in the region of influence of u


Notes on GEO-P model

• model uses both geometry and ranking

• number of nodes is static: fixed at n

– order of OSNs at most number of people (roughly…)

• top ranked nodes have larger regions of influence


Simulation with 5000 nodes


Simulation with 5000 nodes: random geometric graph vs. GEO-P


Properties of the GEO-P model (Bonato, Janssen, Prałat, 2012)

• a.a.s. the GEO-P model generates graphs with the following properties:

– power law degree distribution with exponent $b = 1 + 1/\alpha$

– average degree $d = (1+o(1))\, n^{1-\alpha-\beta} / 2^{1-\alpha}$

• densification

– diameter $D = n^{\Theta(1/m)}$

• small world: constant order if m = C log n

– bad spectral expansion and high clustering coefficient


Dimension of OSNs

• given the order of the network n and diameter D, we can calculate m

• gives formula for dimension of OSN:

$m = \frac{\log n}{\log D}$
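The formula follows from the diameter bound D = n^{Θ(1/m)}: taking logarithms and ignoring the hidden constants gives log D ≈ (log n)/m, which rearranges to m ≈ log n / log D. A minimal sketch of the calculation, with a hypothetical function name and hypothetical input values that are not measurements of any real network:

```python
import math


def osn_dimension(n, diameter):
    """m = log n / log D, the dimension formula from the slide."""
    if n <= 1 or diameter <= 1:
        raise ValueError("need n > 1 and diameter > 1")
    return math.log(n) / math.log(diameter)


# Hypothetical values, not measurements of any real network.
m = osn_dimension(n=10**6, diameter=10)
```

With a million nodes and diameter 10 the formula gives a dimension of about 6; because the ratio does not depend on the base of the logarithm, any base can be used.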


Logarithmic Dimension Hypothesis

In an OSN of order n and diameter D, the dimension of its Blau space is

$\frac{\log n}{\log D}$

• posed independently by (Leskovec,Kim,11), (Frieze, Tsourakakis,11)


Uncovering the hidden reality

• reverse engineering approach

– given network data (n, D), dimension of an OSN gives smallest number of attributes needed to identify users

• that is, given the graph structure, we can (theoretically) recover the social space


6 Dimensions of Separation

OSN        Dimension
Facebook   7
YouTube    6
Twitter    4
Flickr     4
Cyworld    7


MITACS team, UBC 2012

L to R: Amanda Tian, David Gleich, Myunghwan Kim, Me, Stephen Young, Dieter Mitsche


MGEO-P

(Bonato, Gleich, Mitsche, Prałat, Tian, Young,14)

• time-steps in GEO-P form a computational bottleneck

• consider a GEO-P model where we forget the history of ranks

– memoryless GEO-P (MGEO-P)

• place n points u.a.r. in the hypercube

• assign ranks via a random permutation σ

• for each pair i > j, ij is an edge if j is in the ball of volume $\sigma(i)^{-\alpha} n^{-\beta}$
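The three steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes the unit hypercube with a torus (wrap-around) L-infinity metric, so that a ball of volume V is a cube of side V^(1/m), and it reads "the ball" as the ball centred at node i. The function names and example parameters are hypothetical.

```python
import random


def torus_linf_distance(p, q):
    """L-infinity distance on the unit torus (assumed metric for this sketch)."""
    return max(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(p, q))


def mgeo_p(n, m, alpha, beta, seed=None):
    """Sample an MGEO-P-style graph following the three steps on the slide."""
    rng = random.Random(seed)
    # Place n points uniformly at random in the m-dimensional unit hypercube.
    points = [[rng.random() for _ in range(m)] for _ in range(n)]
    # Assign ranks 1..n via a random permutation sigma.
    sigma = list(range(1, n + 1))
    rng.shuffle(sigma)
    # In the L-infinity metric, a ball of volume V is a cube of side V^(1/m),
    # so its radius is half of that side length.
    radius = [0.5 * (sigma[i] ** -alpha * n ** -beta) ** (1 / m) for i in range(n)]
    # For each pair i > j, add edge ij if j lies in the ball around i.
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i)
        if torus_linf_distance(points[i], points[j]) <= radius[i]
    ]
    return points, sigma, edges


points, sigma, edges = mgeo_p(n=300, m=2, alpha=0.4, beta=0.1, seed=7)
```

Because there are no time-steps, one sample costs a single pass over all pairs, which is the point of dropping the rank history.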


Contrasting the models

• by considering the evolution of ranks in GEO-P, the probability that an edge is present in GEO-P and not in MGEO-P is:

$O(n^{\ldots} \log^{\ldots} n) = o(1)$

• intuitively, the models generate similar graphs

• many a.a.s. properties hold in MGEO-P with similar parameters


Properties of the MGEO-P model (BGMPTY,14)

• a.a.s. the MGEO-P model generates graphs with the following properties:

– power law degree distribution with exponent $b = 1 + 1/\alpha$

– average degree $d = (1+o(1))\, n^{1-\alpha-\beta} / 2^{1-\alpha}$

• densification

– diameter $D = n^{\Theta(1/m)}$


Proof sketch: diameter

• eminent node:

– highly ranked: ranking greater than some fixed R

• partition hypercube into small hypercubes

• choose size of hypercubes and R so that

– each hypercube contains at least $\log^2 n$ eminent nodes

– sphere of influence of each eminent node covers each hypercube and all neighbouring hypercubes

• choose eminent node in each hypercube: backbone

• show all nodes in hypercube distance at most 2 from backbone


Back to question…

• How would we measure the dimensionality of Blau space?


Aside: machine learning

• machine learning is a branch of AI where computers make decisions and answer questions based on data sets

• examples:

– spam filters

– Netflix recommender systems

• especially useful when the data or number of decisions are too large for humans to process


Facebook100


Validating the LDH

• we tested the dimensionality of large-scale samples from real OSN data

– Facebook100 and LinkedIn (sampled over time)

• IDEA: use machine learning (SVM) to predict dimensions

– features: small subgraph counts (3- and 4-vertex subgraphs)

– compared sampled data vs simulations of MGEO-P with dimensions 1 through 12
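To make the feature idea concrete, the sketch below counts the two connected 3-vertex subgraphs (triangles and open wedges) in a small undirected graph stored as a dictionary of neighbour sets. It only illustrates what a subgraph-count feature is: the function name and toy graph are assumptions, and the 4-vertex counts and the feature extraction actually used in the experiments are not shown here.

```python
from itertools import combinations


def three_vertex_counts(adj):
    """Count triangles and open wedges (paths on 3 vertices) in a simple graph."""
    triangles_times_3 = 0
    connected_triples = 0
    for x, nbrs in adj.items():
        k = len(nbrs)
        # Every pair of neighbours of x forms a path centred at x.
        connected_triples += k * (k - 1) // 2
        # Pairs of neighbours that are themselves adjacent close a triangle.
        triangles_times_3 += sum(
            1 for u, v in combinations(nbrs, 2) if v in adj[u]
        )
    triangles = triangles_times_3 // 3          # each triangle is seen at 3 centres
    wedges = connected_triples - 3 * triangles  # keep only the open paths
    return {"triangle": triangles, "wedge": wedges}


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
features = three_vertex_counts(toy)
```

The toy graph has one triangle and two open wedges (0-2-3 and 1-2-3); a vector of such counts, computed for both the real sample and the simulated graphs, is the kind of input a classifier could compare.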


Graphlets


Experimental design


Sample: Michigan


Stanford3:
n: 11621
edges: 568330
avgdeg: 97.81086
plexp: 3.730000

GeoP parameters
alphabeta: 0.510389
alpha: 0.366300
beta: 0.144089

python geop_dim_experiment.py --logcount -s 50 -t 0 --mmax 12 --prob 0.001 Stanford3 11621 568330 0.366300 0.144089

M-GeoP dimensions:
LADTree: 2
J48: 3
Logistic: 5
SVM: 5
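The parameter values printed above appear to be consistent with inverting two of the model formulas: α = 1/(b − 1) from the power law exponent, and α + β = 1 − log d / log n from the average degree d = 2·edges/n with the constant factor ignored. The sketch below reproduces the printed numbers under that assumption; it is a reconstruction for illustration, and the function name is hypothetical rather than taken from geop_dim_experiment.py.

```python
import math


def geop_parameters(n, edges, power_law_exponent):
    """Back out alpha and beta from n, edge count and fitted power law exponent.

    Assumes b = 1 + 1/alpha and d = n^(1 - alpha - beta), i.e. the constant
    factor in the average-degree formula is ignored.
    """
    avg_degree = 2 * edges / n
    alpha = 1 / (power_law_exponent - 1)
    alpha_plus_beta = 1 - math.log(avg_degree) / math.log(n)
    return avg_degree, alpha, alpha_plus_beta - alpha


avg_degree, alpha, beta = geop_parameters(11621, 568330, 3.73)
```

With the Stanford3 inputs this gives an average degree, α and β that agree with the values printed on the slide to the displayed precision.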


FB and LinkedIn - SVM


FB and LinkedIn - Eigenvalues


Figure 6. For three of the Facebook networks, we show the eigenvalue histogram in red, the eigenvalue histogram from the best fit MGEO-P network in blue, and the eigenvalue histograms for samples from the other dimensions in grey.

Bonato A, Gleich DF, Kim M, Mitsche D, et al. (2014) Dimensionality of Social Networks Using Motifs and Eigenvalues. PLoS ONE 9(9): e106052. doi:10.1371/journal.pone.0106052

http://www.plosone.org/article/info:doi/10.1371/journal.pone.0106052


Future directions

• Other data sets

• Fractal dimension

• What are the attributes?

• What implications does LDH have for OSNs or social networks in general?


Thank you!

நன்றி!
