1
Keep Your Friends Close and Your Facebook Friends Closer: A Multiplex Network Approach to the Analysis of Offline and Online Social Ties Desislava Hristova*, Mirco Musolesi**, Cecilia Mascolo* *University of Cambridge, **University of Birmingham Text Message Phone Call Co-location Close Friends Facebook Friends No Relationship “…more strongly tied pairs make use of more of the available media, a phenomenon I have termed media multiplexity” - Caroline Haythornwaite α 1 2 3 n 1 n 2 n 3 n 1 n 2 n 3 n 1 n 2 n 3 α 1 α 2 α 3 α 1 α 2 α 3 n 1 n 2 n 3 n 1 n 2 0 10 20 30 40 0 5 10 15 20 25 Call degree Frequency 0 10 20 30 40 80 120 160 Proximity degree Frequency 0 10 20 30 40 50 0 2 4 6 8 SMS degree Frequency Degree <k> = 2 <k> = 6 <k> = 61 union all none fb soc cf Relationship None(none) Facebook(fb) Socialize(soc) Close Friends(cf) intersection all 0.0 0.2 0.4 0.6 none fb soc cf P(X=x) mw ij = M X =1 a ij M otal number of layers in th Multiplex distance: the shortest path distance between two nodes in a network where edges are weighted by their multiplexity Strong social ties are characterized by maximal use of communication channels. Weak social ties are characterized by minimal use of communication channels. A multiplex network approach helps analyze social ties and their strength. Similarity between pairs is greater at a shorter distance in the multiplex network. Online social circles can provide greater diversity of information in politics and music. Do stronger ties utilize more communication channels? Are people who utilize more communication channels more similar? Are online social circles more diverse than offline social circles? 0 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1 network distance cosine similarity 0.00 0.25 0.50 0.75 1.00 p(x|y) 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1 network distance cosine similarity 0.0 0.1 0.2 0.3 0.4 p(x|y) 0 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1 network distance cosine similarity 0.0 0.2 0.4 0.6 p(x|y) 0 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1 network distance cosine similarity 0.0 0.2 0.4 0.6 p(x|y) Music Preferences Political Orientation Situational Factors Health Factors Survey Data* *All data is available on the MIT Reality Commons webpage, kindly provided by the MIT Human Dynamics Laboratory Category Parameters Range avg Political Health Music Situational Interest in politics Political orientation Very - Not at all Liberal - Conservative Somewhat Interested Liberal Weight (lb) Height (in) Salads per week Fruits per day Aerobics per week (days) Sports per week (days) 81 - 330 lb 60 - 81 in 0 - 6 salads 0 - 7 fruits 0 - 7 days 0 - 6 days 158 lb 67 in 1.5 salads 2 fruits 2 days 1 day Indie/Alternative Rock Techno/Lounge/Electronic Heavy Metal/Hardcore Classic Rock Pop/Top 40 Hip-Hop/R&B Jazz Classical Country/Folk Show tunes Other No - High Interest Moderate Interest Moderate Interest Slight Interest Moderate Interest Moderate Interest Slight Interest Slight Interest Moderate Interest Slight Interest Slight Interest Slight Interest Year in college Residential sector Freshman - Graduate 1 - 8 Junior 2 -0.2 -0.1 0.0 0.1 0.2 health music political year/floor Δsim category health music political year/floor Graph Aggregations Multiplex Graphs Similarity & Distance Online Diversity Δsim = P sim off |N off | - P sim on |N on | Greater diversity is possible within online social circles with respect to the political and music categories. Take-aways Multiplex Network Approach n = 74 TM PC CL Keep Your Friends Close and Your Facebook Friends Closer: A Multiplex Network Approach to the Analysis of Offline and Online Social Ties. Desislava Hristova, Mirco Musolesi, Cecilia Mascolo. The International AAAI Conference on Weblogs and Social Media (ICWSM) 2014, Ann Arbor. Contact: [email protected] | www.cl.cam.ac.uk/~dh475/ | @desi_hristova

Keep Your Friends Close and Your Facebook Friends Closer · Keep Your Friends Close and Your Facebook Friends Closer: A Multiplex Network Approach to the Analysis of Offline and Online

  • Upload
    others

  • View
    12

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Keep Your Friends Close and Your Facebook Friends Closer · Keep Your Friends Close and Your Facebook Friends Closer: A Multiplex Network Approach to the Analysis of Offline and Online

Keep Your Friends Close and Your Facebook Friends Closer:A Multiplex Network Approach to the Analysis of Offline and Online Social Ties

Desislava Hristova*, Mirco Musolesi**, Cecilia Mascolo**University of Cambridge, **University of Birmingham

Text Message

Phone Call

Co-location

Close Friends

Facebook Friends

No Relationship

“…more strongly tied pairs make use of more of the available media,

a phenomenon I have termed media multiplexity” - Caroline

Haythornwaite

α

1

2

3

n1

n2

n3

n1

n2

n3

n1

n2

n3

α1 ∪ α2 ∪ α3

α1 ∩ α2 ∩ α3

n1

n2

n3

n1

n2

0

10

20

30

40

0 5 10 15 20 25Call degree

Frequency

0

10

20

30

40 80 120 160Proximity degree

Frequency

0

10

20

30

40

50

0 2 4 6 8SMS degree

Frequency

Degree

<k> = 2

<k> = 6

<k> = 61

intersection all proximity and calls proximity only union all

0.0

0.2

0.4

0.6

none fb soc cf none fb soc cf none fb soc cf none fb soc cf

P(X=x)

Relationship

None(none)

Facebook(fb)

Socialize(soc)

Close Friends(cf)

intersection all proximity and calls proximity only union all

0.0

0.2

0.4

0.6

none fb soc cf none fb soc cf none fb soc cf none fb soc cf

P(X=x)

Relationship

None(none)

Facebook(fb)

Socialize(soc)

Close Friends(cf)

(H1); If we consider the number of layers as an indicator oftie strength, we can describe the strength of a tie in terms ofa multiplex edge weight in the network as:

mw

ij

=MX

↵=1

a

ij

M

(4)

where M is the total number of layers in the multiplex net-work. For example, if two students utilize all possible chan-nels for communication (in our case M = 3), mw

ij

will beequal to 1, whereas if they will use one channel, mw

ij

willbe 1/3 . Next, we will apply this multiplex weight to assessthe relationship between multiplexity and user profile simi-larity in terms of the political, music, health and situationalfactors of the students.

H2: Multiplexity & Profile SimilarityIn order to assess whether multiplexity and profile similarityare related to one another, we compute the similarity scoresbetween students, using the cosine similarity of the vectorof attributes for each category - music, health, political, andsituational for each pair of students. These values are de-scribed in detail in Table 1. As an example, if two studentsare both somewhat interested in politics (value = 2), and oneis slightly liberal (value = 5), while the other is slightly con-servative (value = 3), their cosine similarity (sim = 0.98)would be higher than a pair of students with the same po-litical orientations but where one student is not at all inter-ested (value = 0) and the other is very interested (value =3) (sim = 0.7). Each category has a different number andrange of attributes, and the similarity scores vary accord-ingly, however the magnitude is consistent and allows us toperform graph correlation analysis, as described next.

To find the relationship between the multiplexity of a tie(mw

ij

), and its similarity, based on the political, health,music preferences and situational factors described in theDataset section, we use a graph correlation coefficient. It isderived from the comparison of the adjacency matrix of themultiplex graph (constructed from the multiplex weights onthe edges), and the pairwise similarity score matrix, using astandard matrix correlation coefficient function:

C

i

=

nPj2Ni

w

a

i,j

⇤ wb

i,j

snP

j2Ni

w

a

i,j

⇤nP

j2Ni

w

b

i,j

(5)

for the correlation coefficient per node with normalizedweights, and the correlation between two matrices as:

C(Aa

, A

b) =

nPi

C

i

N

(6)

where N is the number of nodes in the correspondent ma-trices.

The results of the correlations are presented in Table 3,where we can see that there is a significant positive relation-ship between the multiplex edge weight and the similarity

(a) political (b) health

(c) music (d) residence area/year

Figure 4: Spearman degree rank correlations (⇢ betweensimilarity and multiplex networks. (a) political ⇢ = 0.9 (b)health ⇢ = 0.85 (c) music ⇢ = 0.9 (d) residence area/year⇢ = 0.7

across categories. The highest correlations are with the po-litical and health factors (r = 0.6 for both), signifying thatthese are most closely related to tie strength but comparablyso with situational factors(r = 0.56), and music (r = 0.49).maybe clarify type of correlation?

Table 3: Graph correlations between multiplex weight andeach similarity score, p-value < 0.001 **, 0.01 *

political music health situational0.6** 0.49* 0.6** 0.56*

Now that we know that edges with high similarity also ex-hibit high multiplexity, we consider whether this holds trueon a network level. We measure the degree of each node interms of similarity and multiplexity and report the Spear-man degree rank correlation in Fig. 4. We observe that thosenodes with high similarity degree, also have a high multi-plexity degree. This signifies that students who are more so-cial are also more similar to their neighbours, and that pop-ular students are more similar to other students, than lesspopular ones.

We are able to confirm that the greater the multiplex-ity, the greater the similarity observed across factors (H2),both on the individual edge and neighbourhood levels. Ho-mophily effects, which are exhibited on the network level,are explored in the following section.

Multiplex distance: the shortest path distance

between two nodesin a network where edges

are weighted by their multiplexity

☞ Strong social ties are characterized by maximal

use of communication channels.

☞ Weak social ties are characterized by minimal

use of communication channels.

☞ A multiplex network approach helps analyze

social ties and their strength.

☞ Similarity between pairs is greater at a shorter

distance in the multiplex network.

☞ Online social circles can provide greater

diversity of information in politics and music.

Do stronger ties utilize more

communication channels?

Are people who utilize more

communication channels more

similar?

Are online social circles more diverse

than offline social circles?

0

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance

cosi

ne s

imila

rity

0.00

0.25

0.50

0.75

1.00p(x|y)

00.10.20.30.40.50.60.70.80.91

0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance

cosi

ne s

imila

rity

0.0

0.1

0.2

0.3

0.4p(x|y)

00.20.30.40.50.60.70.80.91

0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance

cosi

ne s

imila

rity

0.0

0.2

0.4

0.6

p(x|y)

0

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance

cosi

ne s

imila

rity

0.0

0.2

0.4

0.6

p(x|y)

Music PreferencesPolitical Orientation

Situational Factors Health Factors

Survey Data**All data is available on the MIT Reality Commons webpage, kindly provided by

the MIT Human Dynamics Laboratory

Category Parameters Range avg

Political

Health

Music

Situational

Interest in politicsPolitical orientation

Very - Not at allLiberal - Conservative

Somewhat InterestedLiberal

Weight (lb)Height (in)Salads per weekFruits per dayAerobics per week (days)Sports per week (days)

81 - 330 lb60 - 81 in0 - 6 salads0 - 7 fruits0 - 7 days0 - 6 days

158 lb67 in1.5 salads2 fruits2 days1 day

Indie/Alternative RockTechno/Lounge/ElectronicHeavy Metal/HardcoreClassic RockPop/Top 40Hip-Hop/R&BJazzClassicalCountry/FolkShow tunesOther

No - High Interest

Moderate InterestModerate InterestSlight InterestModerate InterestModerate InterestSlight InterestSlight InterestModerate InterestSlight InterestSlight InterestSlight Interest

Year in collegeResidential sector

Freshman - Graduate1 - 8

Junior2

-0.2

-0.1

0.0

0.1

0.2

health music political year/floor

Δsim

category

health

music

political

year/floor

Graph Aggregations

Multiplex Graphs

Similarity & Distance

Online Diversity

The first two figures show an overall homogeneity interms of political and health factors. At a distance of 1 (non-connected nodes), the probability of high similarity is stillhigh, indicating that two nodes with high multiplexity andhigh similarity in these categories could be connected at ran-dom. High homogeneity can be expected in the study, giventhat students are co-residing and share the same context.

On the other hand, homophily exists where there is non-random similarity between individuals with shorter multi-plex distance. This is most evident in subfigure 5c - musicpreferences, where there is a clear probability increase fromtop left to low right as distance increases. This means thatthose pairs who are closest in the network, are also highlylikely to have a similar taste in music (sim = 0.9, wheredist = 0), whereas those pairs who are further from eachother in the network have a lower similarity in music taste.For unconnected pairs, the similarity is especially low (near0), indicating that edges in the network with respect to mu-sic are non-random and highly dependent on musical prefer-ences.

Fig. 5d on the other hand shows an interesting divide be-tween low and high similarities. Most pairs are grouped intovery high or very low similarity, and appear at the top rowand bottom row of the graph. Therefore, students tend to beeither in the same year and floor or in different years anddifferent floors, which may be as a result of room allocationaccording to year. There is a high chance (P = 0.7) thatthose who live and study together (sim = 1), also have ahighly multiplex tie, represented by the distinguishable topleft tile at position (0,1), and less so for those who live fur-ther apart and/or are in a different year.

Despite the community being highly homogeneous in po-litical and health aspects, we find that the shorter the dis-tance in the multiplex network, the greater the similaritybetween two nodes (H3) in the music and situational cate-gories, indicative of homophily. The decrease of diversity ofresources such as information is a potentially harmful effectof homophily and homogeneity in a community. We there-fore conclude our analysis by exploring the role of onlineand offline social circles in diversifying information in thestudent community in the following section.

H4: Diversity in Offline & Online Social CirclesSo far, we have identified homogeneity and homophilywithin the student community. Especially during the forma-tive university years, it is important to have access to diverseinformation and opinions. In this section we assess the valueof social media such as Facebook, in introducing informa-tion diversity across the examined categories in this work bycomparing students’ offline social circles with their onlineones.

We define diversity as the opposite of homogeneity, andmeasure the � difference (change) in similarity between theonline and offline neighborhoods of each node. The onlineneighborhood of a node includes all nodes to which it isconnected where there is any declared social relationship,since we know that all relationships are a subset of Face-book (CF ⇢ SC ⇢ FB). The offline neighborhood in-cludes all other connections to nodes in the graph, where

00.20.30.40.50.60.70.80.91

0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance

cosi

ne s

imila

rity

0.0

0.2

0.4

0.6

p(x|y)

(a) political

0

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance

cosi

ne s

imila

rity

0.00

0.25

0.50

0.75

1.00p(x|y)

(b) health

00.10.20.30.40.50.60.70.80.91

0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance

cosi

ne s

imila

rity

0.0

0.1

0.2

0.3

0.4p(x|y)

(c) music

0

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance

cosi

ne s

imila

rity

0.0

0.2

0.4

0.6

p(x|y)

(d) situational (year/floor)

Figure 5: Conditional probability of similarity given a certain net-work distance (P (x|y)) in terms of multiplex weighted shortestpath. We consider the entire multiplex networks, where the maxi-mum distance was 0.9, and 1 is where no path exists between twonodes.

there is no stated social relationship in the surveys. To ob-tain the � change between the online and offline similarity,we subtract the average online similarity of an ego from theaverage offline similarity. Formally:

�sim =

Psimoff

|Noff |�

Psim

on

|Non

| (7)

where simoff and sim

on

are the similarity values betweenthe offline and online contacts of a given student, and Noffand N

on

are his offline and online neighborhoods.In Fig. 6, values above 0 indicate that there is greater on-

line diversity than offline, whereas values below zero indi-cate that there is a lower diversity online than offline. Theboxplot can be read as the distribution of �sim, where thevalues are divided into quartiles. The top and bottom contin-uous lines (or whiskers) represent the top and bottom quar-tiles, whereas the values inside the box, below and above thethick line representing the median, are the upper and lowermid-quartiles.

We can see immediately that the situational (year/floor)category online is largely less diverse than offline. In otherwords, students are more often Facebook friends with otherstudents in their year and residential sector but communicateoffline with students from other residential sectors and/oryears. Interestingly, we find that in terms of political orienta-tion, Facebook friendship between students tends to be morevaried than offline, with more than 50% of values above

Greater diversity is possible within online social circleswith respect to the political and music categories.

Take-aways

Multiplex Network Approach

n = 74TM ⊂ PC ⊂ CL

Keep Your Friends Close and Your Facebook Friends Closer: A Multiplex Network Approach to the Analysis of Offline and Online Social Ties. Desislava Hristova, Mirco Musolesi, Cecilia Mascolo. The International AAAI Conference on Weblogs and Social Media (ICWSM) 2014, Ann Arbor.

Contact: [email protected] | www.cl.cam.ac.uk/~dh475/ | @desi_hristova