Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Structural and geographic properties ofonline social interactions
Yana Volkovich
Barcelona Media - Innovation Center
in collaboration withA. Kaltenbrunner, D. Laniado, C. Mascolo, and S. Scellato
Yana Volkovich (Barcelona Media) Trento, 2012 1 / 40
References
Y. Volkovich, S. Scellato, D. Laniado, C. Mascolo, and A.Kaltenbrunner;“The length of bridge ties: structural and geographicproperties of online social interactions”ICWSM-12 (International AAAI Conference on Weblogs andSocial Media)A. Kaltenbrunner, S. Scellato, Y. Volkovich, D. Laniado, D. Currie,E. J. Jutemar, and C. Mascolo;“Far from the eyes, close on the Web: impact of geographicdistance on online social interactions”;WOSN ’12 (ACM SIGCOMM Workshop on Online SocialNetworks)
Yana Volkovich (Barcelona Media) Trento, 2012 2 / 40
Introductionsocial graph
online social connections:explicit (articulated)e.g. friendship connectionsimplicit (behavioural)e.g. interactions
Yana Volkovich (Barcelona Media) Trento, 2012 3 / 40
Motivationsocial graph: nodes and edges
social graph: nodes and edgesconnections could be more informative than nodesdifferent types of social connectionsdifferent ways to characterize social connections
Yana Volkovich (Barcelona Media) Trento, 2012 4 / 40
Motivationsocial connections
different ways to characterize social connectionsinteraction strengthspatial distancestructural position in a social graph
Yana Volkovich (Barcelona Media) Trento, 2012 5 / 40
Tuenti datasetTuenti dataset
Dataset
Yana Volkovich (Barcelona Media) Trento, 2012 6 / 40
TuentiTuenti website
Tuenti is the “Spanish Facebook”a Spain-based, invitation-only social networking website
Yana Volkovich (Barcelona Media) Trento, 2012 7 / 40
TuentiTuenti website
Yana Volkovich (Barcelona Media) Trento, 2012 8 / 40
TuentiDataset
Tuenti dataset:
by Dec. 11, 2010;9.88 million registered users (anonymous profiles);more than 1 174 million friendship links;500 million messages exchanged during 3 months;
Yana Volkovich (Barcelona Media) Trento, 2012 9 / 40
TuentiDemographics: age pyramid
age pyramid
Yana Volkovich (Barcelona Media) Trento, 2012 10 / 40
TuentiDemographics: age pyramid
by gender
50.6% female;
49.4% male.
by age (average)
female: 22 years;
male: 28 years.
Tuenti users are very young45% of users are between 14 and 20 years;37.5% of users are between 21 and 30 years.1.35 more teenagers than official population (due to Tuenti signingrequirements).
Yana Volkovich (Barcelona Media) Trento, 2012 11 / 40
Social connectionsimplicit vs. explicit connections
implicit vs. explicit social connectionsDunbar’s number: an alleged theoretical cognitive limit to thenumber of people with whom one can maintain stable socialrelationshipaverage fraction of friends and the average absolute number offriends a user interacts with as a function of the number of friends
0 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 10000
0.0250.05
0.0750.1
0.1250.15
0.1750.2
# friends
frac
tion
of
activ
e fr
iend
s
0 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 10000
255075
100125150
# friends
# a
ctiv
e fr
iend
s
in−degreeout−degree
in−degreeout−degree
Yana Volkovich (Barcelona Media) Trento, 2012 12 / 40
Social connectionsSocial connections
Characteristics for social connections
Yana Volkovich (Barcelona Media) Trento, 2012 13 / 40
Social connectionsspatial distance, related work
social ties and spatial distances:individuals try to minimize the efforts to maintain a friendship byinteracting more with their spatial neighborsprobability of a social interaction quickly decays as an inversepower of the relative geographic distance (Stewart [1941])
Yana Volkovich (Barcelona Media) Trento, 2012 14 / 40
Social connectionsspatial distance, related work
online tools and long-distance travel might result in the ‘death ofdistance’probability of social connection between two individuals on onlinesocial networking services still decreases with their geographicdistance (Backstrom et al. [2010], Liben-Nowell et al. [2005]).
Yana Volkovich (Barcelona Media) Trento, 2012 15 / 40
Social connectionsspatial distance
spatial distancedi ,j is the geographic distance between the cities of residence ofuser i and user j ;di ,j = 0 if users report the same city of residenceaverage geographic distances between users < D > is about oneorder of magnitude larger than the average geographic distancebetween friends < l >
average geographic distance between nodes, km 531.2average link length, km 79.9
Yana Volkovich (Barcelona Media) Trento, 2012 16 / 40
Social connectionsspatial distance
spatially closer users are much more likely to engage in a socialconnection (e.g. become friends)about 50% of social links between users at a distance of 10 km orless
distance in km
% o
f frie
ndsh
ips,
inte
ract
ions
% of contacts at distance greater than x km
100
101
102
103
10
20
30
40
50
60
70
80
90
100
wall interactionsfriendshipspotential friendships
Yana Volkovich (Barcelona Media) Trento, 2012 17 / 40
Social connectionsinteraction strength
interaction strengthclose friends or just acquaintancesquantitative estimation of a how much an online connection bindstwo users together
Yana Volkovich (Barcelona Media) Trento, 2012 18 / 40
Social connectionsInteraction strength
interaction strengthwi ,j is the number of messages user i posted on the wall of user j ;wi ,j = 0 if user i has never left a message on user j ’s wall;
balanced interaction weight:
Yana Volkovich (Barcelona Media) Trento, 2012 19 / 40
Social connectionsInteraction strength (log-log)
since non-reciprocated interactions may indicate spam:the minimum of the interaction weights to emphasize reciprocatedinteractions;for the non-reciprocated interactions we only add 1/2 no matterthe difference in the numbers of messages exchanged.
100
101
102
103
10−9
10−8
10−7
10−6
10−5
10−4
10−3
10−2
10−1
100
balanced interaction weight
frac
tion
of c
onne
ctio
ns
distribution of the balanced interaction weight
Yana Volkovich (Barcelona Media) Trento, 2012 20 / 40
Social connectionsstructural properties
weak ties are more likely to connect together otherwise separatedportions of a network, playing an important role in informationdiffusion and resilience to network damage (Granovetter [1973])some social ties closing “structural holes” can be more powerful ormore innovative (Burt [1992])
Bakshy [2012]
100
101
102
103
10−9
10−8
10−7
10−6
10−5
10−4
10−3
10−2
10−1
100
balanced interaction weight
frac
tion
of c
onne
ctio
ns
distribution of the balanced interaction weight
Yana Volkovich (Barcelona Media) Trento, 2012 21 / 40
Social connectionsStructural properties:social overlap
structural properties:local position: social overlap;social overlap of an edge ei ,j as oi ,j = |Γi ∩Γj |, where Γi is the setof users connected to user i
Yana Volkovich (Barcelona Media) Trento, 2012 22 / 40
Social connectionsStructural properties:k-index of a node
structural properties:global position: k-index;k -core is the maximal subgraph in which each node is connectedto at least k other nodes of the subgraphk -index of a node is v if it belongs to the v -core but not to the(v + 1)-corek -index has been found to be an indicator of influential nodeswithin a social network (Kitsak et al. [2010])
k=1
k=3
k=2
central core/ smaller core in between/ peripheryYana Volkovich (Barcelona Media) Trento, 2012 23 / 40
Social connectionsStructural properties:k-index of an edge
k-index kij of an edge is the minimum of the k -indexes of twoendpointswe distinguish if an edge connects nodes inside a network core orlinks to a node in the periphery
0 20 40 60 80 100 120 140 160 18075
85
95
105
115
125
135
145
155
165
175180
average max k−index vs edge k−index
edge k−index
aver
age
max
k−
inde
x
Yana Volkovich (Barcelona Media) Trento, 2012 24 / 40
Combined analysisCombined analysis
Combined analysis of social connections
Yana Volkovich (Barcelona Media) Trento, 2012 25 / 40
Combined analysisCombined analysis of social connections
social connections
Yana Volkovich (Barcelona Media) Trento, 2012 26 / 40
Combined analysisSocial overlap vs. k-index
social overlap and k -index allow network scenarios where links mayhave high k -index and low overlap, or the other way round
Yana Volkovich (Barcelona Media) Trento, 2012 27 / 40
Combined analysisSocial overlap vs. k-index
social overlap ↑ ⇒ k -index grows quicklyk -index ↑ ⇒ the average social overlap grows slowlythere are inner cores where users are tightly connected to eachotherother parts of the network include more isolated users that tend tonot belong to any community
100
101
102
103
80
90
100
110
120
130
140
150
160average k−index vs. social overlap
social overlap
aver
age
k−in
dex
0 20 40 60 80 100 120 140 160 1800
20
40
60
80
100
120
140
160
180
200
220social overlap vs. k−index
k−index
soci
al o
verla
p
Yana Volkovich (Barcelona Media) Trento, 2012 28 / 40
Combined analysisDistance vs. social overlap
the geographic distance between two connected users decreasesas they share more and more friendssocial connections which span less than 60-80 km exhibit highervalues of social overlap
100
101
102
103
0
20
40
60
80
100
120
140
160
180
200
220average distance vs. social overlap
social overlap
aver
age
dist
ance
101
102
103
5
10
15
20
25
30
35
40average social overlap vs. distance
distance
aver
age
soci
al o
verla
p
Yana Volkovich (Barcelona Media) Trento, 2012 29 / 40
Combined analysisDistance vs. k -index
the average spatial length of social links decreases as theirk -index increasessocial links inside the core tend to be shorter than the onesreaching the periphery of the social network
0 20 40 60 80 100 120 140 160 1800
20
40
60
80
100
120
140
160
180average distance vs. k−index
k−index
aver
age
dist
ance
101
102
103
85
90
95
100
105
110
115
120average k−index vs. distance
distance
aver
age
k−in
dex
Yana Volkovich (Barcelona Media) Trento, 2012 30 / 40
Combined analysisDistance vs. k -index
kmax -core
Yana Volkovich (Barcelona Media) Trento, 2012 31 / 40
Combined analysisDistance vs. k -index
kmax -core
Benidorm
Almería
Ojos-Albos
Eivissa
Arenys de Mar
Barcelona
Santa Eulàlia de Ronçana
Jerez de la Frontera
Trebujena
Coruña
Granada
Errezil
Huelva
Jaén
Madrid
Ronda
Pamplona
Las Palmas de GC
Abusejo
Salamanca
Adeje
Arahal
Dos Hermanas
Lebrija
Sevilla
Cuervo de Sevilla
Valencia
Bilbao
Pego
Zaragoza
Yana Volkovich (Barcelona Media) Trento, 2012 32 / 40
Combined analysisDistance vs. interaction weight
the amount of interaction is uncorrelated to spatial distancenote that the likelihood that two individuals are connected isheavily dependent on distance
1
1.5
2
2.5
3
3.5# interactions as a function of distance
distance
# in
tera
ctio
ns
0 10 20 30 40 50 60 70 80 90 100
200
300
400
500
600
700
800
900
1000
1100
1200
1300
1400
1500
1600
1700
1800
1900
2000
2100
2200
2300
2400
2500
60
70
80
90
100distance as a function of # interactions
dist
ance
# interactions
1 2 3 4 5 6 7 8 9 10 20 30 40 50 60 70 80 90 100
200
300
400
500
600
700
800
900
1000
Yana Volkovich (Barcelona Media) Trento, 2012 33 / 40
Combined analysisSocial overlap vs. interaction
the impact of social overlap remains fairly constantthe interaction weight only slowly increases the social overlapgrowsthe extremely high levels of interaction mainly take place betweenusers with several shared friends, which are likely to be in thenetwork core
100
101
102
103
0
10
20
30
40
50
60
70average iteraction weight vs. social overlap
social overlap
aver
age
inte
ract
ion
wei
ght
100
101
102
103
0
50
100
150
200
250
300
350average social overlap vs. interaction weight
interaction weight
aver
age
soci
al o
verla
p
Yana Volkovich (Barcelona Media) Trento, 2012 34 / 40
Combined analysisk -index vs. interaction weight
ties in the inner cores have the highest levels of interactioninteraction weights are almost equally high for social ties with lowk -indexsocial ties with intermediate k -index, likely to bridge togetherdifferent portions of the network, experience the lowest interactionlevels
0 20 40 60 80 100 120 140 160 1801.5
2
2.5
3
3.5
4
4.5
5
5.5
6average iteraction weight vs. k−index
k−index
aver
age
inte
ract
ion
wei
ght
Yana Volkovich (Barcelona Media) Trento, 2012 35 / 40
ConclusionsConclusions
Conclusions
Yana Volkovich (Barcelona Media) Trento, 2012 36 / 40
Conclusions
social connections between users inside the core tend to haveshorter geographic spans than connections stretching outside thecoresocial ties outside the core tend to be much longer than the otherlinks: the length of these bridge ties is thus creating not onlynetwork shortcuts, but also spatial shortcutsthe amount of interactions appears independent of spatialdistanceinteraction levels appear higher inside well-connected cores andon links connecting to the fringe of the networkedges could be more informative than nodes
Yana Volkovich (Barcelona Media) Trento, 2012 37 / 40
QuestionsQuestions
Yana Volkovich (Barcelona Media) Trento, 2012 38 / 40
Bibliography I
L. Backstrom, E. Sun, and C. Marlow. Find me if you can: improvinggeographical prediction with social and spatial proximity. InProceedings of WWW 2010, Raleigh, North Carolina, USA, 2010.
E. Bakshy. Rethinking information diversity in networks, 2012. URLwww.facebook.com/notes/facebook-data-team/rethinking-information-diversity-in-networks/10150503499618859.
R. S. Burt. Structural holes: The social structure of competition.Harvard University Press, Cambridge, MA, 1992.
M. S. Granovetter. The strength of weak ties. The American Journal ofSociology, 78(6):1360–1380, 1973. doi: 10.2307/2776392.
M. Kitsak, L. K. Gallos, S. Havlin, F. Liljeros, L. Muchnik, H. E. Stanley,and H. A. Makse. Identification of influential spreaders in complexnetworks. Nature Physics, 6(11):888–893, Nov. 2010. URLhttp://dx.doi.org/10.1038/nphys1746.
Yana Volkovich (Barcelona Media) Trento, 2012 39 / 40
Bibliography II
D. Liben-Nowell, J. Novak, R. Kumar, P. Raghavan, and A. Tomkins.Geographic routing in social networks. PNAS, 102(33):11623–11628, Aug. 2005.
J. Q. Stewart. An inverse distance variation for certain socialinfluences. 93(2404):89–90, 1941.
Yana Volkovich (Barcelona Media) Trento, 2012 40 / 40