This article was downloaded by: [155.41.121.120] On: 13 May 2014, At: 06:35 Publisher: Institute for Operations Research and the Management Sciences (INFORMS) INFORMS is located in Maryland, USA Management Science Publication details, including instructions for authors and subscription information: http://pubsonline.informs.org Tie Strength, Embeddedness, and Social Influence: A Large-Scale Networked Experiment Sinan Aral, Dylan Walker To cite this article: Sinan Aral, Dylan Walker (2014) Tie Strength, Embeddedness, and Social Influence: A Large-Scale Networked Experiment. Management Science Published online in Articles in Advance 21 Apr 2014 . http://dx.doi.org/10.1287/mnsc.2014.1936 Full terms and conditions of use: http://pubsonline.informs.org/page/terms-and-conditions This article may be used only for the purposes of research, teaching, and/or private study. Commercial use or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher approval. For more information, contact [email protected]. The Publisher does not warrant or guarantee the article’s accuracy, completeness, merchantability, fitness for a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, or inclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, or support of claims made of that product, publication, or service. Copyright © 2014, INFORMS Please scroll down for article—it is on subsequent pages INFORMS is the largest professional society in the world for professionals in the fields of operations research, management science, and analytics. For more information on INFORMS, its publications, membership, or meetings visit http://www.informs.org


MANAGEMENT SCIENCE
Articles in Advance, pp. 1–19
ISSN 0025-1909 (print) | ISSN 1526-5501 (online) http://dx.doi.org/10.1287/mnsc.2014.1936

© 2014 INFORMS

Tie Strength, Embeddedness, and Social Influence: A Large-Scale Networked Experiment

Sinan Aral
MIT Sloan School of Management, Cambridge, Massachusetts 02142, [email protected]

Dylan Walker
School of Management, Boston University, Boston, Massachusetts 02215, [email protected]

We leverage the newly emerging business analytical capability to rapidly deploy and iterate large-scale, microlevel, in vivo randomized experiments to understand how social influence in networks impacts consumer demand. Understanding peer influence is critical to estimating product demand and diffusion, creating effective viral marketing, and designing “network interventions” to promote positive social change. But several statistical challenges make it difficult to econometrically identify peer influence in networks. Though some recent studies use experiments to identify influence, they have not investigated the social or structural conditions under which influence is strongest. By randomly manipulating messages sent by adopters of a Facebook application to their 1.3 million peers, we identify the moderating effect of tie strength and structural embeddedness on the strength of peer influence. We find that both embeddedness and tie strength increase influence. However, the amount of physical interaction between friends, measured by coappearance in photos, does not have an effect. This work presents some of the first large-scale in vivo experimental evidence investigating the social and structural moderators of peer influence in networks. The methods and results could enable more effective marketing strategies and social policy built around a new understanding of how social structure and peer influence spread behaviors in society.

Keywords: peer influence; social contagion; social networks; viral marketing; information systems; randomized experiment

History: Received September 17, 2012; accepted January 12, 2014, by Alok Gupta, special issue on business analytics. Published online in Articles in Advance.

1. Introduction

The newly emerging capability to rapidly deploy and iterate microlevel, in vivo randomized experiments in complex social and economic settings at population scale is, in our view, one of the most significant innovations in modern business analytics and what separates today’s analytics from the analytical processes of the last several decades. As more and more social interactions and commercial transactions are digitized and mediated by online platforms, our ability to answer nuanced causal questions about the role of social behavior in population-level outcomes such as health, voting, political mobilization, consumer demand, information sharing, product rating, and opinion aggregation is becoming unprecedented. This new tool kit portends a sea change in our scientific understanding of human behavior and dramatic improvements in business and social policy as a result. We leverage this new tool kit to better understand one of the most significant recent developments in consumer behavior: how social influence in networks impacts consumer demand.

Social influence in networks is recognized as a key factor in the propagation of ideas, behaviors, and economic outcomes in society. Understanding the role social influence plays in spreading economic behaviors is therefore critical to constructing sound policy in both the public and private sectors. Emerging online systems, which increasingly connect people and mediate their interactions, now provide opportunities to acquire microlevel data at population scale (e.g., Eagle et al. 2010, Golder and Macy 2011) and to conduct experiments that address endogeneity (e.g., Aral and Walker 2011a, Bakshy et al. 2012a). These two advantages can be leveraged to create new experimental analytical methods that yield real-time, context-specific inferences about the role of social influence in consumer demand, marketing, and public policy. Moreover, large-scale data from randomized experiments permit the detection of nuanced or subtle effects that are economically important but difficult to observe with observational analytics. It is precisely these nuanced effects that are in danger of being eclipsed by bias in endogenous processes, making randomized experimentation in large-scale social systems a vital new tool in the arsenal of modern business analytics.


A primary question in understanding the role of social influence in the diffusion of new products, ideas, behaviors, and outcomes is how heterogeneity in the relationships between individuals impacts the influence they exert on one another. Despite decades of observational research, results in this domain remain inconsistent and elusive. This is perhaps unsurprising since statistical challenges like simultaneity (Godes and Mayzlin 2004), homophily (Aral et al. 2009), unobserved heterogeneity (Van den Bulte and Lilien 2001), time-varying factors (Van den Bulte and Lilien 2001), and other contextual and correlated effects (Manski 1993) make it difficult to distinguish causal peer influence from other confounds that create behavioral clustering in network space and time.

Research has employed instrumental variables methods (e.g., Tucker 2008) and high-dimensional propensity score matching (HDPSM) (Aral et al. 2009) to identify peer influence and distinguish it from homophily and other confounding factors in observational data. HDPSM has recently been shown to reduce bias in causal estimates of peer influence by up to 75% (Eckles and Bakshy 2014), which is useful since most available data in this domain are currently observational. Nonetheless, controlling for unobservable factors like latent homophily remains difficult (Shalizi and Thomas 2011).

As an alternative to observational analysis, experimental network studies using random assignment can provide a more robust means of identifying causal peer effects and distinguishing influence from confounding factors. Some recent experiments have demonstrated a role for peer influence in product adoption (Aral and Walker 2011a, 2012; Bakshy et al. 2012b; Bapna and Umyarov 2012), health behaviors (Centola 2010), and altruism (Leider et al. 2009). Though these studies use experiments to address confounds and identify peer influence in different network contexts, they have not investigated in vivo the social or structural conditions under which influence is strongest, an area identified as a critical new frontier in the science of social influence (Aral 2011).¹ Two of the most widely studied social factors theorized to affect the strength of social influence are embeddedness, the extent to which individuals share common peers, and tie strength, the significance or intensity of relationships. We therefore investigate how embeddedness and tie strength moderate social influence in product adoption, while simultaneously controlling for confounding factors that can bias inference in networked settings.

¹ Two notable exceptions to this are the studies by Bond et al. (2012) and Bakshy et al. (2012a), who examined the moderating role of tie strength (defined by interaction volume) on influence in voting behaviors and advertising, respectively. In this work, we go further by simultaneously considering multiple different categories of tie strength.

We conducted a randomized experiment measuring social influence in the adoption of a commercial application among 1.3 million users of the online social network Facebook.com. Using novel techniques of randomized experimentation in networked environments and statistical analysis, we identify and distinguish influence-driven outcomes from spontaneous outcomes and examine the roles social embeddedness and tie strength play in the level of influence exerted between peers. Theoretically, we extend the definition of tie strength beyond the frequency of communication to examine several precisely defined measures of the strength of ties (SoT) that describe the nature of the relationship between individuals and their peers in a concrete manner, including (a) the social context of the relationship (how individuals met, know one another, or interact with each other, e.g., whether peers attended the same college, come from the same hometown, or share common institutional affiliations), (b) the recency of the relationship (e.g., whether peers currently live in the same town), (c) the overlap of common interests (e.g., being fans of the same Facebook pages or joining the same Facebook groups), and (d) the frequency of physical interaction (e.g., copresence in online photos).
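To make the dyad-level measures above concrete, the following is a minimal sketch, assuming that embeddedness is operationalized as the count of friends two users have in common and that interest overlap is the count of shared pages. The toy data, the dictionary layout, and the function names `embeddedness` and `shared_interests` are hypothetical illustrations and are not taken from the study's data or code.

```python
# Illustrative sketch only: toy data and function names are assumptions,
# not the measures or code used in the experiment.

def embeddedness(friends, u, v):
    """Count the friends that users u and v have in common."""
    return len(friends[u] & friends[v])


def shared_interests(pages, u, v):
    """Count the pages that both u and v are fans of."""
    return len(pages[u] & pages[v])


# Hypothetical friendship lists (undirected, stored on both endpoints).
friends = {
    "ana": {"ben", "cy", "dee"},
    "ben": {"ana", "cy", "dee"},
    "cy": {"ana", "ben"},
    "dee": {"ana", "ben"},
}

# Hypothetical page memberships.
pages = {
    "ana": {"jazz", "chess", "hiking"},
    "ben": {"jazz", "hiking"},
    "cy": {"chess"},
    "dee": set(),
}

print(embeddedness(friends, "ana", "ben"))    # 2 (cy and dee)
print(shared_interests(pages, "ana", "ben"))  # 2 (jazz and hiking)
```

Set intersection keeps each measure symmetric in the two users, which matches the idea that these are properties of the relationship rather than of either individual.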

Our study complements and extends prior work on the role that individual attributes play in social influence processes. Prior research has demonstrated that, in spreading processes, not all individuals are created equal (by analyzing how individuals’ attributes moderate diffusion; Aral and Walker 2012). Our work extends this line of inquiry and demonstrates that not all relationships are created equal (by theorizing and analyzing the role of relationship characteristics in moderating diffusion). Our approach can be generalized and extended to a multitude of systems of interest to researchers in economics, sociology, marketing, management, information systems, and other quantitative social sciences in which large-scale randomized field experiments are rapidly gaining traction as a powerful new analytical tool. The methods and results are also relevant to managers and policy makers applying experimental analytics to a variety of real business and social policy arenas in which social influence is a primary driver of behavior change.

2. A New Experimental Paradigm for Large-Scale, Social Scientific, Business Analytics

The phrase “Big Data” suggests that the main power in modern analytical processes is in the size and scale of the observational data we now collect (Mayer-Schonberger and Cukier 2013). Our view, however, is that the real power lies in the granularity of the data (not just its scale) combined with a new ability to engineer and randomize social settings to (a) robustly estimate the causal effects of different policy alternatives, (b) explore the heterogeneity in these causal effects across subpopulations of consumers, and (c) unpack the nuanced behavioral mechanisms that underlie and explain the causal outcomes of policy experiments. These new tools portend a sea change in our understanding of human behavior, for both business analytics and social science. By understanding the causal behavioral mechanisms underlying the outcomes of specific policies, how and why those outcomes vary across different consumers, and how they change over time, we can develop more contextual and personalized, and therefore more effective, business and social policies.

Organizations have employed multivariate “A/B” testing of online products and services for several years now. But recently the precision and complexity of the experimental tool kit has increased dramatically. As online platforms have scaled to support hundreds of millions of simultaneous users and platform design has become more open and precise, the ability to test complex dynamic hypotheses about social and behavioral phenomena has expanded. Application programming interfaces and explicit developer controls now expose functional platform elements that researchers can use to execute new experimental designs. For example, Facebook enables developers to customize application features for particular users, enabling feature and design randomization (e.g., Aral and Walker 2011a, 2012); Amazon Mechanical Turk enables the development of complex environments in which users can engage in precisely defined experimental microtasks (Mason and Watts 2012, Rand and Nowak 2011, Suri and Watts 2011); and formal collaboration with platform developers and website administrators enables researchers to achieve even more comprehensive and precise experimental control in large-scale in vivo environments (Bakshy et al. 2012a, b; Muchnik et al. 2013).

These digital tools are enabling a new era of experimental social science and analytics that has begun to reveal robust evidence of the nuanced causal determinants of consumer behavior and the implied effectiveness of different social policies and business strategies. For example, recent large-scale digital experiments have revealed specific insights about product adoption and engagement (Aral and Walker 2011a, 2012; Bapna and Umyarov 2012; Taylor et al. 2013), social commerce and advertising (Aral and Taylor 2014, Bakshy et al. 2012a, Tucker 2011), information sharing and diffusion (Bakshy et al. 2012b), herd behaviors in cultural markets (Muchnik et al. 2013, Salganik et al. 2006, Tucker and Zhang 2011), health behaviors (Centola 2010, 2011), voting and political mobilization (Bond et al. 2012), performance in innovation contests (Boudreau and Lakhani 2011), coordination and cooperation (Fowler and Christakis 2010, Kearns et al. 2006, Mason and Watts 2012, Rand and Nowak 2011, Suri and Watts 2011), altruism and reciprocity (Bapna et al. 2011, Leider et al. 2009), and the growth and efficiency of two-sided matching markets (Tucker and Zhang 2010, Bapna et al. 2012). As a reference, we summarize the focus, context, experimental procedures, and scale of recent large-scale digital experiments in Table 1. These studies have employed complex experimental designs that randomize social conditions at the system (e.g., Salganik et al. 2006), category (e.g., Tucker and Zhang 2010), item (e.g., Muchnik et al. 2013), group (e.g., Suri and Watts 2011), and individual (e.g., Aral and Walker 2012) levels to reveal important insights about human behavior at the population scale.

The scale of modern-day experimentation also enables new levels of analysis. Smaller-scale randomized experiments are typically only sufficiently powered to estimate average treatment effects, that is, the average effect of a policy in a population. But nuanced experimental tests of microlevel policies among millions of people allow researchers to unpack the heterogeneity of treatment effects across different populations and to explore the behavioral mechanisms that explain the treatment effects. For example, the experiment conducted by Aral and Walker (2012) estimates heterogeneity in the impact of influence-mediating messages on different types of consumers; Bapna and Umyarov (2012) explore heterogeneity of influence across the degree distribution of users; and Muchnik et al. (2013) dig deeply into whether opinion change or selective turnout creates the social influence bias they estimate to exist in online ratings. These insights enable business policies tailored for particular users. Furthermore, creating a deep understanding of the data-generating processes that explain the average behavioral effects of policy interventions prepares organizations to respond to changes in the data-generating processes and to know how their interventions will create dynamic changes in behavior over time.
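The distinction between an average treatment effect and heterogeneous treatment effects can be illustrated with a minimal sketch, assuming a simple difference in mean adoption between randomly treated and untreated units. The eight hand-made rows, the field names, and the functions `diff_in_means` and `effects_by_group` are hypothetical and serve only as an illustration; they are not the estimators used in the studies cited above.

```python
# Illustrative sketch only: toy rows and a plain difference in means,
# not the statistical models used in the cited experiments.
from collections import defaultdict


def diff_in_means(rows):
    """Average treatment effect: mean outcome of treated minus control."""
    treated = [r["adopted"] for r in rows if r["treated"]]
    control = [r["adopted"] for r in rows if not r["treated"]]
    return sum(treated) / len(treated) - sum(control) / len(control)


def effects_by_group(rows, key):
    """Treatment effect estimated separately within each value of `key`."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r)
    return {g: diff_in_means(rs) for g, rs in groups.items()}


# Hypothetical dyads: treatment is randomly assigned, `tie` is a moderator.
rows = [
    {"treated": True, "adopted": 1, "tie": "strong"},
    {"treated": True, "adopted": 1, "tie": "strong"},
    {"treated": False, "adopted": 0, "tie": "strong"},
    {"treated": False, "adopted": 1, "tie": "strong"},
    {"treated": True, "adopted": 1, "tie": "weak"},
    {"treated": True, "adopted": 0, "tie": "weak"},
    {"treated": False, "adopted": 0, "tie": "weak"},
    {"treated": False, "adopted": 1, "tie": "weak"},
]

print(diff_in_means(rows))            # 0.25
print(effects_by_group(rows, "tie"))  # {'strong': 0.5, 'weak': 0.0}
```

In this toy example the pooled effect of 0.25 hides the fact that the entire effect comes from the strong-tie subgroup. With only a handful of units per subgroup such estimates would be far too noisy to trust, which is why subgroup analysis of this kind depends on the scale the paragraph describes.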

Our work builds on a specific line of research into social influence in information diffusion and product adoption in networks (Aral and Walker 2011a, 2012; Bakshy et al. 2012a, b; Bapna and Umyarov 2012; Taylor et al. 2013; Tucker 2011). In particular, we extend an experimental method for measuring heterogeneity of treatment effects in the impact of influence-mediating messages on peer behavior from the individual level (studied by Aral and Walker 2012) to the relationship level. As we describe below, message randomization can be combined with statistical analysis to measure influence and susceptibility in networks and the heterogeneity of treatment effects across observable characteristics of senders and recipients. We extend this approach to examine structural and dyadic relationship characteristics that moderate the influence someone exerts on his or her peers. In particular, we focus on embeddedness and tie strength because these are two of the most widely studied relationship-specific factors theorized to moderate social influence.


Table 1 Examples of Large-Scale, Digital, Experimental Social Science Research

Study | Focus | Context | Experimental procedure | Scale

Product diffusion and engagement
Present study | Impact of tie strength and embeddedness on social influence | Facebook app adoption | Randomize influence-mediating messages at the dyad level | 1.3 M users
Aral and Walker (2011a) | Viral features and social influence in product adoption | Facebook app adoption and use | Randomize viral features at the app level | 1.5 M users
Aral and Walker (2012) | Social influence and susceptibility in product adoption | Facebook app adoption | Randomize influence-mediating messages at the individual level | 1.3 M experimental users, 12 M observational users
Bapna and Umyarov (2012) | Social influence in paid product adoption | Last.fm paid subscription adoption | Randomize gifted paid subscriptions at the individual level | 40 K experimental users, 1.2 M observational users
Taylor et al. (2013) | Downstream selection effects in online sharing behavior | Facebook offers | Randomize sharing mechanisms at platform level | 1.2 M users
Hinz et al. (2011) | Seeding strategies to promote diffusion | Redeemable tokens; URLs to a viral video | Randomized initial seed recipients | 120 users/28 experiments, 1,380 students

Social commerce and advertising
Bakshy et al. (2012a) | Social advertising | Facebook social advertising | Randomize social signals in advertising | 23 M users, 148 K ads, 101 M user-ad pairs
Tucker (2011) | Social advertising | Unnamed nonprofit | Randomize social ad text and targeting | 630 ads, 13 K impressions
Aral and Taylor (2014) | Incentivized peer referral marketing | Online floral delivery site | Randomize incentive structure: selfish, generous, and fair | 637 users

Information sharing and diffusion
Bakshy et al. (2012b) | Social influence in information diffusion | Facebook news feed | Randomize influence-mediating messages at the individual level | 253 M users, 76 M URLs, 1.2 B user-URL pairs

Herd behaviors in ratings and cultural markets
Salganik et al. (2006) | Herd behavior in cultural markets | Artificial music lab | Randomize popularity information | 14 K users
Muchnik et al. (2013) | Social influence bias in online ratings | Ratings in an unnamed social news aggregation website | Randomize first rating at the comment level | 116 K users, 101 K comments, 10 M user-comment impression pairs
Tucker and Zhang (2011) | Popularity information and choice | Wedding service vendor website | Randomize availability of popularity information | 3 vendor categories, 90 K click-throughs

Health behaviors
Centola (2010) | Social influence in the spread of health behaviors | Artificial health network | Randomize alters/network structure | 1,500 users
Centola (2011) | Homophily and influence in the spread of health behaviors | Artificial health network | Randomize alters/network structure | 700 users

Voting and political mobilization
Bond et al. (2012) | Voting | Facebook voter registration campaign | Randomize social signals in voting | 61 M users

Cooperation and coordination
Kearns et al. (2006) | Graph coloring | Laboratory | Randomize network topology | 2 experiments, 55 users
Fowler and Christakis (2010) | Social influence in cooperation | Laboratory experiments | Random partnering in cooperation games | 240 users
Rand and Nowak (2011) | Cooperation games | Amazon Mechanical Turk laboratory | Randomize fixed and fluid network structure | 785 users, 40 sessions
Suri and Watts (2011) | Public goods game | Amazon Mechanical Turk laboratory | Randomize network topology | 113 experiments, 24 users/experiment
Mason and Watts (2012) | Cooperation and exploration games | Amazon Mechanical Turk laboratory | Randomize network topology | 256 experiments, 16 users/experiment

Reciprocity and altruism
Leider et al. (2009) | Altruism, directed altruism | Facebook dictator and helping games | Randomize anonymity and repeated interaction | 802 students and 2,360 students
Bapna et al. (2011) | Reciprocity games | Facebook game | Randomize anonymity | 190 users, 77 games

Innovation contest performance
Boudreau and Lakhani (2011) | Innovation contests | NASA TopCoder contest | Randomize incentives and sorting on competitive regime |

Two-sided markets and matching
Tucker and Zhang (2010) | Growing two-sided markets | B2B exchange market | Randomize display of number of buyers and/or sellers | 15 categories, 3,314 listings
Bapna et al. (2012) | Effect of anonymous weak signaling on date matching | Dating website | Randomize anonymous weak signaling feature | 10 K experimental users, 100 K observational users


3. Theory

3.1. Social Influence

Understanding how word-of-mouth (WOM) “buzz” about a product, service, opinion, or behavior can impact its adoption has long been considered crucial to how firms can promote diffusion (Arndt 1967, Brown and Reingen 1987, Engel et al. 1969, Godes and Mayzlin 2004, Katz and Lazarsfeld 1955, Manchanda et al. 2008). Traditionally, WOM has been specified as information (often about opinions, preferences, or choices) deliberately exchanged through face-to-face interactions, though more recently the term has been applied to online or technology-enabled information exchange between individuals or from one individual to a group of others (as in the case of consumer product reviews). Researchers in economics and marketing have employed the phrase “observational learning” to refer to circumstances where a consumer observes (usually) aggregated decisions of a population prior to making her selection from a set of alternatives (Banerjee 1992, Bikhchandani et al. 1998, Zhang 2010).

Yet the distinction between WOM and observational learning is not always clear. For example, many online social networks automatically disseminate information about an individual’s actions to his immediate peers (e.g., “Brian is reading A Tale of Two Cities”) in a way that does not indicate aggregate decisions in larger populations or preferences among a set of clear alternatives. Social exchanges such as these straddle the boundary between WOM and observational learning. Notions of customer intent, opinion, preference, and content valence that are typically studied in WOM research may be ambiguous or subjective in this type of information exchange, and observational learning models may not be well suited to situations in which the set of alternatives is very large and information about peer decisions and outcomes is sparsely distributed.

As Godes et al. (2005) highlighted, the conventional definition of WOM is not suitable for a variety of social interactions that mediate information and influence. They instead proposed the more encompassing term social influence to describe “an action or actions that is taken by an individual not actively engaged in selling the product or service and that impacts others’ expected utility for that product or service” (Godes et al. 2005, pp. 416–417). In the remainder of this section, we describe and expand on the specific notions of the channel, content, and impact of social influence proposed by Godes et al. (2005). We then discuss how these theoretical dimensions of social influence informed and guided the design of our randomized field experiment.

The channel of social influence refers to the medium through which influence is communicated or transmitted. Several dimensions specify the channel, such as the number of senders and recipients involved, which may be one-to-one (as in the case of personal email), one-to-many (as in the cases of email lists, online recommendations, group invitations, and automated peer referrals), many-to-many (as in the case of polls in online community forums), or many-to-one (as in the case of voting on online forum comments). Other salient dimensions include how the recipients are selected, the credibility of the channel, and whether the channel is mediated by a third party. The content of social influence refers to the information that is transmitted over the channel. For example, information can include individual decisions or outcomes relating to product features or product adoption, factual information about product features, or subjective opinions about the product as in the case of peer recommendations or customer reviews. Salient dimensions of the content are the subjectivity (fact versus opinion) and whether the content is personalized to the intended recipients. Finally, the impact of social influence refers to the overall effect social influence may have on the actions of others. Salient dimensions of impact primarily relate to how it is measured, e.g., whether the impact is inferred or measured directly and what it means to “impact” an outcome. From our perspective, a key dimension of impact is the causal effect of an individual on their peers’ behavior. As Aral (2011, p. 217) argued, defining social influence as creating behavior change or “[h]ow the behaviors of one’s peers change the likelihood that (or extent to which) one engages in a behavior” is essential to making effective marketing and public policy decisions because effective policy requires an understanding of how behavior is likely to change as a result of an intervention.

The specification of social influence we outline is useful in relating existing research on differing forms of social influence to one another. Moreover, it informs the design of our experiment. We use firm mediation to control the delivery of automated notifications with impersonal content to randomly selected peer targets and assess the impact of social influence by directly measuring peer adoption response. We discuss the ramifications of these design choices in more detail in the experimental design section.

3.2. Influence and Susceptibility

Prior research on social influence has focused on the dual notions of influence and susceptibility (or influenceability). The idea that some individuals are more influential than others and therefore play a catalyzing role in spreading opinions, innovations, and products (Coleman et al. 1957, Gladwell 2002, Rogers 2003, Valente 1995, Van den Bulte and Joshi 2007) is sometimes referred to as “the influentials hypothesis.” Other research, focusing on the complementary idea that individual susceptibility to influence is the dominant driving mechanism behind diffusion in social networks, is represented in a variety of theoretical threshold-based contagion models in which behavior adoption occurs when some number or proportion of one’s peers have adopted beyond one’s intrinsic adoption threshold (e.g., Granovetter 1978, Valente 1996, Watts and Dodds 2007).² Though studies estimating the importance of influentials and susceptibles in the diffusion of products or behaviors in real-world networks significantly lag theoretical and simulation models of influence-based contagion, a recent observational study examined the combined notions of influence and susceptibility in social influence processes (Iyengar et al. 2011). More recent work has empirically identified individual influence and susceptibility, demonstrated that both mechanisms together determine the propagation of behaviors in social networks, and also explored dyadic influence, in which the influence exerted by an individual on their peer depends on dyadic or pairwise characteristics of both parties (Aral and Walker 2012). We extend this work by examining how characteristics of the relationship between two people moderate the influence they exert on one another. Specifically, we estimate dyadic influence arising from heterogeneity in the embeddedness and tie strength of a relationship while controlling for heterogeneity in individual influence and susceptibility as well as for tendencies toward noninfluenced spontaneous adoption.

3.3. Impact of Social Embeddedness and Tie Strength on Social Influence

3.3.1. Embeddedness. Network embeddedness, or the number of friends that two individuals in a relationship share in common (Easley and Kleinberg 2010, p. 55), has long been theorized to affect the level of trust, altruism, cooperation, and communication in relationships. Embedded relationships are likely to conduct greater social influence because the presence of third-party ties increases the level of trust between embedded peers (Uzzi 1997). As the relationship is “on display” in a social sense, recommendations from embedded peers are likely to be truthful revelations of product experiences or the perceived benefits of the recommended product for the party receiving the recommendation (Granovetter 1985, Uzzi 1996). Embeddedness also engenders greater cooperation because news of noncooperative behavior spreads quickly in the network, making it harder for the uncooperative actor to maintain friendly ties with third parties. In this way, embeddedness enables the development of cooperative norms that facilitate mutual helping relationships (Coleman 1988, Granovetter 1985). Embedded relationships also typically create opportunities for greater knowledge transfer between individuals (Reagans and McEvily 2003) and more fine-grained information flows (Uzzi 1997) that are multifaceted in that they provide information across multiple topics or dimensions of topics (Aral and Van Alstyne 2011). For example, when discussing a product, two consumers in an embedded relationship may share more information about the product, more knowledge about different dimensions or features of the product, and more fine-grained knowledge of the product, its uses, and its strengths and weaknesses compared to other similar products. Greater trust, cooperation, and fine-grained information exchange are likely to increase the influence conducted in a relationship, and thus we expect embedded relationships to convey greater influence. We adopt the conventional network structural measure of embeddedness, defined in this context as the number of common friends shared by individuals and their peers.

² In this context, a susceptible individual is one with a low intrinsic threshold.
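The embeddedness measure adopted here, the number of common friends shared by two individuals, can be illustrated with a minimal sketch. The friendship data and the function name `embeddedness` are hypothetical and serve only to show the counting rule; they are not taken from the study.

```python
def embeddedness(friends, i, j):
    """Count the friends that i and j share, excluding i and j themselves."""
    return len((friends[i] & friends[j]) - {i, j})


# Hypothetical undirected friendship lists (each tie appears on both sides).
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat", "dan", "eve"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "bob"},
    "eve": {"bob"},
}

ann_bob = embeddedness(friends, "ann", "bob")  # cat and dan are shared
ann_eve = embeddedness(friends, "ann", "eve")  # only bob is shared
```

Representing each friend list as a set keeps the measure a single intersection, which is convenient when it must be computed for many user-peer pairs.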

3.3.2. Tie Strength. Granovetter (1973, p. 1361) defines tie strength as “a (probably linear) combination of the amount of time, the emotional intensity, the intimacy (mutual confiding), and the reciprocal services which characterize the tie.” He notes that strong ties also typically display multiplexity and the exchange of multiple types of content through the relationship (e.g., information, advice, support; Sundararajan et al. 2013). Strong tie relationships are more likely to conduct influence because they convey greater trust, more fine-grained information exchange, and cooperation (Coleman 1988). Although tie strength is a multidimensional theoretical construct, most studies use measures of the frequency of interaction to proxy for the strength of ties. For example, recent work relating social influence to political mobilization concluded that strong ties were associated with greater social influence, but defined tie strength purely in terms of the frequency of online interactions between peers (Bond et al. 2012). Prior theory has defined tie strength in terms of a number of aggregate measures that include frequency of interaction, surveyed relationship category (such as “friend,” “neighbor,” “relative,” or “acquaintance”), and perceived importance or intimacy (Brown and Reingen 1987, Frenzen and Davis 1990, Granovetter 1983). These extensions of the definition of tie strength beyond interaction frequency are clearly meaningful. At the same time, survey instruments that codify the nature of relationships between individuals and their peers are subject to perception bias and typically do not scale well to large systems. Aggregating the nature of relationship categories into a single measure is also, in some sense, undesirable because it obscures meaningful differences in the type, quality, and context of relationships and reduces our ability to detect the impact of the different dimensions of tie strength on influence.

We expand this conceptualization of tie strength to capture several different dimensions of relationships that may be relevant to the strength of social influence. We treat tie strength as a collection of well-defined measures about the historical and current context of the relationships between individuals and their peers and simultaneously examine the distinct impact of these different aspects of tie strength. We define the strength of ties as including (a) the social context of the relationship (how individuals met, know one another, or interact with each other, e.g., whether two Facebook friends attended the same college, come from the same hometown, or the number of common institutional affiliations they share), (b) the recency of the relationship (e.g., whether two Facebook friends currently live in the same town), (c) the overlap of common interests (e.g., the number of common Facebook pages they are “fans” of or the number of common Facebook groups they have joined), and (d) frequency of the interaction (e.g., friends’ copresence in photos online).³ We expect greater social affiliation and interaction is predictive of greater influence conducted between friends, whereas similarity in preferences and interests is predictive of correlations in noninfluenced spontaneous adoption, though the theoretical distinctions between different types of tie strength are not yet well developed enough to provide clear theoretical predictions about how they would moderate the degree of influence conducted in a given relationship.

3.3.3. The Endogeneity of Social Structure and Influence. The review of relevant theories of tie strength and embeddedness in the previous two sections immediately calls attention to the endogenous processes that govern them. It is therefore perhaps unsurprising that previous empirical research on embeddedness, tie strength, and social influence has been hampered by endogeneity and spurious correlation. In real-world networks, social embeddedness and tie strength are often correlated with each other and with homophily, making their measurement difficult to untangle in practice (Rogers 2003). Individuals tend to form closer relationships with similar peers, close friends are more likely to share more friends in common, and friendships become stronger with shared common experiences and friends. Nonetheless, tie strength, embeddedness, and homophily can be clearly distinguished theoretically. Although it may be less common, an individual and her peer may share many mutual friends, despite being dissimilar on demographic or personality dimensions. Similarly, some close friends may have nonintersecting peer groups. Endogeneity among these tie characteristics is exacerbated in observational and survey-based studies that do not properly control for selection bias.

³ One could interpret these variables as indicators of homophily rather than tie strength. However, we reserve the discussion of whether homophily explains variation in behaviors to the interpretation of the parameter estimates on spontaneous adoption. Although selection into common experiences could be driven by preference similarity, such common experiences also typically strengthen ties. Our intention is to distinguish variance in correlated peer behaviors explained by influence from variance explained by homophily and confounding factors. From such a perspective, common experience that creates influence is an example of the impact of tie strength on influence, whereas common experience that does not create influence but explains correlated peer behaviors is an example of preference similarities (peers with common experiences making similar choices in the absence of influence). We leverage randomization of influence-mediating messages to make these empirical distinctions.

Natural social influence processes often involve endogenous communication patterns.⁴ Individuals select into sending, receiving, or soliciting influence-mediating communications to or from their peers. As a consequence, studies of social influence, embeddedness, and tie strength that do not explicitly control for selection biases in communication patterns confound our understanding of how tie characteristics moderate influence. For example, some studies on social influence and tie characteristics report that dissimilarity between peers is correlated with increased influence (e.g., Gilly et al. 1998). In contrast, other studies contend that more homophilous ties, though more likely to be activated, are not associated with any more (or less) influence (Brown and Reingen 1987). Studies have also examined the role of embeddedness in social influence. For example, recent work on influence in the decision to join Facebook in response to an email invitation indicates that less embedded ties are associated with greater influence, leading to a higher probability of positive response (Ugander et al. 2012). None of these studies account or control for selective communication. Burt (2005) discusses another potential confounding factor: weak ties are more likely to transmit novel information, by virtue of being less socially embedded.⁵

These considerations highlight that inferring the impact of tie strength and social embeddedness on influence is difficult because influence-mediating communications are inherently endogenous. One notable exception is provided by the work of De Bruyn and Lilien (2008), who disentangled selection bias in communication tendencies by explicitly modeling multiple stages of interaction in influence processes in an experimental study of word-of-mouth viral marketing. However, this approach relies on the ability of researchers to correctly model the stages of interaction in social influence processes and does not generalize well to contexts in which we lack intuition about which social processes are at work.

⁴ Communication patterns may include information seeking behavior (such as solicitation of peer advice), passive subscriptions to information sources (such as blogs, twitter feeds, or Facebook news feeds), or active forwarding of information to peers (such as personalized referrals or invitations).

⁵ In the Godes et al. (2005) framework, this corresponds to selection bias in both the channel and content of social influence.


Our study design disentangles the impact of both the frequency of interaction and the novelty of information exchanged from other aspects of tie strength that characterize the nature of the relationship between individuals and their peers by (a) controlling the channel of influence (through randomized recipient selection) and (b) holding message content constant. In our design, influence-mediating communication is controlled by a third party, allowing us to randomize message target selection and to homogenize content. Messages sent from individuals to their peers contain approximately the same information, allowing us to study the impact of embeddedness and tie strength holding information diversity or novelty constant. Message target randomization also ensures that the number (frequency) of influence-mediating messages sent from individuals to their peers is independent of embeddedness and tie strength.

Communication patterns between individuals and their peers certainly play a role in influence and in part comprise what makes individuals influential on their peers. However, understanding the impact of social embeddedness and tie strength on influence, holding communication patterns constant, is important for two reasons. First, it contributes to our understanding of how recipients of an influence-mediating message would respond differently to more or less embedded peers and peers with whom they are strongly or weakly connected. Such insights are critical to viral marketing initiatives designed to target advertisements at those most likely to maximize the diffusion of products and services in a population through their natural or intrinsic influence (Aral et al. 2009, 2013). Second, it can inform policies that operate outside of the scope of natural influence (such as individual and network-based interventions and peer-oriented incentive schemes), which are deliberately designed to impact and alter natural communication and information flow patterns (Aral and Taylor 2014).

4. Empirical Methods

We partnered with a firm that develops commercial applications hosted on the popular social networking website Facebook.com. A commercial Facebook application was designed and publicly released in concert with the launch of the experiment. This application provides users the opportunity to share information and opinions about movies, actors, directors, and the film industry in general. As adopters used the application, automated notifications were delivered to randomly selected peers in their local social networks. Data on individual attributes of adopters and their local peers, time-stamped delivery of automated notifications, and subsequent time-stamped adoption responses were collected throughout the course of the experiment.

The experimental design employs message target randomization to deliver automated notifications⁶ to randomly selected peers of existing application users. In this scheme, packets of notifications are generated when application users take one of several actions within the Facebook application (e.g., when the user rates a movie or friends a celebrity). These notifications are then delivered to randomly chosen subsets of the application user’s Facebook friends. The random selection of a set of recipient peers is performed on a per-packet basis (i.e., a different set of recipient peers is randomly chosen each time a packet is sent from an application user). This design is illustrated in Figure 1, which displays a diagram of the delivery of two packets over sequential time periods.

At time t1, the application user performs a packet-generating action within the application and a packet of notifications is generated. At time t2, randomly chosen peers of the application user are designated as recipients and receive the notifications. At time t3, the application user performs a second packet-generating action within the application, and a second packet of notifications is generated and delivered at time t4 to a (different) set of randomly chosen peer recipients. At any given time throughout the course of the experiment, peers of an application user received 0, 1, 2, or more influence-mediating messages from their application user friend. The exposure of a peer to influence-mediating messages (notifications) over time is exogenously determined as a function of the randomization procedure. Over time, peers are assigned to risk groups (corresponding to the number of influence-mediating messages received), where risk is monotonically and randomly increasing over the course of the experiment. We discuss the implication of message target randomization on our modeling strategy and censoring procedures in the section that follows.
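The per-packet randomization described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the firm's implementation: the packet size of three, the ten-peer friend list, the seed, and the names `deliver_packet` and `exposure` are all hypothetical.

```python
import random
from collections import Counter


def deliver_packet(peers, packet_size, rng):
    """Choose recipients for one packet uniformly at random, without replacement."""
    k = min(packet_size, len(peers))
    return rng.sample(peers, k)


rng = random.Random(2009)  # fixed seed so the sketch is reproducible
peers = [f"peer_{n}" for n in range(10)]   # hypothetical friend list of one application user
exposure = Counter({p: 0 for p in peers})  # N_j: notifications received so far by peer j

# Four packet-generating actions; recipients are re-drawn for every packet.
for _ in range(4):
    for recipient in deliver_packet(peers, packet_size=3, rng=rng):
        exposure[recipient] += 1
```

Because recipients are drawn independently of any peer attribute, the resulting counts in `exposure` (the risk group of each peer) are unrelated to embeddedness or tie strength by construction.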

Automated notifications (passive viral messages) have several advantages over alternate types of influence-mediating communication for the purpose of our experimental design (Aral and Walker 2011a, b, 2012). First, the automated notifications channel is mediated by a third party, allowing the desired level of experimental control and randomization of peer targets. Since the recipients of notifications and the decisions of whether and when to send them are all automated, selection effects on the part of the sender that might otherwise introduce bias can be avoided.

Second, the content of automated notifications employed in the experiment included only impersonal information about the sender’s use of the application (as is typical of information exchange in observational learning scenarios). The inclusion of only impersonal information in influence-mediating messages allows for the measurement of the impact of social influence while holding constant the potentially large degree of heterogeneity present in personalized, sender-created content. Heterogeneity in the content of messages created by individuals is known to have a significant effect on social influence. In particular, the effects of content heterogeneity on influence have been studied in the context of the positive valence of the message content (Berger and Milkman 2009) and the effectiveness of viral features that allow message tailoring or personalization (Aral and Walker 2011a). Although content heterogeneity can and in all likelihood does play a major role in what makes individuals influential, simultaneous variation of content and relationship attributes can confound measurements of the effect of relationship attributes on social influence.

⁶ Notifications are a form of messaging on Facebook that appear in a designated Notifications inbox on the recipient’s Facebook menu and are not integrated into other areas, such as Facebook news feeds, and as such are not subject to ranking or filtering mechanisms.

Figure 1 Randomized Targeting of Influence-Mediating Messages

Notes. A diagram depicting the message target randomization employed in the experiment is shown. Notification packets are generated when an application user takes a packet-generating action within the Facebook application. For each packet that is generated, the notifications in the packet are distributed to a randomly selected subset of the application user’s peers. The figure displays two sequential packet distributions. Different recipient targets are randomly chosen at the time of distribution for each packet.

Third, the delivery of notifications to only a random subset of an individual’s peers permits direct comparison of the responses of treated peers to those of peers of the same application user that were not treated. When the targets of potentially influential communications are randomized among peers of the same application user, any homophilous structure between an application user and his treated and untreated peers and the propensity to select a particular peer to notify are held constant and are identical for recipient and nonrecipient peer groups. Other unobserved factors that could potentially drive influenced adoption, such as offline or alternative online communications, can also be cleanly distinguished with this design, because recipient and nonrecipient peers in expectation share similar propensities to receive and be affected by such communications on average. Moreover, homophily in unobserved attributes (latent homophily) that may be indicated by the very existence of a relationship of peers with a common friend (see Shalizi and Thomas 2011) will be equally represented in recipient and nonrecipient peer groups. Differences in adoption outcomes between recipient and nonrecipient peers can then be attributed solely to the influence-mediating messages they received.

5. Analysis and Results

5.1. Data and Descriptive Statistics

Throughout the 44-day experimental period, beginning on November 28, 2009, we collected individual-level profile data from 7,730 application users and their 1.3 million distinct peers as well as time-stamped clickstream data on notification delivery and subsequent peer responses. During this time, 41,686 automated notifications were delivered to randomly chosen peer targets of application users, resulting in 967 peer adoptions, a 13% increase in product adoption. The user data we collected included the social network of adopters, all mutual ties between peers, and individual-level profile data including age, gender, relationship status, hometown, current town, college attendance, affiliations, Facebook pages, Facebook group membership, and tagged appearance in photos.⁷ Descriptive statistics are provided in Tables A1 and A2 of the online technical appendix (available at http://web.mit.edu/sinana/www/MSTA14.pdf).

⁷ Facebook affiliations typically indicate some past experience such as working for a company or belonging to the same institution or society. Facebook pages typically indicate interest in activities, brands, products, bands, and media personalities. Facebook groups are an online version of social groups that allow users to interact with one another in centralized discussion forums and view news and events pertaining to that group.

5.2. Model Specification

Following prior work on influence identification and social contagion, we adopt a hazard modeling approach (Aral and Walker 2011a, 2012; Iyengar et al. 2011; Nam et al. 2010; Van den Bulte and Lilien 2001). We employ a Cox proportional hazard model to estimate the hazard of a peer (of an existing application user) adopting. The model simultaneously estimates the impact of social embeddedness and multiple measures of tie strength on influenced and spontaneous adoption while controlling for the moderating effect of individual attributes on influenced and spontaneous adoption:

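\[
\begin{aligned}
\lambda_j(i,j,t) = \lambda_0(t)\exp\bigl\{\,
 & \beta_N N_j(t) + \beta^{i}_{\mathrm{Spont}} X_i + \beta^{j}_{\mathrm{Spont}} X_j + \beta^{\mathrm{SoT}}_{\mathrm{Pref}}\,\mathrm{SoT}_{ij} \\
 & + \beta^{\mathrm{Embed}}_{\mathrm{Pref}}\,\mathrm{Embed}_{ij} + \beta^{i}_{\mathrm{Infl}} X_i N_j(t) + \beta^{j}_{\mathrm{Susc}} X_j N_j(t) \\
 & + \beta^{\mathrm{SoT}}_{\mathrm{Infl}} N_j(t)\,\mathrm{SoT}_{ij} + \beta^{\mathrm{Embed}}_{\mathrm{Infl}} N_j(t)\,\mathrm{Embed}_{ij} \bigr\},
\end{aligned}
\]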

where $\lambda_j$ is the hazard for a peer $j$ of a user $i$ adopting, $N_j(t)$ is the number of notifications received by a peer $j$, $\mathrm{Embed}_{ij}$ is the embeddedness of the relationship between user $i$ and peer $j$ (the number of friends shared by individual $i$ and peer $j$), and $\mathrm{SoT}_{ij}$ is a vector of the tie strength attributes characterizing the relationship between individual $i$ and peer $j$. These models include a rich set of covariate controls for demographic and individual-level characteristics of individuals ($X_i$) and their peers ($X_j$) (including age, gender, and relationship status). The coefficient $\beta_N$ captures the raw impact of influence, holding constant heterogeneity in individual or tie attributes; it represents the marginal impact of a peer receiving influence-mediating messages irrespective of the individual attributes of individual $i$ and peer $j$ or their dyadic tie attributes. The coefficient $\beta^{i}_{\mathrm{Spont}}$ captures the tendency of peers of application users with attributes $X_i$ to spontaneously adopt in the absence of influence ($N_j = 0$). The coefficient $\beta^{j}_{\mathrm{Spont}}$ captures the tendency of peers with own attributes $X_j$ to spontaneously adopt in the absence of influence ($N_j = 0$). The coefficient $\beta^{\mathrm{SoT}}_{\mathrm{Pref}}$ captures the extent to which the measure of tie strength indicates a similarity in preference for a peer adopting the product spontaneously in the absence of influence ($N_j = 0$) given that her application user friend has adopted. This is the variation in correlated peer behaviors explained by homophily along tie strength features. Similarly, the coefficient $\beta^{\mathrm{Embed}}_{\mathrm{Pref}}$ captures the extent to which peers with embedded ties to existing application users will have a preference to adopt the product spontaneously in the absence of influence ($N_j = 0$). This is the variation in correlated peer behaviors explained by homophily along the embeddedness features. The coefficient $\beta^{i}_{\mathrm{Infl}}$ represents individual influence: it captures the impact of individual attributes of an application user ($X_i$) on the hazard of her peer adopting due to influence per influence-mediating message received. The coefficient $\beta^{j}_{\mathrm{Susc}}$ represents individual susceptibility to influence: it captures the impact of peer attributes ($X_j$) on peers’ hazard to adopt due to influence per influence-mediating message received. The coefficient $\beta^{\mathrm{SoT}}_{\mathrm{Infl}}$ captures the impact of the tie strength measure (for the tie between application user $i$ and peer $j$) on influence-driven adoption per influence-mediating message received. The coefficient $\beta^{\mathrm{Embed}}_{\mathrm{Infl}}$ captures the impact of the embeddedness of the tie between application user $i$ and peer $j$ on influence-driven adoption by $j$ per influence-mediating message received. The model estimation provides good concordance with observed data, and Wald, log rank, and likelihood ratio test statistics indicate strong likelihood, significance, and goodness of fit. (See Table A3 in the online technical appendix.)
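To make the “per influence-mediating message received” reading concrete, the following short derivation may help. It follows directly from the specification as written, assuming only that the covariates other than $N_j(t)$ are held fixed; the shorthand $n$ for the current notification count is introduced here for illustration. Taking the ratio of the hazard at $N_j = n+1$ to the hazard at $N_j = n$, the baseline hazard and all uncrossed (spontaneous) terms cancel:

\[
\frac{\lambda_j(i,j,t \mid N_j = n+1)}{\lambda_j(i,j,t \mid N_j = n)}
= \exp\bigl\{ \beta_N + \beta^{i}_{\mathrm{Infl}} X_i + \beta^{j}_{\mathrm{Susc}} X_j + \beta^{\mathrm{SoT}}_{\mathrm{Infl}}\,\mathrm{SoT}_{ij} + \beta^{\mathrm{Embed}}_{\mathrm{Infl}}\,\mathrm{Embed}_{ij} \bigr\}.
\]

Each crossed coefficient therefore scales the effect of one additional notification, whereas at $N_j = 0$ the hazard reduces to $\lambda_0(t)\exp\{\beta^{i}_{\mathrm{Spont}} X_i + \beta^{j}_{\mathrm{Spont}} X_j + \beta^{\mathrm{SoT}}_{\mathrm{Pref}}\,\mathrm{SoT}_{ij} + \beta^{\mathrm{Embed}}_{\mathrm{Pref}}\,\mathrm{Embed}_{ij}\}$, which is the spontaneous adoption component.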

Because peers of application users randomly receive (multiple) notifications from their application user friends throughout the course of the experiment, we employed interval censoring to transition users from one risk group (e.g., the hazard associated with receiving one influence-mediating notification) to the next (e.g., the hazard associated with receiving two influence-mediating notifications). Peer adoption outcomes may be correlated because peers of a given application user share a common application user friend. To account for this, the hazard model employs robust errors clustered on the identity of the application user friend.
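One common way to represent such risk-group transitions is to split each peer's observation window into intervals at the times notifications arrive, so that the notification count is constant within an interval. The sketch below illustrates that general idea only; it is an assumption about data layout, not a description of the authors' procedure, and the function name, time units, and example values are hypothetical.

```python
def risk_intervals(notification_times, end_time, adopted):
    """Split [0, end_time] into (start, stop, n_notifications, event) rows.

    end_time is the adoption time if adopted is True, otherwise the
    censoring time. Notifications at or after end_time are ignored.
    """
    times = sorted(t for t in notification_times if t < end_time)
    cuts = [0.0] + times + [end_time]
    rows = []
    for n, (start, stop) in enumerate(zip(cuts[:-1], cuts[1:])):
        if stop > start:  # skip zero-length intervals (e.g., simultaneous notifications)
            event = bool(adopted and stop == end_time)
            rows.append((start, stop, n, event))
    return rows


# A peer notified on days 3 and 10 who adopts on day 15.
example = risk_intervals([3.0, 10.0], 15.0, adopted=True)
```

Rows in this form can be passed to survival software that accepts start-stop (counting process) input, with the application user's identity supplied as the clustering variable for robust standard errors.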

5.3. Results

Our model specification enables us to estimate two quantities of theoretical interest: influence-based adoption and spontaneous adoption. Influence-based adoption measures the degree to which a particular relationship characteristic moderates the influence an individual has over their peers or the degree to which that characteristic is associated with changing someone’s behavior from not adopting to adopting the application. Spontaneous adoption, on the other hand, measures the degree to which a particular relationship characteristic predicts a correlated latent preference to adopt the product. For example, if having attended the same college is associated with spontaneous adoption but not influence-based adoption, then attending the same college is capturing preference similarities that predict adoption by a peer of a current adopter. If, on the other hand, having attended the same college is associated with influence-based adoption but not spontaneous adoption, then individuals influence those friends who attended the same college as they did more than those friends who attended different colleges than they did.

These distinctions are critical to marketing policy and networked interventions in general. Predictors of spontaneous adoption can inform targeting: They generate good sets of advertising targets whose preference similarities to current adopters make them likely to respond positively to advertisements about the product under consideration. On the other hand, moderators of influence can inform peer referral strategies: they highlight good sets of relationship pairs in which incentives to propagate social influence may work well to create additional product adoptions (Aral and Taylor 2014). For example, incentives to “invite friends” to products or discounts for friend and family adoption may work well in relationship pairs that conduct influence. Model estimations of the impact of relationship characteristics on influence-based and spontaneous adoption are tabulated in Table 2. (For full model estimations, see Table A3 in the online technical appendix.)

Table 2 Impact of Embeddedness and SoT on Influence

                              Influence              Spontaneous
                              Hazard ratio (SE)      Hazard ratio (SE)

                              β^Embed_Infl           β^Embed_Pref
Num. common friends           1.0063*** (0.0016)     1.0077*** (0.0008)

                              β^SoT_Infl             β^SoT_Pref
Hometown (same)               1.0171 (0.2272)        1.4724 (0.2568)
Hometown (different)          1.3735* (0.1684)       0.7187 (0.2361)
Current town (same)           2.2899*** (0.3094)     0.4686 (0.7221)
Current town (different)      0.3171* (0.6699)       1.9300* (0.3418)
College (same)                8.5540** (0.8389)      0.3646 (1.1272)
College (different)           0.5878 (0.4956)        0.7664 (0.3529)
Num. common affiliations      2.2548** (0.3740)      0.8184 (0.3288)
Num. common pages             1.0031 (0.0023)        1.0067*** (0.0010)
Num. common groups            1.0074 (0.0057)        1.0335*** (0.0044)
Num. photos together          0.9977 (0.0031)        1.0142*** (0.0018)

Notes. This table reports parameter estimates and standard errors from the single failure proportional hazards model. Variables reported include categorical dummy variables indicating the following: Hometown (same, different), whether the individual and peer come from the same or different hometowns (unreported hometown corresponds to the holdout); Current town (same, different), whether the individual and peer live in the same or different current towns (unreported current town corresponds to the holdout); College (same, different), whether the individual and peer attended the same or different colleges (unreported college corresponds to the holdout); Num. common pages, the number of Facebook pages shared in common between the individual and their peer; Num. common groups, the number of Facebook groups shared in common between the individual and their peer; Num. photos together, the number of photos in which both the individual and peer appear. Hazard ratios in the influence column correspond to variables crossed with N_j (the number of notifications received by the peer) and indicate the effect of attribute-driven adoption. Hazard ratios in the spontaneous column correspond to uncrossed variables and represent spontaneous or preference-related adoption.

*p < 0.10; **p < 0.05; ***p < 0.01.

5.3.1. Effects of Social Embeddedness and Tie Strength on Influence. Results illustrating the impact of social embeddedness and tie strength on influence are displayed in Figure 2. The forest plot displays the hazard ratios, standard errors (boxes), and 95% confidence intervals (whiskers) of influence-driven adoption associated with embeddedness (number of common friends) and tie strength attributes. The hazard ratios displayed are relative to the baseline case (a hypothetical blank social network profile), and all categorical dummy variables such as Same college and Diff. college catalogue all categories for which the measure is defined. The holdout sets are cases for which either the application user or peer has not reported the measure in their social network profile. For example, the three exhaustive categories for college affiliation relations and their associated encoding are (1) the peers went to the same college (Same college = 1; Diff. college = 0), (2) the peers went to different colleges (Same college = 0; Diff. college = 1), and (3) one or both peers do not report their college attendance (Same college = 0; Diff. college = 0).
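The three-way encoding described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the helper name `encode_college` and the use of `None` to stand for an unreported profile field are assumptions made for the example.

```python
from typing import Optional, Tuple


def encode_college(user_college: Optional[str],
                   peer_college: Optional[str]) -> Tuple[int, int]:
    """Return the (same_college, diff_college) dummies for one user-peer pair.

    Hypothetical helper: None stands for a profile that does not report
    a college, which maps to the holdout category (0, 0).
    """
    if user_college is None or peer_college is None:
        return (0, 0)  # holdout: one or both profiles unreported
    if user_college == peer_college:
        return (1, 0)  # the peers went to the same college
    return (0, 1)      # the peers went to different colleges


# Made-up pairs covering the three exhaustive categories.
pairs = [("MIT", "MIT"), ("MIT", "NYU"), ("MIT", None), (None, None)]
encoded = [encode_college(u, p) for u, p in pairs]
```

Because the holdout is the all-zero row, hazard ratios for the two dummies are both read relative to pairs with an unreported college.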

We observe several interesting patterns in the results detailing the impact of embeddedness and different measures of tie strength on influence. First, tie measures that capture the joint participation of individuals and their Facebook friends in common social or institutional contexts are associated with greater influence. Individuals exert 125% more influence on friends for each institutional affiliation they share in common (p < 0.05). Attending the same college as one’s friend is associated with a 1,355% increase in influence (p < 0.01) compared to attending different colleges. This represents the largest impact on influence of the categorical measures of tie strength we considered. In contrast, coming from the same hometown is not significantly associated with influence, perhaps suggesting that this measure accurately captures more casual or even incidental social contexts (e.g., weak ties resulting from Facebook users’ desires to keep in contact with casual acquaintances).

Second, tie strength measures associated with current or recent social contexts exhibit differing impacts on influence. Individuals exhibit 622% more influence on friends that live in the same current town (p < 0.01). This is interesting because ties between friends currently living in the same town may indicate joint involvement in more recent social contexts (e.g., friendships that are more recent or recently relevant). Interestingly, appearing in photos with peers, an indicator of offline interaction at in-person events, is not significantly associated with influence.
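The percentages quoted above can be recovered from the influence-column hazard ratios in Table 2. The following is a minimal sketch of that arithmetic, assuming (as the 1,355% and 622% figures imply) that a "same" category is compared with its "different" counterpart by taking the ratio of the two hazard ratios. The helper name `pct_change` is hypothetical.

```python
def pct_change(hr: float, reference_hr: float = 1.0) -> float:
    """Percentage change in the hazard implied by hr relative to reference_hr."""
    return (hr / reference_hr - 1.0) * 100.0


# Influence-column hazard ratios as reported in Table 2.
affiliations = pct_change(2.2548)           # per shared affiliation, vs. baseline
college = pct_change(8.5540, 0.5878)        # same vs. different college
current_town = pct_change(2.2899, 0.3171)   # same vs. different current town

print(round(affiliations), round(college), round(current_town))
```

The same helper applied to a hazard ratio below 1 returns a negative number, that is, a percentage decrease in the hazard.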

Third, tie strength measures associated with common interests or preferences do not moderate influence. Individuals are no more or less influential on peers with whom they share common Facebook pages or are comembers of online groups.

Figure 2 Influence Associated with Embeddedness and Tie Strength

[Forest plot not reproduced.]

Notes. The effects of embeddedness and tie strength measures on influenced adoption are shown. The figure displays hazard ratios (HRs) representing the percentage increase (HR > 1) or decrease (HR < 1) in adoption hazards associated with each tie attribute. Boxes are standard errors. Whiskers are 95% confidence intervals.

Finally, individuals are more influential on peers with whom they are more embedded, exhibiting a (multiplicative) 0.6% increase in influence for each friend they share in common (p < 0.001). The impact of embeddedness on influence, though comparatively smaller than the tie strength measures considered here, remains economically significant, as the number of common friends can be quite large. For example, for a friendship that shares 10 friends in common, the result implies roughly a 6.5% increase in influence.⁸

These results indicate the importance of social embeddedness in influence processes and the subtlety in the relationship between influence and measures of tie strength. They also highlight the importance of using disaggregated measures of tie strength. It should be noted that although our experimental design and analysis allow for identification of the causal impact of peer influence, we estimate how relationship-level covariates are correlated with the extent or impact of influence. This is an important distinction from a policy standpoint, because our results do not indicate that exogenous changes to relationship-level covariates will yield a change in influence or influenceability corresponding to our estimates. For example, our findings do not indicate that exogenously inducing an individual to move to the same current town as a peer will lead to a sixfold increase in influence over that peer. Our findings are robust to the inclusion of multiple different factors that control for heterogeneity in network degree, Facebook profile completeness, Facebook activity levels, nonlinearity in the response to influence-mediating messages, and stratification over the number of influence-mediating messages sent by application users. (See the online technical appendix for more details on the tests of model robustness.)

8 Using the exact estimate of the influence associated with common friends of 1.0063 (taken from Table 2), the example of 10 friends in common leads to an increase in influence of (1.0063)^10 = 1.065, or a 6.5% increase in influence.

5.3.2. Social Embeddedness and Tie Strength as Predictors of Spontaneous Adoption. Results on the correlation between spontaneous (preference) adoption and tie characteristics are displayed in Figure 3.

Tie characteristics (tie strength and embeddedness) associated with spontaneous adoption of the product by peers of existing adopters indicate preference similarity between peers, that is, the extent to which the measure captures similarity in the (latent) preference to adopt the product when a friend has already adopted. Some tie strength measures that seem to relate to common social contexts are good predictors of preference similarity, whereas others are not. Sharing common affiliations or attending the same college as an adopter of the product is not significantly associated with a tendency to spontaneously adopt. However, each photo that a peer shares in common with an adopter is associated with a 1.4% increase in the hazard to adopt spontaneously (p < 0.001). Coming from the same hometown as a peer who has adopted the product is associated with a 105% increase in the hazard to adopt spontaneously (p < 0.05). This indicates that hometown may be a good (latent) proxy for individual preferences for the product. One explanation for this pattern in the results could be that current friends influence us more, but that our preference-driven behaviors are more correlated with past, nonrecent social contexts. In other words, we are more influenced by friends in the same current town, but our preferences are more correlated with friends from the hometown in which we grew up and with friends that currently do not live in the same town.

Figure 3 Spontaneous (Preference) Adoption Associated with Embeddedness and Tie Strength

[Forest plot not reproduced.]

Notes. The effects of embeddedness and tie strength measures on spontaneous (preference) adoption are shown. The figure displays hazard ratios (HRs) representing the percentage increase (HR > 1) or decrease (HR < 1) in spontaneous adoption hazards associated with each tie attribute. Boxes are standard errors. Whiskers are 95% confidence intervals.

Each additional fan page that a peer shares in common with an adopter of the product is associated with a 0.7% increase in the hazard to adopt spontaneously (p < 0.001). This indicates that declared preferences and interests (not directly related to the product) also capture preferences for the product. Each online group that a peer participates in with an adopter of the product is associated with a 3.4% increase in the hazard to spontaneously adopt (p < 0.001). This indicates that online social activities capture latent dimensions of preference for the product. We note that the above interpretation of spontaneous adoption estimates is subject to the caveat that influence may exist through offline channels unrelated to the explicit channel of automated notifications that we focus on here. Prior research suggests such effects are small, however, because active offline channels require more effort than automated (passive) messaging and are therefore used much less often (Aral and Walker 2011a). In addition, researchers should use caution in interpreting how tie attributes moderate influence, because some unobserved individual attributes may be correlated with the tendency to form particular types of ties. For example, the average number of common photos that an individual appears in with their peers may be correlated with extroversion.

6. Robustness

Though our estimates are based on a large-sample in vivo randomized experiment, which provides a certain degree of assurance against omitted variables and questions about causal interpretations of our results, we conducted multiple tests of the robustness of our parameter estimates to ensure the reliability of our findings.

To assess possible multicollinearity among model covariates, variance inflation factors were calculated for all time-independent covariates in the model. Results are presented in Table A4 in the online technical appendix. All variance inflation factors are well below the conventionally accepted threshold (VIF < 5), indicating that multicollinearity is not an issue in our model and does not significantly impact our findings.
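For readers unfamiliar with the diagnostic, the sketch below shows one standard way to compute variance inflation factors: the VIF of covariate j equals the j-th diagonal entry of the inverse of the covariates' correlation matrix, which is the same as 1 / (1 - R_j^2) from regressing covariate j on the others. It is a generic pure-Python illustration, not the procedure behind Table A4; the covariate names and values are made up.

```python
from math import sqrt
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length, non-constant columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(m: List[List[float]]) -> List[List[float]]:
    """Gauss-Jordan inverse of a small square matrix, with partial pivoting."""
    n = len(m)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(columns: List[List[float]]) -> List[float]:
    """VIF of each column: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inv = invert(corr)
    return [inv[j][j] for j in range(len(columns))]


# Hypothetical covariate columns (one value per user-peer pair).
common_friends = [3.0, 10.0, 0.0, 25.0, 7.0, 12.0]
common_pages = [1.0, 4.0, 0.0, 9.0, 2.0, 6.0]
photos_together = [0.0, 2.0, 1.0, 0.0, 5.0, 1.0]

toy_vifs = vifs([common_friends, common_pages, photos_together])
flagged = [v >= 5.0 for v in toy_vifs]  # True where the VIF < 5 rule of thumb fails
```

A VIF of 1 means a covariate is uncorrelated with the others; larger values indicate that its coefficient's variance is inflated by collinearity.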

To assess the extent to which our estimates of the moderating impact of tie strength on influence depend upon embeddedness, the model was estimated with our operational definition of embeddedness (the number of common friends) omitted from the model. The results, presented in Table A5 in the online technical appendix, indicate that our findings on the moderating impact of tie strength on influence generally hold when embeddedness (number of common friends) is excluded from the model.

To assess whether our main findings were affected by variation in the network degree of the adopting user i, we estimated two models, one with an additional control for network degree, shown in Table A6a in the online technical appendix, and one with both the standalone degree variable (network degree of i) and the interaction term (number of notifications × network degree of i), shown in Table A6b in the online technical appendix. In the first model, we find that the network degree of user i is not significant and does not significantly alter the remaining estimates, indicating that our findings are robust to variation in network degree of the adopting user.

In the second model, we find that the standalone control for network degree of user i is only marginally significant, whereas the interaction control for network degree of user i and the number of notifications is significant but similar in magnitude and opposite in sign to the standalone control term, indicating that the impacts of the two are both small and tend to cancel one another in cases when influence is present. Importantly, the inclusion of both standalone and interaction terms with network degree does not significantly alter the remaining estimates, indicating that our findings are robust to variation in the network degree of the adopting user.

Variation in profile completeness within Facebook can be extensive in terms of the number of photos, groups, pages, and affiliations indicated by individuals and their peers. To establish that our results on the moderating impact of tie strength represented by common photos, groups, pages, and affiliations are robust to controls for profile completeness, we estimated two models, one with standalone profile completeness controls, shown in Table A7a in the online technical appendix, and one with both standalone (profile completeness term) and interaction (number of notifications × profile completeness) controls, shown in Table A7b in the online technical appendix. We find that our main findings are generally robust to inclusion of controls for both standalone and interacted profile completeness for individuals and their peers.

The message target randomization scheme we employed delivers automated notifications from an individual to randomly selected subsets of their peers. Since the generation of notifications is contingent on user activity within the application (according to limitations placed on Facebook application messaging to limit spam) and because this activity may be correlated with other (potentially latent) attributes, we assess the robustness of our findings by estimating a model with a control for the number of messages sent in Table A8 in the online technical appendix.

Our findings are generally robust to the inclusion of an additional control for the number of notifications sent. The reduction in the primary influence effect (number of notifications received) is expected. It is a consequence of the parceling out of the main influence effect across the actual number of notifications received by a peer and the average probability to receive a notification for a peer j of a user i, as embodied by the number of notifications sent by user i (the two are related through a constant of proportionality equal to the inverse of the degree of user i).

However, because the control for number of notifications sent is significant in the above estimation, we performed two additional robustness checks by estimating stratified models with stratification over terms related to the number of notifications sent. These specifications test whether our findings are a result of correlation between the number of notifications sent and some latent variable (X) that itself is both correlated with either interest in the application or a tendency to spontaneously adopt in the absence of influence and is homophilously distributed among application users and their peers.

A model stratified by whether the average number of notifications sent by a user is greater than average (nns_greater_than_average) with a control for the number of notifications sent is displayed in Table A9a in the online technical appendix. The model shows that influence and the modulation of influence by tie attributes remain strong and highly significant, and our estimations do not change substantially relative to our original model, indicating little benefit to stratification. Moreover, the number of notifications sent is no longer significant after stratification, indicating that the residual impact of the number of notifications sent cannot be attributed to a latent variable correlated with an interest or tendency to adopt spontaneously in the absence of influence.

The conclusion is similarly supported by estimation of a second model stratified by the number of notifications sent, displayed in Table A9b in the online technical appendix. This second model shows that influence and the modulation of influence by tie attributes remain strong and highly significant, and our estimations do not change substantially relative to our original model, indicating little benefit to stratification. Both stratified models have a substantially worse fit to the empirical data (in terms of measures of likelihood and concordance), are less parsimonious, allow for a large amount of unspecified heterogeneity (arising from the nonspecification of the large number of baseline hazards for each stratum), and hinder interpretation of covariate estimates, relative to the main model presented in this paper. We therefore retain these models as secondary robustness checks.

The moderating impact of tie strength and embeddedness (the primary focus of this work) is best assessed through the main model presented in the results section of this paper. In this model, the impact of receiving one or more notifications is modeled with a response term (in the proportional hazard model) that is linear in the number of notifications received. However, in theory we expect the marginal response of a peer receiving an additional notification to decay nonlinearly as the number of notifications increases beyond a threshold.

Martingale residuals, which assess the extent to which the response to the number of notifications received departs from linearity, are displayed in Figure A3 in the online technical appendix. The martingale residuals plot indicates a relatively linear response (approximately zero slope), with a minor departure from linearity that begins for peers receiving four or more notifications. Peers receiving four or more notifications are relatively infrequent in our data (only 40 out of 1.3 million peers received five or more notifications; fewer than 300 peers received four or more notifications; fewer than 1,000 peers received more than two notifications). Nonetheless, we assess the departure from linearity of the main influence effect by estimating our original model with a term that is quadratic in the number of notifications received in Table A10 in the online technical appendix.

We do find significance for the quadratic term in number of notifications received, consistent with the reasoning outlined above. It should be noted that inclusion of a term that is quadratic in number of notifications received makes interpretation of the estimates and confidence intervals of moderating tie strength and embeddedness factors less clear: quadratic interaction terms must also be included, and their estimates combined, to correctly assess moderating factor effect sizes, significance, and confidence intervals. This makes interpretation and comparison of the impact of embeddedness and different tie strength measures difficult. In this sense our original model is more suitable for assessing the moderating impact of tie strength and embeddedness on influence.
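To make the interpretive difficulty concrete, consider a stylized log hazard with a single tie attribute $R_{ij}$. The coefficient symbols $\beta_1, \beta_2, \gamma_1, \gamma_2$ and the omission of all other covariates are assumptions made for illustration; they are not the notation of the estimated model.

```latex
% Stylized log relative hazard with linear and quadratic notification terms
\log\frac{\lambda(t)}{\lambda_0(t)}
  = \beta_1 N_j + \beta_2 N_j^2
  + \gamma_1 N_j R_{ij} + \gamma_2 N_j^2 R_{ij} + \cdots

% Multiplicative moderation of influence by a one-unit increase in R_ij,
% for a peer who received N_j notifications
M(N_j) = \exp\!\bigl(\gamma_1 N_j + \gamma_2 N_j^2\bigr)

% Variance of the combined log effect, needed for a confidence interval
\operatorname{Var}\!\bigl(\hat\gamma_1 N_j + \hat\gamma_2 N_j^2\bigr)
  = N_j^2 \operatorname{Var}(\hat\gamma_1)
  + N_j^4 \operatorname{Var}(\hat\gamma_2)
  + 2 N_j^3 \operatorname{Cov}(\hat\gamma_1, \hat\gamma_2)
```

With $\gamma_2 = 0$ the moderation reduces to $e^{\gamma_1}$ per notification, a single hazard ratio with its own standard error. Once the quadratic interaction is included, the effect varies with $N_j$ and depends on two estimates and their covariance.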

Finally, we minimize the effects of interference in our experiment by recruiting subjects from a large sparse graph structure, implementing a recruitment campaign that minimizes the connections between recruited subjects, analyzing only local peer effects from a recruited user to their peer, and right-censoring observations with multiple treated peers to parameterize our ignorance of the effects of multiple treated peers. There are multiple approaches to interference in networked experiments. We discuss the strengths and weaknesses of these approaches in the next section. Our approach, which represents a design approach to minimizing interference, maintains a parsimonious view of peer effects and reduces the number of assumptions we have to make to perform inference on the estimands we care about. One consequence of such an approach is that although our estimates robustly generalize to local peer effects (estimates of the effect of influence on direct peers), they may not generalize to describe the full complexity of the dynamic spread of social influence in the network (e.g., including direct and first-, second-, third-, and higher-order indirect influence effects cascading throughout the network).

Although our experimental design strategy allows for causal identification of the moderating impact of tie strength and embeddedness on peer influence, minimizes the effects of interference, and circumvents endogeneity in online communication patterns, it is not without its limitations. First, our results may have limited generalizability to other contexts, for example, to cases where there is a significant financial cost to adopting a product or service or for populations of inactive (or rarely active) users.⁹ Second, firm-mediated automated (passive) peer-to-peer messages, although advantageous in their ability to break endogenous online communication patterns, have been found to be less effective in inducing adoption on a per-message basis than active personalized messaging features and, when overused, may be regarded as spammy or may even be blocked (contingent on platform-specific options) on a per-user basis. For an in-depth comparison of passive and active messaging, see Aral and Walker (2011a). Third, some forms of endogeneity may still be present and uncontrolled for, such as offline word-of-mouth communications between peers. These other endogenous effects may impact the interpretation of our spontaneous adoption results. Though prior research suggests that offline word of mouth (that necessarily requires conscious effort) is significantly less likely to impact adoption in our specific context (Aral and Walker 2011a), future research into offline influence mechanisms and the comparison of online to offline influence is encouraged. Future work should also address additional contexts, where the correlation between social structural measures and influence may differ. In addition, focus on alternative social measures, such as structural equivalence and brokerage, may yield meaningful insights into the dynamics of social influence.

9 The spread of WOM by nonadopters or nonusers of a product may arise, for example, from preconceived notions concerning a product’s utility or by mimicking WOM heard from others. Speculative, mimicked, and spurious WOM is an interesting potential topic for future inquiry.

7. Discussion: The Future of Large-Scale Randomized Experiments in Networks

Although recent studies have powerfully demonstrated the scientific potential and empirical effectiveness of a new paradigm of business analytics based on large-scale digital experimentation, many challenges remain in perfecting the science of these dynamic, often networked experiments. First, choosing the sample frame and designing the sampling and/or recruitment strategy for networked experiments is essential to avoiding selection bias. Recent work has demonstrated that some samples based on network propagation methods, such as respondent-driven sampling, can create biased inference because they systematically miss certain types of edges and nodes (Airoldi et al. 2011). These studies also conclude, however, that comprehensive propagation sampling, which samples all edges and peers in each step of the snowball propagation (e.g., Aral et al. 2009), is not subject to these biases because it collects complete local network information. Furthermore, selection effects can create bias when recruited subjects do not represent the population of network nodes they are intended to represent. Recruitment campaigns must therefore be specifically designed to avoid recruitment selection bias (Aral and Walker 2011a). These sampling and recruitment choices serve as informative examples of the types of biases that can be created by well-known and widely used sampling methods. Researchers must therefore carefully attend to sampling choices in the design of networked experiments and the inference techniques they use to evaluate experimental evidence.

Second, the potential for interference, leakage, or contamination in networked experiments is well documented (Aral and Walker 2011a, Aronow and Samii 2011). Since treated nodes are connected to nontreated nodes through complex network structure, even randomly assigned treatments can violate the stable unit treatment value assumption (SUTVA) and interfere with one another to create systematic biases. As a starting point, the likelihood of interference and its effects on bias and variance can be empirically estimated (e.g., Muchnik et al. 2013). Several solutions to interference in networked experiments are also currently being developed along two primary dimensions: design strategies and inference strategies.

Design strategies attempt to set up network experiments so as to minimize the likelihood of interference. For example, as we do in this paper, prior work has recruited subjects from large sparse graph structures and analyzed only local peer effects from a recruited user to their peer to minimize the potential for interference across treatments in local network neighborhoods (e.g., Aral and Walker 2011a, 2012). More recently, clustered graph randomization strategies have been developed to reduce the likelihood of interference. Ugander et al. (2013), for example, propose graph clustering to analyze average treatment effects under social interference and develop an efficient exact algorithm to compute the probabilities for each node’s network exposure. Using these probabilities as inverse weights, Ugander et al. (2013) demonstrate that a Horvitz–Thompson estimator can provide an effect estimate that is unbiased, provided that the exposure model has been properly specified. They show that proper cluster randomization can lead to exponentially lower estimator variance when experimentally measuring average treatment effects under interference.
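As a rough illustration of the inverse-probability weighting idea described above, the sketch below computes a Horvitz–Thompson estimate of an average treatment effect from per-node outcomes, exposure indicators, and exposure probabilities. It is the generic textbook form of the estimator, not the algorithm of Ugander et al. (2013): the exposure probabilities are simply taken as given inputs, and all names and numbers are hypothetical.

```python
from typing import Sequence


def horvitz_thompson_ate(outcomes: Sequence[float],
                         exposed_treat: Sequence[bool],
                         exposed_ctrl: Sequence[bool],
                         p_treat: Sequence[float],
                         p_ctrl: Sequence[float]) -> float:
    """Inverse-probability-weighted difference in mean outcomes.

    exposed_treat[i] / exposed_ctrl[i]: whether node i ended up fully
    exposed to the treatment / control condition under the assumed
    exposure model. p_treat[i] / p_ctrl[i]: the design probabilities of
    those two events, which must be strictly positive.
    """
    n = len(outcomes)
    total = 0.0
    for y, t, c, pt, pc in zip(outcomes, exposed_treat, exposed_ctrl,
                               p_treat, p_ctrl):
        if t:
            total += y / pt   # weight up treated-exposed outcomes
        elif c:
            total -= y / pc   # weight up control-exposed outcomes
        # nodes in neither exposure condition contribute nothing
    return total / n


# Made-up example with four nodes; node 3 has mixed exposure and drops out.
estimate = horvitz_thompson_ate(
    outcomes=[3.0, 1.0, 2.0, 4.0],
    exposed_treat=[True, False, False, True],
    exposed_ctrl=[False, True, False, False],
    p_treat=[0.5, 0.5, 0.25, 0.5],
    p_ctrl=[0.5, 0.5, 0.25, 0.5],
)
```

The estimator is unbiased only when the supplied probabilities match the true randomization design and the exposure model is correct, which is the condition stated in the text.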

Inference strategies, on the other hand, attempt to correct interference bias at the point of inference (Aral and Walker 2011a, Aronow and Samii 2011, Middleton and Aronow 2011). One approach has been to right-censor contaminated nodes at the point in time after which, during a networked experiment, they are likely to be exposed to interference (e.g., Aral and Walker 2011a). This approach parameterizes the ignorance of the researcher with regard to the true exposure model by limiting inference to conservative effects that are likely to be protected from interference. Other inference strategies, however, begin by hypothesizing an exposure model, and then base inference techniques on estimands that are unbiased if the exposure model is accurately specified (e.g., Aronow and Samii 2011, Middleton and Aronow 2011). These techniques assume the form of interference is known and provide conservative estimators of the randomization variance of the average treatment effects given social interference, relying in part on the accuracy of the hypothesized exposure model.
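The right-censoring approach can be sketched as a small data-preparation step that turns each peer's history into a (duration, event) pair suitable for a survival model. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name, the argument names, and the rule that an adoption at exactly the cutoff time still counts as observed are all hypothetical choices.

```python
from typing import Optional, Tuple


def censor_observation(adoption_time: Optional[float],
                       interference_time: Optional[float],
                       study_end: float) -> Tuple[float, bool]:
    """Return (duration, event_observed) for one peer.

    adoption_time: when the peer adopted, or None if never observed.
    interference_time: when the peer first became exposed to possible
        interference (for example, a second treated friend), or None.
    study_end: end of the observation window.
    """
    # The peer is followed only until interference may begin or the study ends.
    if interference_time is None:
        cutoff = study_end
    else:
        cutoff = min(interference_time, study_end)

    if adoption_time is not None and adoption_time <= cutoff:
        return (adoption_time, True)   # clean adoption event
    return (cutoff, False)             # right-censored at the cutoff


# Made-up histories: (adoption_time, interference_time), 30-day window.
histories = [(5.0, 8.0), (12.0, 8.0), (None, None), (12.0, None)]
prepared = [censor_observation(a, i, study_end=30.0) for a, i in histories]
```

In the second history the adoption happens after possible interference begins, so it is discarded as an event and the peer contributes only eight days of clean follow-up.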

Third, power calculations for the design of networked experiments are critical and can often spell the difference between a waste of costly resources and the successful detection of meaningful signals in noisy environments and populations with a large degree of heterogeneity. Networked environments pose unique challenges to a priori estimation of statistical power across treatment groups, particularly in the case where treated population sizes may grow (through diffusion and as a consequence of treatment) and endogenous response to treatment may require censoring of subjects to reduce leakage effects or other kinds of interference that effectively limit statistical power. The lack of preexisting estimates of adoption rates and responses to treatment in the precise context being examined is compounded by networked environments that may amplify small discrepancies between rate or response parameter estimates and their true values in a nonlinear fashion. When possible, pilot studies provide the cleanest means of assessing statistical power associated with experimental designs. However, in many circumstances pilot studies are not feasible due to constraints on timing, resource costs, or limitations set in place by third-party affiliates. In such cases, observational data from the system being studied may be used to make reasonable assessments of context-specific parameter estimates for adoption, spreading, or treatment responses in general (for a good example, see Bapna and Umyarov 2012). In addition, future studies may wish to employ methods of adaptive or conditional treatment randomization that assign subjects to treatment groups conditional on the current state of treatment group sizes or average responses to treatment.
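When rough adoption-rate estimates are available from a pilot or from observational data, a first-pass power check can be made with the usual normal approximation for comparing two proportions. The sketch below deliberately ignores the network-specific complications discussed above (growing treated populations, censoring, interference), so it should be read only as a rough starting point. The adoption rates and sample sizes are made up.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p_control: float, p_treated: float,
                         n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test for a difference in proportions.

    Uses the unpooled standard error and assumes independent subjects,
    which networked designs generally violate.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    se = sqrt(p_control * (1.0 - p_control) / n_per_group
              + p_treated * (1.0 - p_treated) / n_per_group)
    shift = abs(p_treated - p_control) / se
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical adoption rates of 1.0% (control) and 1.2% (treated).
power_small = two_proportion_power(0.010, 0.012, n_per_group=10_000)
power_large = two_proportion_power(0.010, 0.012, n_per_group=100_000)
```

With low base rates such as these, small absolute differences in adoption require very large groups to detect, which is one reason the population-scale settings discussed in the text matter.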

Finally, it is important to precisely consider the mechanism of randomization and its effects on the biases we have described, in particular its implications for SUTVA, interference, sampling, and power. The mechanism of treatment randomization in networked experiments can take many forms. For example, randomized treatments can represent changes to the available channels of influence-mediating messages (e.g., Aral and Walker 2011a, 2012; Bakshy et al. 2012b; Taylor et al. 2013), changes to network structure (e.g., Centola 2010, Kearns et al. 2006, Mason and Watts 2012, Rand and Nowak 2011, Suri and Watts 2011), encouragement of behaviors in particular nodes (e.g., Bapna and Umyarov 2012), manipulations of available social cues (e.g., Bakshy et al. 2012a, Bond et al. 2012, Tucker 2011), manipulations of incentives for networked propagation (e.g., Aral and Taylor 2014), and randomization of population-level social signals (e.g., Muchnik et al. 2013, Salganik et al. 2006). Each of these mechanisms of randomization has different implications for inference, the characteristics of resultant samples, the likelihood of interference, the generalizability of parameter estimates, and, more broadly, the conclusions that can be drawn from experimental results. Rather than delineate the effects of each mechanism, here we simply stress the need for networked experimental research to systematically consider the mechanism of randomization in the design and analysis of networked experiments and leave for future work the detailed assessment of the implications of each strategy.

The methods used in this study can be generalized to a wide variety of contexts to further our understanding of social contagions and better inform data-driven decisions in several policy domains including marketing, public health, and politics. In general, future work must be diligent in addressing the unique challenges created by large-scale networked experimentation. The complexity of the experimental environments in which social science is now possible creates great opportunities for social and economic research. But these opportunities can only be truly realized if the challenges we have outlined are appropriately addressed.

8. Conclusion

The availability of microlevel data at population scale has been recognized as a crucial opportunity in the advancement of modern business analytics. At the same time, microlevel experimentation in large networked environments is a new frontier for analytics that has the potential to circumvent problematic issues of causal identification and endogeneity that have hindered our understanding of the detailed social influence processes involved in the propagation of behaviors and economic outcomes in society. Advancement of the science of social influence is vital to both marketing strategy and public policy where firms or governments seek to leverage social influence to encourage the spread of products or promote positive behaviors while curtailing negative ones. Research on social influence has predominantly focused on whether influence plays a role in the diffusion of a product or behavior and the relative size of the effect. However, recently the focus has shifted to examining when and under what individual, social, and structural conditions influence is stronger or weaker. This latter focus, which we adopt in this study, is important for policy because it can reveal which interventions (e.g., viral incentives, social interventions, targeting or network-based marketing) work best for which relationships.

We conducted a large-scale randomized experiment to identify the impact of tie strength characteristics and social embeddedness on influence. This work presents some of the first large-scale experimental evidence investigating the social and structural moderators of peer influence in networks. Results from our study shed light on the role that relationship and social structural characteristics play in influence-based propagation. We found that relationship characteristics, which capture joint participation in institutional or common social contexts, were associated with the greatest influence. For example, individuals exerted a more than 13-fold increase in influence over peers with whom they attended the same college. Some measures of the recency of social context in the relationships between individuals and their peers were associated with increased influence, whereas others were not. For example, individuals exerted a more than sixfold increase in influence over peers currently living in the same town, but did not exert more influence over peers with whom they coappeared in online photos. Interestingly, measures of tie strength based on common interests were not associated with influence, though they were good predictors of preference similarity in the adoption of the product we studied. Finally, individuals exhibited greater influence on peers with whom they shared embedded relationships. This latter effect was both subtle and economically significant, highlighting the importance of large-scale randomized experiments in detecting nuanced effects in the face of endogeneity, bias, and confounding factors.

Acknowledgments

The authors are grateful to seminar participants at the National Bureau of Economic Research (NBER), the Workshop on Information in Networks (WIN), the Workshop on Information Systems Economics (WISE), the International Conference on Information Systems (ICIS), the Winter Conference on Business Intelligence, Facebook, New York University, Massachusetts Institute of Technology, and the University of Minnesota for valuable comments. Sinan Aral gratefully acknowledges generous funding and support from the National Science Foundation [Career Award 0953832].

References

Airoldi E, Bai X, Carley K (2011) Network sampling and classification: An investigation of network model representations. Decision Support Systems 51(3):506–518.

Aral S (2011) Commentary—Identifying social influence: A comment on opinion leadership and social contagion in new product diffusion. Marketing Sci. 30(2):217–223.

Aral S, Taylor S (2014) Viral incentive systems: A randomized field experiment. Working paper, Stern School of Business, New York University, New York.

Aral S, Van Alstyne MW (2011) The diversity-bandwidth tradeoff. Amer. J. Sociology 117(1):90–171.

Aral S, Walker D (2011a) Creating social contagion through viral product design: A randomized trial of peer influence in networks. Management Sci. 57(9):1623–1639.

Aral S, Walker D (2011b) Identifying social influence in networks using randomized experiments. IEEE Intelligent Systems 26(5):91–96.

Aral S, Walker D (2012) Identifying influential and susceptible members of social networks. Science 337(6092):337–341.

Aral S, Muchnik L, Sundararajan A (2009) Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proc. Natl. Acad. Sci. USA 106(51):21544–21549.

Aral S, Muchnik L, Sundararajan A (2013) Engineering social contagions: Optimal network seeding and incentive strategies. Network Sci. 1(2):1–29.

Arndt J (1967) Role of product-related conversations in the diffusion of a new product. J. Marketing Res. 4(3):291–295.

Aronow PM, Samii C (2011) Estimating average causal effects under general interference. Working paper, Yale University, New Haven, CT.

Bakshy E, Eckles D, Yan R, Rosenn I (2012a) Social influence in social advertising: Evidence from field experiments. Proc. 13th ACM Conf. Electronic Commerce (ACM, New York), 146–161.

Bakshy E, Rosenn I, Marlow C, Adamic L (2012b) The role of social networks in information diffusion. Proc. 21st Internat. Conf. World Wide Web (ACM, New York), 519–528.

Banerjee AV (1992) A simple model of herd behavior. Quart. J. Econom. 107(3):797–817.

Bapna R, Umyarov A (2012) Are paid subscriptions on music social networks contagious? A randomized field experiment. Working paper, University of Minnesota, Minneapolis.

Bapna R, Gupta A, Rice S, Sundararajan A (2011) Trust, reciprocity and the strength of social ties: An online social network based field experiment. Working paper, University of Minnesota, Minneapolis.

Bapna R, Ramaprasad J, Shmueli G, Umyarov A (2012) One-way mirrors and weak-signaling in online dating: A randomized field experiment. Working paper, University of Minnesota, Minneapolis.

Berger JA, Milkman KL (2009) Social transmission, emotion, and the virality of online content. Working paper, University of Pennsylvania, Philadelphia.

Bikhchandani S, Hirshleifer D, Welch I (1998) Learning from the behavior of others: Conformity, fads, and informational cascades. J. Econom. Perspectives 12(3):151–170.

Bond RM, Fariss CJ, Jones JJ, Kramer ADI, Marlow C, Settle JE, Fowler JH (2012) A 61-million-person experiment in social influence and political mobilization. Nature 489(7415):295–298.

Boudreau KJ, Lakhani KR (2011) High incentives, sorting on skills—Or just a “taste” for competition? Field experimental evidence from an algorithm design contest. Working paper, London Business School, London.

Brown JJ, Reingen PH (1987) Social ties and word-of-mouth referral behavior. J. Consumer Res. 14(3):350–362.

Burt RS (2005) Brokerage and Closure: An Introduction to Social Capital (Oxford University Press, New York).

Centola D (2010) The spread of behavior in an online social network experiment. Science 329(5996):1194–1197.

Centola D (2011) An experimental study of homophily in the adoption of health behavior. Science 334(6060):1269–1272.

Coleman J (1988) Free riders and zealots: The role of social networks. Sociol. Theory 6(1):52–57.

Coleman J, Katz E, Menzel H (1957) The diffusion of an innovation among physicians. Sociometry 20(4):253–270.

De Bruyn A, Lilien G (2008) A multi-stage model of word-of-mouth influence through viral marketing. Internat. J. Res. Marketing 25(3):151–163.

Eagle N, Macy M, Claxton R (2010) Network diversity and economic development. Science 328(5981):1029–1031.

Easley D, Kleinberg J (2010) Networks, Crowds, and Markets (Cambridge University Press, Cambridge, UK).

Eckles D, Bakshy E (2014) Bias in observational studies of social contagion. Working paper, Data Science Team, Facebook, Menlo Park, CA.

Engel JF, Kegerreis RJ, Blackwell RD (1969) Word-of-mouth communication by the innovator. J. Marketing 33(3):15–19.

Fowler JH, Christakis NA (2010) Cooperative behavior cascades in human social networks. Proc. Natl. Acad. Sci. USA 107(12):5334–5338.

Frenzen JK, Davis HL (1990) Purchasing behavior in embedded markets. J. Consumer Res. 17(1):1–12.

Gilly MC, Graham JL, Wolfinbarger MF, Yale LJ (1998) A dyadic study of interpersonal information search. J. Acad. Marketing Sci. 26(2):83–100.

Gladwell M (2002) The Tipping Point: How Little Things Can Make a Big Difference (Back Bay Books, New York).

Godes D, Mayzlin D (2004) Using online conversations to study word-of-mouth communication. Marketing Sci. 23(4):545–560.

Godes D, Mayzlin D, Chen Y, Das S, Dellarocas C, Pfeiffer B, Libai B, Sen S, Shi M, Verlegh P (2005) The firm’s management of social interactions. Marketing Lett. 16(3/4):415–428.

Golder S, Macy M (2011) Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures. Science 333(6051):1878–1881.

Granovetter M (1973) The strength of weak ties. Amer. J. Sociology 78(6):1360–1380.

Granovetter M (1978) Threshold models of collective behavior. Amer. J. Sociology 83(6):1420–1443.

Granovetter M (1983) The strength of weak ties: A network theory revisited. Sociol. Theory 1:201–233.

Granovetter M (1985) Economic action and social structure: The problem of embeddedness. Amer. J. Sociology 91(3):481–510.

Hinz O, Skiera B, Barrot C, Becker JU (2011) Seeding strategies for viral marketing: An empirical comparison. J. Marketing 75(6):55–71.

Iyengar R, Van den Bulte C, Valente TW (2011) Opinion leadership and social contagion in new product diffusion. Marketing Sci. 30(2):195–212.

Katz E, Lazarsfeld P (1955) Personal Influence: The Part Played by People in the Flow of Mass Communications (Transaction Publishers, Piscataway, NJ).

Kearns M, Suri S, Montfort N (2006) An experimental study of the coloring problem on human subject networks. Science 313(5788):824–827.

Leider S, Möbius MM, Rosenblat T, Do Q-A (2009) Directed altruism and enforced reciprocity in social networks. Quart. J. Econom. 124(4):1815–1851.

Manchanda P, Xie Y, Youn N (2008) The role of targeted communication and contagion in product adoption. Marketing Sci. 27(6):961–976.

Manski CF (1993) Identification problems in the social sciences. Sociol. Methodology 23:1–56.

Mason W, Watts DJ (2012) Collaborative learning in networks. Proc. Natl. Acad. Sci. USA 109(3):764–769.

Mayer-Schonberger V, Cukier K (2013) Big Data: A Revolution That Will Transform How We Live, Work, and Think, 1st ed. (Eamon Dolan/Houghton Mifflin Harcourt, Boston).

Middleton JA, Aronow PM (2011) Unbiased estimation of the average treatment effect in cluster-randomized experiments. Working paper, New York University, New York.

Muchnik L, Aral S, Taylor S (2013) Social influence bias: A randomized experiment. Science 341(6146):647–651.

Nam S, Manchanda P, Chintagunta PK (2010) The effect of signal quality and contiguous word of mouth on customer acquisition for a video-on-demand service. Marketing Sci. 29(4):690–700.

Rand DG, Nowak MA (2011) The evolution of antisocial punishment in optional public goods games. Nature Comm. 2:Article 434.

Reagans R, McEvily B (2003) Network structure and knowledge transfer: The effects of cohesion and range. Admin. Sci. Quart. 48(2):240–267.

Rogers EM (2003) Diffusion of Innovations, 5th ed. (Free Press, New York).

Salganik MJ, Dodds PS, Watts DJ (2006) Experimental study of inequality and unpredictability in an artificial cultural market. Science 311(5762):854–856.

Shalizi CR, Thomas AC (2011) Homophily and contagion are generically confounded in observational social network studies. Sociol. Methods Res. 40(2):211–239.

Sundararajan A, Provost F, Oestreicher-Singer G, Aral S (2013) Information in digital, economic, and social networks. Inform. Systems Res. 24(4):883–905.

Suri S, Watts DJ (2011) Cooperation and contagion in web-based, networked public goods experiments. PLoS One 6(3):e16836.

Taylor S, Bakshy E, Aral S (2013) Selection effects in online sharing: Consequences for peer adoption. 14th ACM Conf. Electronic Commerce, Philadelphia.

Tucker C (2008) Identifying formal and informal influence in technology adoption with network externalities. Management Sci. 54(12):2024–2038.

Tucker C (2011) Social advertising. Working paper, MIT Sloan School of Management, Cambridge, MA.

Tucker C, Zhang J (2010) Growing two-sided networks by advertising the user base: A field experiment. Marketing Sci. 29(5):805–814.

Tucker C, Zhang J (2011) How does popularity information affect choices? A field experiment. Management Sci. 57(5):828–842.

Ugander J, Backstrom L, Marlow C, Kleinberg J (2012) Structural diversity in social contagion. Proc. Natl. Acad. Sci. USA 109(16):5962–5966.

Ugander J, Karrer B, Backstrom L, Kleinberg J (2013) Graph cluster randomization: Network exposure to multiple universes. 19th ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining (KDD) (ACM, New York), 329–337.

Uzzi B (1996) The sources and consequences of embeddedness for the economic performance of organizations: The network effect. Amer. Sociol. Rev. 61(4):674–698.

Uzzi B (1997) Social structure and competition in interfirm networks: The paradox of embeddedness. Admin. Sci. Quart. 42(1):35–67.

Valente T (1996) Social network thresholds in the diffusion of innovations. Soc. Networks 18(1):69–89.

Valente TW (1995) Network Models of the Diffusion of Innovations (Quantitative Methods in Communication Subseries) (Hampton Press, NJ).

Van den Bulte C, Joshi YV (2007) New product diffusion with influentials and imitators. Marketing Sci. 26(3):400–421.

Van den Bulte C, Lilien G (2001) Medical innovation revisited: Social contagion versus marketing effort. Amer. J. Sociology 106(5):1409–1435.

Watts DJ, Dodds PS (2007) Influentials, networks, and public opinion formation. J. Consumer Res. 34(4):441–458.

Zhang J (2010) The sound of silence: Observational learning in the U.S. kidney market. Marketing Sci. 29(2):315–335.
