Large-Scale Network Dynamics: A New Frontier


Large-Scale Network Dynamics: A New Frontier. Jie Wang, Dept. of Computer Science, University of Massachusetts Lowell. Presented at Dept. of Computer Science, Boston University, Nov. 6, 2009, and at Dept. of Computer Science, University of Texas at Dallas, Oct. 30, 2009.

  • Large-Scale Network Dynamics: A New Frontier

    Jie Wang, Dept. of Computer Science, University of Massachusetts Lowell

    Presented at Dept. of Computer Science, Boston University, Nov. 6, 2009; at Dept. of Computer Science, University of Texas at Dallas, Oct. 30, 2009; and at Dept. of Electrical and Computer Engineering, Michigan State Univ., Sept. 24, 2009

  • The earth to be spann'd, connected by network,
    The races, neighbors, to marry and be given in marriage,
    The oceans to be cross'd, the distant brought near,
    The lands to be welded together.

    Walt Whitman (1819-1892), Passage to India

    "The network is the computer." John Gage (1942- ), Sun Microsystems

    "The network is the information and the storage." Weibo Gong, UMass Amherst

  • Small-World Phenomenon

    Two persons are linked if they are coauthors of an article. The Erdős number is the collaboration distance to the mathematician Paul Erdős.

    Six degrees of separation: what is your Erdős number?

    Erdős number 0: 1 person
    Erdős number 1: 504 people
    Erdős number 2: 6593 people
    Erdős number 3: 33605 people
    Erdős number 4: 83642 people
    Erdős number 5: 87760 people
    Erdős number 6: 40014 people
    Erdős number 7: 11591 people
    Erdős number 8: 3146 people
    Erdős number 9: 819 people
    Erdős number 10: 244 people
    Erdős number 11: 68 people
    Erdős number 12: 23 people
    Erdős number 13: 5 people

    The median Erdős number is 5; the mean is 4.65, and the standard deviation is 1.21.
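The summary statistics quoted on this slide can be rechecked directly from the listed counts. The sketch below is only an illustration: the function name `distribution_stats` is hypothetical, and it assumes the standard deviation is the population (not sample) form.

```python
import math

# Number of people at each Erdős number, copied from the slide.
counts = {0: 1, 1: 504, 2: 6593, 3: 33605, 4: 83642, 5: 87760, 6: 40014,
          7: 11591, 8: 3146, 9: 819, 10: 244, 11: 68, 12: 23, 13: 5}


def distribution_stats(counts):
    """Return (total, mean, median, population std) of a value -> count table."""
    total = sum(counts.values())
    mean = sum(k * n for k, n in counts.items()) / total
    variance = sum(n * (k - mean) ** 2 for k, n in counts.items()) / total
    # Median: smallest value whose cumulative count reaches half the population.
    cumulative = 0
    median = None
    for k in sorted(counts):
        cumulative += counts[k]
        if 2 * cumulative >= total:
            median = k
            break
    return total, mean, median, math.sqrt(variance)


total, mean, median, std = distribution_stats(counts)
print(total, round(mean, 2), median, round(std, 2))  # prints: 268015 4.65 5 1.21
```

The recomputed values agree with the slide: a median of 5, a mean of about 4.65, and a standard deviation of about 1.21 over 268,015 people.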

  • Small-World Networks

    The Watts-Strogatz model: between order and randomness

    - Short mean path length (short characteristic path length)
    - Large clustering coefficient
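The two properties named on this slide can be observed in a small simulation. The following is a minimal sketch of the usual ring-lattice-with-rewiring construction, not the presenter's code; the function names, the rewiring probability name `beta`, and the sizes (200 nodes, 4 neighbours each) are illustrative assumptions.

```python
import random
from collections import deque


def watts_strogatz(n, k, beta, rng):
    """Ring of n nodes, each joined to its k nearest neighbours (k even);
    every lattice edge is then rewired with probability beta."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, k // 2 + 1):
            adj[v].add((v + j) % n)
            adj[(v + j) % n].add(v)
    for v in range(n):
        for j in range(1, k // 2 + 1):
            u = (v + j) % n
            if u in adj[v] and rng.random() < beta:
                # New endpoint: any node that is not v and not already a neighbour.
                choices = [w for w in range(n) if w != v and w not in adj[v]]
                if choices:
                    w = rng.choice(choices)
                    adj[v].discard(u)
                    adj[u].discard(v)
                    adj[v].add(w)
                    adj[w].add(v)
    return adj


def clustering(adj):
    """Mean over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        nb = list(nbrs)
        d = len(nb)
        if d < 2:
            continue
        links = sum(1 for i in range(d) for j in range(i + 1, d)
                    if nb[j] in adj[nb[i]])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def mean_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


rng = random.Random(1)
ring = watts_strogatz(200, 4, 0.0, rng)          # pure order
small_world = watts_strogatz(200, 4, 0.1, rng)   # a few random shortcuts
for name, g in (("ring lattice", ring), ("rewired, beta=0.1", small_world)):
    print(name, round(clustering(g), 3), round(mean_path_length(g), 2))
```

With no rewiring the lattice has clustering 0.5 and a long mean path; a small rewiring probability is typically enough to shorten the mean path sharply while clustering stays comparatively high.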

  • What Are Big-World Networks?

    Acquaintance networks over generations (from the Mathematics Genealogy Project):

    Gottfried Leibniz (1646-1716)
    Jacob Bernoulli (1654-1705)
    Johann Bernoulli (1667-1748)
    Leonhard Euler (1707-1783)
    Joseph Lagrange (1736-1813)
    Simeon Poisson (1781-1840)
    Michel Chasles (1793-1880)
    H. A. Newton (1830-1896)
    E. H. Moore (1862-1932)
    Oswald Veblen (1880-1960)
    Alonzo Church (1903-1995)
    John B. Rosser (1907-1989)
    Gerald Sacks (1933- ), 343 academic descendants
    Stephen Homer
    Jie Wang

  • Scale-Free Phenomenon

    Power-law distribution: f(x) ~ x^(-α)

    Log-log scale: log f(x) ~ -α log x

    Scale-free networks are small-world

    Small-world networks may not be scale-free

    Subnets of scale-free networks may not be scale-free
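On a log-log scale a power law becomes a straight line, so its exponent can be read off as the negated slope. The sketch below illustrates this with synthetic, noise-free data; the function name `fit_power_law`, the symbol `alpha` for the exponent, and the sample values (exponent 2.5, constant 3.0) are assumptions chosen for illustration only.

```python
import math


def fit_power_law(xs, fs):
    """Least-squares line through (log x, log f).

    Returns (alpha, c) such that f is approximately c * x ** (-alpha).
    """
    lx = [math.log(x) for x in xs]
    lf = [math.log(f) for f in fs]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_f = sum(lf) / n
    slope = (sum((a - mean_x) * (b - mean_f) for a, b in zip(lx, lf))
             / sum((a - mean_x) ** 2 for a in lx))
    intercept = mean_f - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic data that follows an exact power law.
xs = list(range(1, 101))
fs = [3.0 * x ** -2.5 for x in xs]
alpha, c = fit_power_law(xs, fs)
print(round(alpha, 6), round(c, 6))
```

For measured degree distributions a straight-line fit of this kind can be noticeably biased by the noisy tail, so it is better treated as a quick visual check than as a rigorous estimate.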

  • Brain Networks

    A mental state M is nothing other than brain state B. The mental state "desire for a cup of coffee" would thus be nothing more than the "firing of certain neurons in certain brain regions." -- E. G. Boring (1886-1968)

  • Are Brain Networks Small-World?

    Brain networks are highly dynamic

    The brain can process 100 trillion instructions per second

    Some believe brain networks are small-world

    Mathematical challenge: work out a mathematical model consistent with brain functionalities

    There are 100 billion (10^11) neurons in the human brain, and 100 trillion (10^14) connections (synapses)

  • Connecting the Dots

    Networks are connected dots

    "You can't connect the dots looking forward; you can only connect them looking backwards."

    Steven Jobs (1955 -)

  • Infectious Disease Spreading: How Were Dots Connected?

    Weekly snapshots: Sept 05 - Sept 12, 2009; Sept 12 - Sept 19, 2009; Sept 19 - Sept 26, 2009; Sept 26 - Oct 03, 2009; Oct 03 - Oct 10, 2009; Oct 10 - Oct 17, 2009

  • How Will the Dots Be Connected?

    Dynamic connections are neither deterministic nor random, but they have patterns and trends.

    Statistical analysis is like connecting the dots backward, while predicting disease spread is like connecting the dots forward

  • A Simple Relational Model: The SIR Dynamics

    Structure-biased k-acquaintance model

    Homophily: the tendency to associate with people like yourself

    Symmetry: undirected links

    Triad closure: the tendency of one's acquaintances to also be acquainted with each other

    An 8-acquaintance node under SIR

  • Structure-Biased Spread

  • A Mathematical Model of Spread Prediction

  • Mathematical Epidemiology

    Most mathematical methods study differential equations based on simplified assumptions of uniform mixing or ad hoc contact processes

    Example:
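One standard uniform-mixing model is the classical SIR system, dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I. The sketch below integrates it with a plain forward-Euler step. The slide's own example is not preserved in this text, so the choice of model, the function name `simulate_sir`, and all parameter values are assumptions made for illustration.

```python
def simulate_sir(s0, i0, r0, beta, gamma, dt, steps):
    """Forward-Euler integration of the uniform-mixing SIR equations.

    beta  : transmission rate per unit time (assumed value)
    gamma : recovery rate per unit time (assumed value)
    Returns a list of (S, I, R) tuples, one per time step.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s, i, r = (s - new_infections,
                   i + new_infections - new_recoveries,
                   r + new_recoveries)
        history.append((s, i, r))
    return history


# beta / gamma = 3 > 1: the infection grows before it dies out.
outbreak = simulate_sir(999, 1, 0, beta=0.3, gamma=0.1, dt=0.1, steps=2000)
# beta / gamma = 0.5 < 1: the infection only declines.
fading = simulate_sir(999, 1, 0, beta=0.05, gamma=0.1, dt=0.1, steps=2000)

print("peak infected (outbreak):", round(max(i for _, i, _ in outbreak), 1))
print("peak infected (fading):  ", round(max(i for _, i, _ in fading), 1))
```

Every individual is assumed equally likely to meet every other, which is exactly the simplification the slide points out; contact structure such as households or long-distance links has no place in these equations.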

  • Percolation and Outbreak

    Large-scale graphs based on scale-free and small-world models are common platforms to study epidemics

    Individuals (sites) are connected by social contacts (bonds)

    Each site is susceptible with probability p and each bond is open with probability q, indicating infectiousness

    A percolation threshold exists for the phase transition of disease spread. When both p and q are high, a cluster of infectious sites connected by open bonds will permeate the entire population, resulting in an outbreak

    Otherwise, infectious clusters will be small and isolated

  • Percolation Threshold Demo

    65 x 65 grid; panels for q = 0.2, q = 0.51, and q = 0.578
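The threshold behaviour described above can be reproduced with a small site-bond percolation simulation on a 65 x 65 grid. This is a rough sketch rather than the demo shown in the talk: the function name `largest_cluster_fraction` is hypothetical, and because the site probability used in the demo is not stated, the usage loop assumes p = 1 (every site susceptible) and varies only q.

```python
import random


def largest_cluster_fraction(size, p, q, rng):
    """Site-bond percolation on a size x size grid.

    Each site is susceptible with probability p; each bond between two
    susceptible neighbours is open with probability q. Returns the size of
    the largest connected cluster as a fraction of all grid sites.
    """
    occupied = [[rng.random() < p for _ in range(size)] for _ in range(size)]
    parent = list(range(size * size))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for r in range(size):
        for c in range(size):
            if not occupied[r][c]:
                continue
            # Look right and down only, so each bond is sampled exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if (nr < size and nc < size and occupied[nr][nc]
                        and rng.random() < q):
                    union(r * size + c, nr * size + nc)

    sizes = {}
    for r in range(size):
        for c in range(size):
            if occupied[r][c]:
                root = find(r * size + c)
                sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0) / (size * size)


rng = random.Random(2009)
for q in (0.2, 0.51, 0.578):
    fraction = largest_cluster_fraction(65, 1.0, q, rng)
    print(f"q = {q}: largest cluster covers {fraction:.1%} of the grid")
```

For pure bond percolation on an infinite square lattice the threshold is q = 1/2, so q = 0.2 should give only small isolated clusters while the two larger values sit just above the transition; on a finite grid the change is gradual rather than sharp.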

  • Modeling Challenges

    Population and demographics
    - urban, suburban, rural, mobility
    - income, age, gender, education, religion, culture, ethnic background, household size

    Social contact pattern
    - household, work, study, shopping, entertainment, travel, medical activities
    - dense and frequent local contacts; sparse and occasional long-distance contacts

    Infection process
    - disease characteristics: infectious speed and recovery levels
    - people's general health level and vaccination history
    - frequency and duration of contacts

    B. Liu and J. Wang et al

    It seems difficult to address these challenges using mathematical methods alone

  • Computational Methods

    Simulations with contingent parameters

    Modeling disease outbreaks in realistic urban social networks (S. Eubank et al., Nature, 2004)

    Understanding the spreading patterns of mobile phone viruses (P. Wang et al., Science, 2009)
    - Susceptible Bluetooth (BT) phones within the range of an infected BT phone will all be infected
    - An MMS virus can infect all susceptible phones whose numbers are in the phonebook of an infected phone

  • Mobile Networks and OSes

    Location, mobility, and communication pattern dynamics

  • Online Social Networks (OSNs)

    Topological dynamics
    - temporal attributes of node and edge arrivals and departures
    - explain why the mean degree and characteristic path length tend to be stable over time, while density and scale do not

    Communication dynamics
    - friendships vs. activities

    Mobility dynamics
    - GPS-enabled smartphones
    - location-based applications

    G. Chen, B. Liu, J. Wang et al

  • The Rise of OSNs

    1997: SixDegrees allowed users to create profiles, list their friends, and surf friend lists

    1997-2001: a number of community tools supported profiles and friend lists: AsianAvenue, BlackPlanet, MiGente, LiveJournal

    2001-present: business and professional social networks emerged: Ryze, LinkedIn

    2003: MySpace attracts teens and bands, among others, and grows into the largest OSN

    2004: Facebook designed for college networking (Harvard), then expanded to other colleges, high schools, and other individuals

  • Common OSNs

  • OSNs Go Mobile

    Location-aware
    - GPS-enabled phones, sharing current location, availability, attaching location to user-generated content

    Outlook
    - anticipated $3.3 billion revenue by 2013

    Dodgeball, Loopt, Brightkite, Whrrl, Google Latitude, Foursquare

  • PageRank for Measuring Page Popularity

    Biased Random Walks

    Just walk at random?
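PageRank can be read as the long-run visiting frequency of a walker that usually follows a random outgoing link and occasionally jumps to a random page. The sketch below computes it by power iteration. It follows the standard textbook formulation rather than anything shown on the slide; the function name, the damping value 0.85, and the four-page example graph are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to. With probability
    `damping` the walker follows a random outgoing link; otherwise it jumps
    to a page chosen uniformly at random.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        for p in pages:
            targets = links.get(p, [])
            for t in targets:
                new_rank[t] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank


# Hypothetical four-page web: c is linked from a, b, and d.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph page c collects the most rank because three pages point to it, and page a comes second because it receives all of c's outgoing weight.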

  • Association Rank for Friendship Prediction

    G. Chen and J. Wang et al

  • Startup in 2005, Denver, CO; opened to public: 2008

    User activities
    - Check in, status update, photo upload
    - All attached with current location
    - Updates through SMS, Email, Web, iPhone

    Social graph with mutual connection
    - See your friends or local activity streams

  • Data Trace

    Brightkite Web APIs

    12/9/08-1/9/09: 18,951 active users

    Back traced to 3/21/08: 1,505,874 updates

    Profile: age, gender, tags, friends list

    Social graph: 41,014 nodes and 46,172 links

    Testing data: next 45 days had 5,098 new links added

    G. Chen and N. Li

  • Snapshots taken from 12/09/08 to 01/09/09

  • Three Attributes to Measure Community Rank

    Tags

    Social Distance

    Location

  • Probability Measure

  • Tag Graph Metric

  • Social Distance

  • Location Metric

  • Community Rank Value

    Indicating the likelihood of friendship

  • ROC Curve

  • MySpace

    Launched in Santa Monica, CA, in 2003

    Grew rapidly and attracted Friendster's users, bands, and others

    Teenagers began joining en masse in 2004

    Three distinct populations began to form: musicians/artists, teenagers, and the post-college urban social crowd

    Purchased by News Corporation for $580M in 2005

    Arguably the largest online social network site

  • MySpace Profile and Activities

    Each profile: age, gender, location, last login time, etc.; identified by a unique ID

    Some profiles claim neutral gender, e.g., bands

    Profiles can be set to private (default is public)

    What can users do?

    - search and add friends to their friend lists
    - post messages to friends' blog space

    Only friends have access to a private profile's friend list and blog space

    Other functions: IM/Call, Block/Rank User, Add to Group favorite

  • Measurement: SnailCrawler

    Generate random IDs uniformly between 1 and max (1,500,000,000)

    Many IDs are not occupied (invalid)

    Retrieve profile information from MySpace (HTTP)
    - name, ID, gender, age, location, public/private/custom
    - other information for public profiles: company, religion, marriage, children, smoke/drink, orientation, zodiac, education, ethnicity, occupation, hometown, body-type, mood, last login

    W. Gauvin, B. Liu, X. Fu, J. Wang et al
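The sampling step can be sketched as rejection sampling: draw IDs uniformly from the whole range and keep only those that turn out to be occupied, which yields a uniform sample of valid profiles. The code below is a hypothetical illustration, not SnailCrawler itself. The function name `sample_profile_ids` and the predicate `is_valid` are assumptions; `is_valid` stands in for the HTTP profile lookup and is replaced here by a stub, so no network access takes place.

```python
import random

MAX_ID = 1_500_000_000  # upper bound on IDs quoted on the slide


def sample_profile_ids(count, is_valid, rng, max_id=MAX_ID,
                       max_attempts=1_000_000):
    """Draw IDs uniformly from 1..max_id and keep those that is_valid accepts.

    is_valid is a hypothetical callback; a real crawler would fetch the
    profile page and check whether the ID is occupied.
    Returns (accepted_ids, number_of_draws).
    """
    found = []
    seen = set()
    attempts = 0
    while len(found) < count and attempts < max_attempts:
        attempts += 1
        candidate = rng.randint(1, max_id)  # inclusive on both ends
        if candidate in seen:
            continue  # never probe the same ID twice
        seen.add(candidate)
        if is_valid(candidate):
            found.append(candidate)
    return found, attempts


def stub_is_valid(profile_id):
    """Stand-in for the profile lookup: pretends two thirds of IDs are occupied."""
    return profile_id % 3 != 0


ids, draws = sample_profile_ids(100, stub_is_valid, random.Random(42))
print(f"kept {len(ids)} valid IDs out of {draws} draws")
```

The ratio of kept IDs to draws also gives a rough estimate of what fraction of the ID space is occupied.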

  • Data Trace

    Scanned: 3,090,016

    Blogs: 67,045

    People aged 16 or younger are protected by law

    Teenagers and people in their twenties post the most blogs

    False age