Graph Theory for Online Advertising

J. Tipan Verella

March 19, 2014

DESCRIPTION

Presentation by J. Tipan Verella from Millennial Media at the Insight Summit Series: 2014 Digital Advertising + Marketing Summit

Introduction

What is so great about Graphs?

A graph G = (V, E) is a pair of sets: vertices and edges.

Degree of a vertex, connected components

Systems Engineering for Complex Behavioral Systems

bio-chemical reaction networks, ecological systems, distributed adaptive systems; self-organization, phase transition

markets, herd behavior and crowdsourcing, BitTorrent

Graphs (Networks) are a versatile tool for understanding structures of Complex Systems.
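The two notions named above, vertex degree and connected components, can be made concrete with a minimal sketch. The toy vertex and edge sets are hypothetical (think of sites linked when they share audience); nothing here comes from the presentation itself.

```python
from collections import deque

def degrees(vertices, edges):
    """Return a dict mapping each vertex to its degree (number of incident edges)."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def connected_components(vertices, edges):
    """Return the connected components as a list of sets, found by breadth-first search."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components

# Hypothetical toy graph: an edge means two sites share audience.
V = ["a", "b", "c", "d", "e"]
E = [("a", "b"), ("b", "c"), ("d", "e")]
print(degrees(V, E))
print(connected_components(V, E))
```

Here vertex "b" has degree 2 and the graph splits into two components, {a, b, c} and {d, e}.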

What does it have to do with online advertising?

Anecdotes from the Industry

Facebook Presto (2013), Demonstrating the Scalability of Presto [1]

Microsoft Horton (2012) is a research project in the eXtreme Computing Group to enable querying large distributed graphs. [2]

Yahoo! Apache Giraph (2011) is an iterative graph processing system built for high scalability. [3]

Google Pregel (2010), A System for Large-Scale Graph Processing [4]

Inspired by Leslie Valiant’s Bulk Synchronous Parallel model for distributed computing.

[1] https://www.facebook.com/notes/facebook-engineering/scaling-apache-giraph-to-a-trillion-edges/10151617006153920

[2] http://research.microsoft.com/en-us/projects/ldg/

[3] https://giraph.apache.org/

[4] http://dl.acm.org/citation.cfm?id=1807184
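To give a feel for the Bulk Synchronous Parallel idea behind these systems, here is a single-machine toy sketch: computation proceeds in supersteps, each vertex reads only its own inbox, and messages sent in one superstep are delivered in the next. This is not the API of Pregel, Giraph, or any other system listed; the function name and the min-label task are illustrative assumptions.

```python
def bsp_min_label(neighbors, max_supersteps=50):
    """Toy Bulk Synchronous Parallel loop: each vertex keeps the smallest id it has heard of.

    neighbors: dict mapping vertex id -> list of neighbor ids (undirected graph).
    Returns (labels, supersteps_used); vertices in the same component end with the same label.
    """
    labels = {v: v for v in neighbors}
    # Superstep 0: every vertex sends its own id to its neighbors.
    inbox = {v: [] for v in neighbors}
    for v, nbrs in neighbors.items():
        for u in nbrs:
            inbox[u].append(labels[v])
    for step in range(1, max_supersteps + 1):
        outbox = {v: [] for v in neighbors}
        active = False
        # Compute phase: each vertex only looks at its own state and its inbox.
        for v, messages in inbox.items():
            if messages and min(messages) < labels[v]:
                labels[v] = min(messages)
                active = True
                for u in neighbors[v]:
                    outbox[u].append(labels[v])
        # Barrier: messages sent in this superstep become visible in the next one.
        inbox = outbox
        if not active:  # every vertex has "voted to halt"
            return labels, step
    return labels, max_supersteps

graph = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
labels, steps = bsp_min_label(graph)
print(labels, steps)
```

Because each vertex touches only local state and messages, the compute phase could in principle be spread across machines, which is the point of the model.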

Strategy and Structure Optimization Problems in Online Performance Advertising

Performance Advertising

Advertiser would prefer to only pay for actions

Publisher would prefer to only charge on views (impressions)

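The Advertiser Problem

$\beta_j$ is the proportion of your budget you spend on site $j$

$N_j(\beta_j)$ are the impressions procured by spending $\beta_j$ on site $j$

$\rho_j$ is the conversion rate of your ad on site $j$

\[
\max \sum_{j \in J} \underbrace{N_j(\beta_j) \cdot \rho_j}_{\text{Actions}_j}
\qquad \text{subject to:} \quad \sum_{j} \beta_j \le \text{Budget}
\]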
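A small worked instance may help. The slide does not say how impressions depend on spend, so this sketch assumes a diminishing-returns curve, impressions = reach × sqrt(spend); that functional form, the function names, and all numbers are assumptions for illustration. Under that assumption the Lagrangian condition gives a closed form: spend on each site is proportional to (reach × conversion rate)².

```python
import math

def allocate_budget(reach, conv_rate, budget):
    """Split `budget` across sites to maximize expected actions.

    ASSUMPTION (not from the slides): impressions bought on site j follow
    reach[j] * sqrt(spend_j). Under that assumption the optimal spend is
    proportional to (reach[j] * conv_rate[j]) ** 2.
    """
    weights = [(a * r) ** 2 for a, r in zip(reach, conv_rate)]
    total = sum(weights)
    return [budget * w / total for w in weights]

def expected_actions(spend, reach, conv_rate):
    """Objective: sum over sites of impressions procured times conversion rate."""
    return sum(a * math.sqrt(s) * r for s, a, r in zip(spend, reach, conv_rate))

reach = [1000.0, 400.0, 250.0]       # hypothetical scale of each site's inventory
conv_rate = [0.010, 0.030, 0.020]    # hypothetical conversion rates
budget = 100.0

best = allocate_budget(reach, conv_rate, budget)
even = [budget / 3] * 3
print(best)
print(expected_actions(best, reach, conv_rate), expected_actions(even, reach, conv_rate))
```

The second site gets the largest share even though it has less inventory than the first, because its conversion rate is three times higher.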

The Publisher Problem

$\pi(i, n)$ is the revenue if impression $n$ is awarded to advertiser $i$

$\delta_{i,n}$ is 1 or 0 depending on whether or not impression $n$ is awarded to advertiser $i$

$I$ is the set of advertisers

\[
\max_{\delta_{i,n}} \sum_{n \in N} \sum_{i \in I} \pi(i, n) \cdot \delta_{i,n}
\qquad \text{subject to:} \quad \sum_{i \in I} \delta_{i,n} \le 1 \quad \forall n
\]
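As stated, nothing couples one impression to another: each impression has its own "at most one advertiser" constraint. A minimal sketch under that reading simply picks the best advertiser per impression. The revenue table and the function name are hypothetical.

```python
def award_impressions(revenue):
    """revenue[n][i] is the revenue if impression n goes to advertiser i.

    The constraints are per impression, so the best choice can be made one
    impression at a time: pick the highest-revenue advertiser, or nobody if
    no revenue is positive. Returns (assignment, total).
    """
    assignment, total = {}, 0.0
    for n, offers in revenue.items():
        winner = max(offers, key=offers.get)
        if offers[winner] > 0:
            assignment[n] = winner
            total += offers[winner]
        else:
            assignment[n] = None
    return assignment, total

revenue = {
    "imp1": {"A": 0.5, "B": 0.8},
    "imp2": {"A": 1.2, "B": 0.3},
    "imp3": {"A": 0.0, "B": 0.0},
}
assignment, total = award_impressions(revenue)
print(assignment, total)
```

Once advertiser budgets or pacing enter the picture the problem stops decomposing this way.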

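The AdNetwork Problem

$\theta_{i,j}$ is the fraction of the inventory on site $j$ allocated to advertiser $i$

$N_j$ is the total number of impressions from site $j$

$\rho_{i,j}$ is the conversion rate of advertiser $i$ on site $j$

$\pi(i)$ is the amount paid per conversion by advertiser $i$

$c_j$ is the cost per impression on site $j$

$B_i$ is the budget of advertiser $i$

\[
\max_{\theta} \sum_{i \in I} \sum_{j \in J} \Big( \underbrace{\theta_{i,j} \cdot N_j \cdot \rho_{i,j} \cdot \pi(i)}_{\text{revenue}} - \overbrace{c_j \cdot N_j}^{\text{cost}} \Big)
\]

\[
\text{subject to:} \quad \sum_{j \in J} \theta_{i,j} \cdot N_j \cdot \rho_{i,j} \cdot \pi(i) \le B_i \quad \forall i \in I,
\qquad \sum_{i \in I} \theta_{i,j} \le 1 \quad \forall j \in J
\]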
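To make the objective and constraints tangible, the sketch below scores a candidate allocation on a two-advertiser, two-site instance. All numbers and names are hypothetical, and the code charges each site's inventory cost once per site, which is one interpretation of the cost term.

```python
def evaluate_allocation(alloc, impressions, conv_rate, payout, cost, budget):
    """Score an allocation for an ad network.

    alloc[i][j]     fraction of site j's inventory given to advertiser i
    impressions[j]  total impressions available on site j
    conv_rate[i][j] conversion rate of advertiser i on site j
    payout[i]       amount advertiser i pays per conversion
    cost[j]         cost per impression on site j
    budget[i]       budget of advertiser i
    Returns (profit, feasible).
    """
    spend = {
        i: sum(alloc[i][j] * impressions[j] * conv_rate[i][j] * payout[i] for j in impressions)
        for i in payout
    }
    revenue = sum(spend.values())
    # Interpretation: each site's inventory is paid for once, whoever it is allocated to.
    total_cost = sum(cost[j] * impressions[j] for j in impressions)
    within_budget = all(spend[i] <= budget[i] + 1e-9 for i in payout)
    within_inventory = all(sum(alloc[i][j] for i in payout) <= 1 + 1e-9 for j in impressions)
    return revenue - total_cost, within_budget and within_inventory

impressions = {"s1": 1000, "s2": 2000}
conv_rate = {"A": {"s1": 0.02, "s2": 0.01}, "B": {"s1": 0.01, "s2": 0.03}}
payout = {"A": 5.0, "B": 4.0}
cost = {"s1": 0.01, "s2": 0.02}
budget = {"A": 100.0, "B": 200.0}

ok_alloc = {"A": {"s1": 1.0, "s2": 0.0}, "B": {"s1": 0.0, "s2": 0.5}}
too_much = {"A": {"s1": 1.0, "s2": 0.0}, "B": {"s1": 0.0, "s2": 1.0}}
print(evaluate_allocation(ok_alloc, impressions, conv_rate, payout, cost, budget))
print(evaluate_allocation(too_much, impressions, conv_rate, payout, cost, budget))
```

The second allocation earns more on paper but exceeds advertiser B's budget, so it is reported as infeasible.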

The Centralized Approach: Linear Programming

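\[
\max_{\theta} \sum_{i \in I} \sum_{j \in J} \Big( \underbrace{\theta_{i,j} \cdot N_j \cdot \rho_{i,j} \cdot \pi(i)}_{\text{revenue}} - \overbrace{c_j \cdot N_j}^{\text{cost}} \Big)
\]

\[
\text{subject to:} \quad \underbrace{\sum_{j \in J} \theta_{i,j} \cdot N_j \cdot \rho_{i,j} \cdot \pi(i)}_{\text{spend of advertiser } i} \le B_i \quad \forall i \in I,
\qquad \sum_{i \in I} \theta_{i,j} \le 1 \quad \forall j \in J
\]

Plan, Evaluate, Update

Duality can say a lot about the structure of your problem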

DOES NOT SCALE!
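As a hedged illustration of the duality remark, here is the standard LP dual of the allocation problem in notation introduced only for this sketch: $x_{ij}$ is the fraction of site $j$ given to advertiser $i$, and $v_{ij}$ is the revenue earned if all of site $j$ goes to advertiser $i$ (impressions times conversion rate times payout). It assumes $x_{ij} \ge 0$, which the slide leaves implicit, and drops the cost term because it does not depend on the allocation. The multipliers $y_i$ and $z_j$ belong to the budget and inventory constraints.

```latex
\begin{aligned}
\text{Primal:}\quad & \max_{x \ge 0} \sum_{i \in I}\sum_{j \in J} v_{ij}\, x_{ij}
 && \text{s.t.}\quad \sum_{j \in J} v_{ij}\, x_{ij} \le B_i \;\; (y_i), \qquad \sum_{i \in I} x_{ij} \le 1 \;\; (z_j) \\
\text{Dual:}\quad & \min_{y \ge 0,\; z \ge 0} \sum_{i \in I} B_i\, y_i + \sum_{j \in J} z_j
 && \text{s.t.}\quad z_j \ge v_{ij}\,(1 - y_i) \quad \forall i \in I,\; j \in J
\end{aligned}
```

Read this way, $z_j$ acts as a price for site $j$'s inventory and $y_i$ as a discount applied to a budget-constrained advertiser. By complementary slackness, $x_{ij} > 0$ only when $v_{ij}(1 - y_i) = z_j$, so each site goes to the advertisers with the highest discounted value, which already looks like an auction.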

The Decentralized Approach: The Market Paradigm

Publisher runs auctions; the good (impressions) goes to the agent that values it the most [5]

the monopoly should provide as detailed a description of the good as possible

the auction solves the allocation problem

Advertiser places bids; in a 2nd-price auction it is optimal to bid your valuation

valuation depends on conversion rates, a priori unknown!

the number of auctions is also unknown!

performance rates have to be estimated

control algorithms have to be implemented in order to pace the delivery of the ad campaign

Markets are complex systems!
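The claim that truthful bidding is optimal in a second-price auction can be checked numerically on a toy case. The sketch below assumes a sealed-bid auction with no reserve price; the bidder names and bid values are hypothetical.

```python
def second_price_auction(bids):
    """bids: dict bidder -> bid. Returns (winner, price); the winner pays the runner-up's bid.

    Ties go to the bidder listed first; with a single bidder the price is 0
    (no reserve price, an assumption of this sketch).
    """
    ranked = sorted(bids, key=bids.get, reverse=True)
    winner = ranked[0]
    price = bids[ranked[1]] if len(ranked) > 1 else 0.0
    return winner, price

def utility(valuation, my_bid, rival_bids):
    """Payoff of bidder "me": valuation minus price if "me" wins, else 0."""
    bids = dict(rival_bids, me=my_bid)
    winner, price = second_price_auction(bids)
    return valuation - price if winner == "me" else 0.0

rivals = {"r1": 0.6, "r2": 0.9}
valuation = 1.0   # e.g. estimated conversion rate times payout per conversion
truthful = utility(valuation, valuation, rivals)
for bid in (0.0, 0.5, 0.95, 1.5, 3.0):
    print(bid, utility(valuation, bid, rivals), "truthful:", truthful)
```

No alternative bid beats the truthful one here: bidding lower risks losing a profitable impression, and bidding higher never lowers the price paid. The hard part in practice is the one the slide points to: the valuation itself depends on conversion rates that must be estimated.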

[5] Hal Varian on the Online Ad Auction

Strategy and Structure Graphs and Behavior

More About Graphs: Random Graphs

Let $V$ be a vertex set, with $|V| = n$.

For each pair of vertices $(u, v)$, with $u, v \in V$, we decide whether to put the edge $(u, v)$ based on the outcome of a coin flip, with probability $p = c/n$.
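The construction above translates almost line for line into code. This is a minimal sketch using only the standard library; the function name, the seed, and the values of n and c are illustrative choices.

```python
import random
from itertools import combinations

def random_graph(n, c, seed=None):
    """Sample an undirected graph on vertices 0..n-1: each of the n*(n-1)/2
    possible edges is included independently with probability p = c / n."""
    rng = random.Random(seed)
    p = c / n
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

n, c = 500, 3.0
edges = random_graph(n, c, seed=0)
mean_degree = 2 * len(edges) / n   # each edge contributes to two degrees
print(len(edges), mean_degree)
```

With this scaling each vertex has about c neighbors on average regardless of n, which is why c is the natural control parameter.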

Erdős and Rényi

Paul Erdős and Alfréd Rényi [6] proved (1960) that such a graph experiences a phase transition at c = 1.

Figure: as c goes from < 1 to > 1
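The transition can be observed in a quick simulation: below c = 1 the largest connected component stays a small fraction of the graph, and above it a giant component containing a constant fraction of the vertices appears. The sketch below is an assumption-laden illustration (n = 1000, a fixed seed, a union-find helper written for this example), not a reproduction of the figure.

```python
import random
from itertools import combinations

def largest_component_fraction(n, c, seed=None):
    """Sample a random graph with edge probability c/n and return the share of
    vertices that sit in its largest connected component (via union-find)."""
    rng = random.Random(seed)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    p = c / n
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            parent[find(u)] = find(v)      # merge the two components

    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

for c in (0.5, 0.9, 1.1, 2.0):
    print(c, largest_component_fraction(1000, c, seed=1))
```

At finite n the jump is smoothed out near c = 1, so the contrast is clearest between values well below and well above the threshold.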

[6] On the Evolution of Random Graphs

Local Interactions in the Quantitative Social Sciences

Sociologist Mark Granovetter: The Strength of Weak Ties (1973)

Economists: predictive power of social interactions

Lawrence Blume (1993) proposes using models from statistical mechanics to understand strategic interactions

Edward Glaeser et al. (1996), Crime and Social Interactions

Steven Durlauf (1999) asks in PNAS, How can statistical mechanics contribute to social science?

H. Peyton Young (2001), Individual Strategy and Social Structure: An Evolutionary Theory of Institutions

by 1996, Social Network Analysis: Methods and Applications by Faust and Wasserman.

More recently, sociologists at Cornell University have been using graph-based sampling methods [7] to do estimations for hidden populations

sociologists like A.V. Papachristos have been using social networks to understand crime in Chicago [8].

[7] respondent-driven sampling

[8] Social Networks and Gang Violence

Scale and Complexity The Web is a Network! ...Bipartite Graphs Everywhere!

CrowdSourcing: Power to the People!

Yochai Benkler on Directories and Google PageRank

channels/categories/directories,

advertisers/campaigns/creatives

Site Networks and Audiences

Community Detection

Why understand community structures of complex networks?

Size, problem reduction

Topology, diverse degree distribution

Biological Sciences Perspective:

network enables the discovery of the organization and interactions of a bio-chemical system

Complex Networks as backbone of Complex Systems

Communities enable decomposition into subsystems, modules

In online advertising: Feature Extraction!
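One way to read "Feature Extraction" is that a node's community becomes a categorical feature for a downstream model. The slides do not name a detection algorithm, so this sketch swaps in a simple deterministic variant of label propagation (the usual version visits nodes in random order). The toy graph, two tightly knit groups of hypothetical sites joined by a single edge, is also an assumption.

```python
from collections import Counter

def label_propagation(neighbors, max_rounds=20):
    """Toy community detection: every node repeatedly adopts the label that is
    most common among its neighbors. Nodes are visited in sorted order and ties
    go to the largest label, so the result is deterministic."""
    labels = {v: v for v in neighbors}
    for _ in range(max_rounds):
        changed = False
        for v in sorted(neighbors):
            counts = Counter(labels[u] for u in neighbors[v])
            best = max(counts, key=lambda lab: (counts[lab], lab))
            if best != labels[v]:
                labels[v] = best
                changed = True
        if not changed:
            break
    return labels

def community_features(labels):
    """One-hot encode each node's community label so it can feed a predictive model."""
    communities = sorted(set(labels.values()))
    return {v: [1 if labels[v] == c else 0 for c in communities] for v in labels}

# Two 4-cliques (0-3 and 4-7) joined by the single edge 3-4.
neighbors = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
labels = label_propagation(neighbors)
features = community_features(labels)
print(labels)
print(features)
```

The two cliques end up with different labels, and each node's one-hot vector could be appended to whatever other features describe it.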

The Pinned Random Walk

Definition (PRW)

Let $G = (V, E)$ be a connected undirected graph. Let $P$ be the transition probability matrix induced by the adjacency matrix, $P_{ij} = \frac{E_{ij}}{\sum_{k} E_{ik}}$. Let $\mu_0$ be a probability measure on $V$ and $\lambda \in (0, 1)$. We call a pinned random walk the discrete-time stochastic process, $X_k$, on $G$ that changes measures $\mu_k$ on $V$ according to:

\[
X_0 = x_0, \quad \text{almost surely}
\]

\[
\mu_k = \lambda \mu_{k-1} P + (1 - \lambda) \mu_0 \qquad (1)
\]
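The recursion in equation (1) is straightforward to iterate. The sketch below does so on a three-vertex path graph with the initial measure concentrated on one end; the function and parameter names are ours, and plain Python lists stand in for matrices. The update has the same form as a random walk with restart, in which the walk is repeatedly pulled back toward its starting measure.

```python
def pinned_random_walk(adjacency, pinned, weight, steps=100):
    """Iterate m_k = weight * m_{k-1} P + (1 - weight) * m_0.

    adjacency: symmetric 0/1 matrix (list of lists) of a connected graph
    pinned:    initial probability measure m_0 over the vertices
    weight:    mixing weight in (0, 1)
    """
    n = len(adjacency)
    # Row-normalize the adjacency matrix to get transition probabilities.
    P = [[a / sum(row) for a in row] for row in adjacency]
    measure = list(pinned)
    for _ in range(steps):
        walked = [sum(measure[i] * P[i][j] for i in range(n)) for j in range(n)]
        measure = [weight * w + (1 - weight) * m0 for w, m0 in zip(walked, pinned)]
    return measure

path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # path graph 0 - 1 - 2
mu = pinned_random_walk(path, pinned=[1.0, 0.0, 0.0], weight=0.5)
print(mu)
```

On this example the measure converges to (7/12, 1/3, 1/12): mass stays concentrated near the pinned vertex and decays with distance from it, which is what makes the limit usable as a proximity score.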

Conclusion

So . . . What is so great about Networks?

They come out of the woodwork of the systems you deal with within online advertising, because your systems are Complex!

They are the underlying structures of your advertising systems

They are predictive!

Statisticians are actively working on tools to extract information from those rich structures.

Thank You!

Millennial Media

Rosalee MacKinnon

Rick Daggett

Dr. Jean M. Grow
