2Yahoo! Research
Reasons for you to exit now …
• I gave an early version of this talk at the Stanford InfoLab seminar in Feb
• This talk is essentially identical to the one I gave at STOC 2006 a month ago
3Yahoo! Research
What is web search?
• Access to “heterogeneous”, distributed information– Heterogeneous in creation– Heterogeneous in accuracy– Heterogeneous in motives
• Multi-billion dollar business– Source of new opportunities in marketing
• Strains the boundaries of trademark and intellectual property laws
• A source of unending technical challenges
4Yahoo! Research
The coarse-level dynamics
Content creators Content aggregators
Feeds
Crawls
Content consumers
Adv
ertis
emen
tEd
itoria
l
Subs
crip
tion
Tran
sact
ion
5Yahoo! Research
Brief (non-technical) history
• Early keyword-based engines– Altavista, Excite, Infoseek, Inktomi, Lycos,
ca. 1995-1997
• Paid placement ranking: Goto (morphed into Overture → Yahoo!)– Your search ranking depended on how much
you paid– Auction for keywords: casino was
expensive!
6Yahoo! Research
Brief (non-technical) history
• 1998+: Link-based ranking pioneered by Google– Blew away all early engines except
Inktomi– Great user experience in search of a
business model– Meanwhile Goto/Overture’s annual
revenues were nearing $1 billion
7Yahoo! Research
Brief (non-technical) history
• Result: Google added “paid-placement”ads to the side, separate from search results
• 2003: Yahoo follows suit, acquiring Overture (for paid placement) and Inktomi (for search)
11Yahoo! Research
The power of social media
• Flickr – community phenomenon• Millions of users share and tag each
others’ photographs (why???)• The wisdom of the crowd can be used
to search• The principle is not new – anchor text
used in “standard” search• Don’t try to pass the Turing test?
12Yahoo! Research
Anchor text
• When indexing a document D, include anchor text from links pointing to D.
www.ibm.com
Armonk, NY-based computergiant IBM announced today
Joe’s computer hardware linksCompaqHPIBM
Big Blue today announcedrecord profits for the quarter
13Yahoo! Research
Challenges in social search
• How do we use these tags for better search?
• How do you cope with spam?• What’s the ratings and reputation system?
• The bigger challenge: where else can you exploit the power of the people?
• What are the incentive mechanisms?– Luis von Ahn (CMU): The ESP Game
14Yahoo! Research
Ratings and reputation
• Node reputation: Given a DAG with– a subset of nodes called GOOD– another subset called BAD– Find a measure of goodness for all other
nodes.• Node pair reputation: Given a DAG with a
real-valued trust on the edges– Predict a real-valued trust for ordered node
pairs not joined by an edge
Metric labelling
17Yahoo! Research
Generic questions
• Of the various advertisers for a keyword, which one(s) get shown?
• What do they pay on a click through?• The answers turn out to draw on
insights from microeconomics
22Yahoo! Research
First-cut assumption
• Click-through rate depends only on the slot, not on the advertisement
• In fact not true; more on this later.
23Yahoo! Research
Advertiser’s value
• We assume that an advertiser j has a value vj per click through– Some measure of downstream profit
• Say, click-through followed by• 96% of the time, no purchase• 0.7% buy Dishwasher, profit $500• 1.2% buy Vacuum Cleaner, profit $200• 2.1% buy Cleaning agents, profit $1
$ 5.921
24Yahoo! Research
Example
• For the keyword miele, say an advertiser has a value of $10 per click.
• How much should he bid?• How much should he be charged?
The value of a slot for an advertiser,what he bids andwhat he is charged, may all be different.
25Yahoo! Research
Advertiser’s payoff in ad slot i
(Click-through rate) x (Value per click) –(Payment to search engine)= ri vj – (Payment to Engine)= ri vj – pij
Payment ofadvertiser j
in slot iFunction of all other bids.
26Yahoo! Research
Two auction pricing mechanisms
• First price: The winner of the auction is the highest bidder, and pays his bid.
• Second price: The winner is the highest bidder, but pays the second-highest bid.
• Engine decides and announces pricing.• What should an advertiser bid?
Not truthful.
27Yahoo! Research
Second-price = Vickrey auction
• Consider first a single advt slot• Winner pays the second-highest bid• Vickrey: Truth-telling is a dominant
strategy for each player (advertiser)– No incentive to “game” or fake bids
28Yahoo! Research
Auctions and pricing: multiple slots
• Overture’s (→Yahoo!’s) model:– Ads displayed in order of decreasing bid– E.g., if advertiser A bids 10, B bids 2, C bids
4 – order ACB• How do you price slots? Generalized Vickrey?
– Generalized second-price (GSP)– Vickrey-Clark-Groves (VCG): each
advertiser pays the externality he imposes on others
29Yahoo! Research
VCG pricing
• Suppose click rates are 200 in the top slot, 100 in the second slot
• VCG payment of the second player (C) is 2 x 100 = 200
• For the first player, 4x(200-100) + 200Externality on third player B.
Externality on C. Externality on B.
30Yahoo! Research
Bidder A, $10
Bidder C, $4
Bidder B, $2
Pays 4
Pays 2
Generalized Second Price auction pricing
31Yahoo! Research
VCG and GSP
• Truth-telling is a dominant strategy under VCG …
• Truth-telling not dominant under GSP!
Edelman, Ostrovsky, Schwarz
Aggarwal, Goel, Motwani (ACM EC 2006): give a truthful mechanism in a model that precludes VCG.
32Yahoo! Research
VCG and GSP
• Static equilibrium of GSP is locally envy-free: no advertiser can improve his payoff by exchanging bids with advertiser in slot above.
• Depending on the mechanism, revenue varies: GSP ≥ VCG.
Edelman, Ostrovsky, Schwarz
Locally envy-free mechanisms correspondto Stable Marriage solutions.
33Yahoo! Research
GSP for bid-ordering
• What’s good about bid-ordering and GSP?–Advertisers like transparency
• What’s wrong with bid-ordering?
36Yahoo! Research
Revenue ordering
• Simplified version of Google’s ordering– Each ad j has an expected click-
through denoted CTRj
– Advertiser j’s bid is denoted bj
• Then, expected revenue from this advertiser is Rj = bj+1 x CTRj
• Order advertisers by Rj
– Payment by GSP
39Yahoo! Research
Still primitive understanding
• Advertisers’ bids generally placed by robots–Currently approved by Engines–No room for coalitions
• Granularity of markets to bid on• Pricing when the number of ad slots is
variable
40Yahoo! Research
Burgeoning research area
• Marketplace design– Multi-billion dollar business, growing
fast– Interface of microeconomics and CS
• Many open problems, a few papers, some of them quite realistic
43Yahoo! Research
The power of the middleman
• Setting: you have a need– For information, for goods …
• You initiate a request for it and offer a reward for it, to some person X– Reward = your value U for the answer
• How much should X “skim off” from your offered reward, before propagating the request?
44Yahoo! Research
Propagation
U U – r1 U – r1 – r2 …
r1 r2
Request propagated repeatedly until it finds an answer.Target not known in advance.Middlemen get reward only if answer reached.
45Yahoo! Research
More generally
U
….
U – r1Each middleman decides how much to “skim off”.Middleman only gets paid if on the path to the answer.
$$
$
$
46Yahoo! Research
Rewards must be non-trivial
• We will assume that all the ri ≥1.• Else, have a form of Zeno’s paradox:
– Source can get away with offering an arbitrarily small reward.
• Equivalently, nodes value their effort in participating.
47Yahoo! Research
Back to the line
U U – r1 U – r1 – r2 …
r1 r2
Under strategic behavior by each player, how much should a player skim?
n = answer rarity: probability a node has the answer = 1/n, independently of other nodes.
48Yahoo! Research
The bad news
• For rarity n, it takes about n hops to get to the answer.
• Initial reward must be exponential in n–A very inefficient network.
For a constant failure probability.
49Yahoo! Research
Branching processes
• Branching process: a network where• Each node has a number of
descendants• Number of descendants is a random
variable X– drawn from a probability distribution– Expectation[X] = b
50Yahoo! Research
Branching processes
• Classical study of population dynamics and random graph evolution.
• Basic fact:–If b < 1, process dies out–If b ≥ 1, process infinite.
51Yahoo! Research
Main results - unique Nash
• For b<2, the initial investment must be exponential in the path length from the root to the answer.
• For b>2, the initial investment is linearin the path length from the root to the answer.
Criticality at b=2.Knowing fewer than 2 people is expensive.
52Yahoo! Research
Tempting conclusion
• (Sufficient) competition makes incentive networks efficient.
• But … we haven’t fully introduced competition yet.–On trees, we have a unique path
from the origin to each node.
53Yahoo! Research
Many open questions
• Full model of competition–When does competition
promote efficiency?
• Given a DAG, how does a node compute its strategy?
54Yahoo! Research
The net
• Web search is scientifically young• It is intellectually diverse
– The human element– The social element
• The science must capture economic, legal and sociological reality.