How Google Works
A Ranking Engineer’s Perspective
Paul Haahr
SMX West
March 3, 2016
Google Search Today
Mobile First
Features
• spelling suggestions
• autocomplete
• related searches
• related questions
• calculator
• knowledge graph
• answers
• featured snippets
• maps
• images
• videos
• in-depth articles
• movie showtimes
• sports scores
• weather
• flight status
• package tracking
• …
Ranking
10 Blue Links
What documents do we show?
What order do we show them in?
Life of a Query
Two Parts of a Search Engine
• Ahead of time (before the query)
• Query processing
Before the Query
• Crawl the web
• Analyze the crawled pages
  • Extract links
  • Render contents
  • Annotate semantics
  • …
• Build an index
The Index
• Like the index of a book
• For each word, a list of pages it appears on
• Broken up into groups of millions of pages
  • At Google, these are called “shards”
  • 1000s of shards for the web index
• Plus per-document metadata
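The index structure described above can be sketched roughly as follows. This is a toy illustration, not Google's implementation: the shard count, the modulo assignment in `shard_for`, the `build_shards` helper, and the sample pages are all assumptions made for the example.

```python
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical; the slides only say the web index has 1000s of shards


def shard_for(doc_id: int) -> int:
    """Assign a document to a shard (simple modulo scheme, assumed for illustration)."""
    return doc_id % NUM_SHARDS


def build_shards(pages: dict[int, str]):
    """Build one inverted index (word -> set of doc ids) per shard, plus per-document metadata."""
    shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    metadata = {}
    for doc_id, text in pages.items():
        words = text.lower().split()
        for word in words:
            shards[shard_for(doc_id)][word].add(doc_id)
        # Per-document metadata; here just the word count, as a stand-in.
        metadata[doc_id] = {"length": len(words)}
    return shards, metadata


# Invented sample pages, keyed by document id.
pages = {
    1: "General Motors trucks",
    2: "genetically modified corn",
    3: "trucks for hauling corn",
}
shards, metadata = build_shards(pages)
```

Each shard answers "which of my pages contain this word?" on its own, which is what lets a query be sent to all shards in parallel.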
Query Processing
• Query understanding and expansion
• Retrieval and scoring
• Post-retrieval adjustments
Query Understanding
• Does the query name any known entities?
  • [san jose convention center]
  • [matt cutts]
• Are there useful synonyms?
  • [gm trucks]: “gm” → “general motors”
  • [gm corn]: “gm” → “genetically modified”
• Context matters
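A minimal sketch of why context matters for synonyms, using the two [gm ...] queries from the slide. The lookup table `CONTEXT_SYNONYMS` and the function `expand_query` are hypothetical; a real system would learn such expansions rather than hard-code them.

```python
# Hypothetical table: (ambiguous term, context word) -> expansion.
CONTEXT_SYNONYMS = {
    ("gm", "trucks"): "general motors",
    ("gm", "corn"): "genetically modified",
}


def expand_query(query: str) -> list[str]:
    """Return the expansions that apply, given the other words in the query."""
    words = query.lower().split()
    expansions = []
    for term in words:
        for other in words:
            expansion = CONTEXT_SYNONYMS.get((term, other))
            if expansion:
                expansions.append(expansion)
    return expansions


trucks = expand_query("gm trucks")
corn = expand_query("gm corn")
alone = expand_query("gm")
```

The same term "gm" expands differently depending on its neighbor, and with no context at all this sketch declines to expand it.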
Retrieval and Scoring
• Send the query to all the shards
• Each shard
  • Finds matching pages
  • Computes a score for query+page
  • Sends back the top N pages by score
• Combine all the top pages
• Sort by score
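The scatter-gather flow above might look like this in miniature. The scoring function (a count of matching query words), the shard contents, and the names `score`, `search_shard`, and `search` are assumptions for illustration; the slides do not say how a score is computed.

```python
import heapq


def score(query_words, page_words):
    """Toy score: how many query words appear on the page (an assumption)."""
    return sum(1 for w in query_words if w in page_words)


def search_shard(shard, query_words, n):
    """Return this shard's top-n (score, doc_id) pairs among matching pages."""
    scored = []
    for doc_id, text in shard.items():
        s = score(query_words, set(text.lower().split()))
        if s > 0:  # only pages that match at least one query word
            scored.append((s, doc_id))
    return heapq.nlargest(n, scored)


def search(shards, query, n=2):
    """Send the query to every shard, combine the per-shard results, sort by score."""
    query_words = query.lower().split()
    combined = []
    for shard in shards:
        combined.extend(search_shard(shard, query_words, n))
    return sorted(combined, reverse=True)


# Two invented shards, each mapping doc id -> page text.
shards = [
    {1: "gm trucks for sale", 2: "corn recipes"},
    {3: "general motors trucks", 4: "gm trucks review and gm history"},
]
results = search(shards, "gm trucks")
```

Because each shard returns only its top N, the combining step handles at most N results per shard instead of every matching page.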
Post-retrieval adjustments
• Host clustering, sitelinks
• Is there too much duplication?
• Spam demotions, manual actions
• …
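As one possible reading of "host clustering", the sketch below caps how many results a single host may contribute to a ranked list. The cap of two, the function name `cluster_hosts`, and the example URLs are assumptions; the slides do not describe the actual adjustment.

```python
from urllib.parse import urlparse


def cluster_hosts(ranked_urls, max_per_host=2):
    """Keep ranked order, but drop results beyond max_per_host for any one host."""
    seen = {}
    kept = []
    for url in ranked_urls:
        host = urlparse(url).netloc
        seen[host] = seen.get(host, 0) + 1
        if seen[host] <= max_per_host:
            kept.append(url)
    return kept


ranked = [
    "https://a.example/1",
    "https://a.example/2",
    "https://a.example/3",
    "https://b.example/1",
]
adjusted = cluster_hosts(ranked)
```

The adjustment runs after scoring, so it changes what is shown without touching the scores themselves.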
What do ranking engineers do? (version 1)
Write code for those servers
Scoring Signals
Signal
• A piece of information used in scoring
• Query independent – feature of page
  • PageRank, language, mobile friendliness, ...
• Query dependent – feature of page & query
  • keyword hits, synonyms, proximity, …
What do ranking engineers do? (version 2)
Look for new signals.
Combine old signals in new ways.
Metrics
“If you can not measure it, you can not improve it.”
–Lord Kelvin (sort of)
Key Metrics
• Relevance
  • Does a page usefully answer the user’s query?
  • Ranking’s top-line metric
• Quality
  • How good are the results we show?
• Time to result (faster is better)
• ...
Higher results matter
• “Position weighted”
• “Reciprocally ranked” metrics
  • Position 1 is worth 1
  • Position 2 is worth ½
  • Position 3 is worth ⅓
  • Position 4 is worth ¼
  • …
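The reciprocal weights above can be written out as a short sketch. The weights 1, ½, ⅓, ¼ come from the slide; combining them with per-result ratings as a weighted sum is an assumption, as are the function names and the sample 0/1 ratings.

```python
from fractions import Fraction


def position_weight(position: int) -> Fraction:
    """Weight of a result at a 1-based position: 1, 1/2, 1/3, 1/4, ..."""
    return Fraction(1, position)


def weighted_score(ratings):
    """Sum of rating * (1 / position) over a ranked list (assumed combination rule)."""
    return sum(r * position_weight(i) for i, r in enumerate(ratings, start=1))


# Invented ratings: 1 = good result, 0 = bad result, listed in ranked order.
good_on_top = weighted_score([1, 0, 1])   # 1 + 0 + 1/3
good_lower = weighted_score([0, 1, 1])    # 0 + 1/2 + 1/3
```

The same two good results score 4/3 when one of them is at position 1 and 5/6 when they sit at positions 2 and 3, which is the sense in which higher results matter more.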
What do ranking engineers do? (version 3)
Optimize for our metrics
But where do the metrics come from?
Evaluation
How do we measure ourselves?
• Live Experiments
• Human Rater Experiments
Live Experiments
Live Experiments
• A/B experiments on real traffic
  • Similar to what many other websites do
• Look for changes in click patterns
  • Harder to understand than you might expect
• A lot of traffic is in one experiment or another
Interpreting Live Experiments
• Both pages P1 and P2 answer user’s need
• For P1, answer is on the page
• For P2, answer is on the page and in the snippet
• Algorithm A puts P1 before P2 ⇒ user clicks on P1 ⇒ “good”
• Algorithm B puts P2 before P1 ⇒ no click ⇒ “bad”
• Do we really think A is better than B?
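The P1/P2 scenario above can be replayed as a small simulation. The user model in `simulate_click` (the user clicks only when the snippet does not already answer the need) is an assumption chosen to match the slide, and the `Page` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    answer_in_snippet: bool


def simulate_click(top_page: Page) -> bool:
    """Assumed user model: click only if the snippet does not already answer the need."""
    return not top_page.answer_in_snippet


p1 = Page("P1", answer_in_snippet=False)  # answer is on the page only
p2 = Page("P2", answer_in_snippet=True)   # answer is on the page and in the snippet

click_a = simulate_click(p1)  # Algorithm A ranks P1 first
click_b = simulate_click(p2)  # Algorithm B ranks P2 first
```

A metric that treats a click as success would favor A here, even though under B the user got the answer without having to click at all.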
Human Rater Experiments
Human Rater Experiments
• Show real people experimental search results
• Ask how good the results are
• Ratings aggregated across raters
• Published guidelines explain criteria for raters
• Tools support doing this in an automated way
Result Rating Task
Two Scales
• Needs Met
  • Does this page address the user’s need?
  • Our current relevance metric
• Page Quality
  • How good is the page?
Mobile First
Mobile First Rating
“Needs Met rating tasks ask [raters] to focus on mobile user needs and think
about how helpful and satisfying the result is for the mobile users.”
How do we make it mobile-centric?
• More mobile queries than desktop in samples
• Pay attention to user’s location
• Tools display mobile user experience
• Raters visit websites on smartphones
Needs Met Rating
Needs Met Rating
• Fully Meets
• Highly Meets
• Moderately Meets
• Slightly Meets
• Fails to Meet
(Following examples are from Rater Guidelines)
Fully Meets
(Very) Highly Meets
Highly Meets
(More) Highly Meets
Moderately Meets
Slightly Meets
Fails to Meet
Page Quality Rating
Page Quality Concepts
• Expertise
• Authoritativeness
• Trustworthiness
High Quality Pages
• A satisfying amount of high quality main content
• The page and website are expert, authoritative, and trustworthy for the topic of the page
• The website has a good reputation for the topic of the page
Low Quality Pages
• The quality of the main content is low
• There is an unsatisfying amount of main content
• The author does not have expertise or is not trustworthy or authoritative for the topic
• The website has a negative reputation
• The secondary content is distracting or unhelpful
Optimizing Our Metrics
Ranking engineers
• Team of a few hundred computer scientists
• Focused on our metrics and signals
• Run lots of experiments
• Make lots of changes
Development Process
• Idea
• Repeat until ready:
  • Write code
  • Generate data
  • Run experiments
  • Analyze
• Launch report by Quantitative Analyst
• Launch review
What do ranking engineers do? (version 4)
Move results with good ratings up.
Move results with bad ratings down.
What Goes Wrong?
(And how do we fix it?)
Two kinds of problems
• Systematically bad ratings
• Metrics don’t capture things we care about
Bad Ratings
[texas farm fertilizer]
• User is looking for a brand of fertilizer
• Unlikely to want to go to the manufacturer’s headquarters
• Rater average called map of headquarters almost “Highly Meets”
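To make "rater average" concrete, here is a sketch that maps the Needs Met labels to numbers and averages them. The 0 to 4 numeric mapping and the four sample ratings are invented for illustration; the slides do not give the numeric scale or the actual ratings for this query.

```python
from statistics import mean

# Hypothetical numeric values for the Needs Met labels.
NEEDS_MET_SCALE = {
    "Fails to Meet": 0,
    "Slightly Meets": 1,
    "Moderately Meets": 2,
    "Highly Meets": 3,
    "Fully Meets": 4,
}


def average_rating(labels):
    """Aggregate several raters' labels for one result into a single number."""
    return mean(NEEDS_MET_SCALE[label] for label in labels)


# Invented ratings from four raters for one result.
ratings = ["Highly Meets", "Highly Meets", "Moderately Meets", "Highly Meets"]
avg = average_rating(ratings)
```

With these made-up inputs the average is 2.75, just under the assumed value of 3 for "Highly Meets", which is the kind of number "almost Highly Meets" would describe.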
Patterns of Losses
• Look for things we think are bad in results
  • Either live or from experiments
• Create examples for rater guidelines
New rater example
Missing Metrics
Low Quality Content in 2009-2011
• Lots of complaints about low quality content
• But our relevance metric kept going up
• Low quality pages can be very relevant
• We thought we were doing great
• ⇒ We weren’t measuring what we needed to
Quality Metric
• Gets directly at the quality issue
• Not the same as relevance
• Enabled development of quality-related signals
When the Metrics Miss Something
What do ranking engineers do? (version 5)
Fix rater guidelines or develop new metrics
(when necessary)
Thank you!
Questions?