lucidworks
About Rakuten
• Founded in 1997 in Japan
• Operates Rakuten Ichiba, the largest e-commerce site in Japan
• One of the 15 largest internet companies in the world
• 10,000+ employees worldwide
• $6.3 billion in revenue in FY2015
Solr at Rakuten
• 30+ services within the Rakuten group using Solr
• Solr supported in 10+ languages
• At Rakuten.com
• Supported via Solr
• Over 30 million products and 90 million different items
• Thousands of unique categories and attributes to search against
• Millions of queries a day!
Overview
• Introduction To Facets
• Built-In Facet Sorting Methods
• Relevancy-Based Facet Sorting Methods
Facet Sorting Criteria
• Top facets are relevant to query
• Top facet order reflects relevancy
• Easy to maintain over time
• Acceptable latency in production
Latency Impact Of Facets
• Facets are expensive
• Extra logic can be a performance hit
• In some cases, facets can slow down queries by 10x
• OOMs in extreme cases
Overview
• Introduction To Facets
• Built-In Facet Sorting Methods
• Relevancy-Based Facet Sorting Methods
Example - iPhone 6 Cases
Assume we have the following brand facets for the query "iPhone 6 Cases":
Brands
• Otterbox
• Incipio
• Apple
• AAA Phone Cases
Search Results for iPhone 6 Cases
1. Otterbox iPhone Case, Brand: Otterbox
2. Otterbox iPhone Case, Brand: Otterbox
3. Generic iPhone Case, Brand: Incipio
4. Generic iPhone Case, Brand: Incipio
5. Generic iPhone Case, Brand: Incipio
6. iPhone 6s + Case, Brand: Apple
7. Off-Brand iPhone Case, Brand: AAA Phone Cases
8. Off-Brand iPhone Case, Brand: AAA Phone Cases
9. Off-Brand iPhone Case, Brand: AAA Phone Cases
10. Off-Brand iPhone Case, Brand: AAA Phone Cases
Default Facet Sorting Methods
• Name Sort: sort based on alphabetical order of facet values
• Count Sort: sort based on result count per facet value
Let’s see how they do
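As a rough illustration of the two default orderings, the sketch below reproduces them in plain Python over the ten example results for "iPhone 6 Cases". It does not call Solr; in Solr itself these orderings correspond to the `facet.sort=index` and `facet.sort=count` parameters. The variable names are made up for this example.

```python
from collections import Counter

# Brand of each of the 10 example results, in rank order.
result_brands = (
    ["Otterbox"] * 2
    + ["Incipio"] * 3
    + ["Apple"]
    + ["AAA Phone Cases"] * 4
)

# Number of results per facet value.
facet_counts = Counter(result_brands)

# Name sort: alphabetical order of facet values (Solr: facet.sort=index).
name_sorted = sorted(facet_counts)

# Count sort: descending result count per facet value (Solr: facet.sort=count).
# Ties are broken alphabetically here; that tie-break is an assumption.
count_sorted = sorted(facet_counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(name_sorted)
print(count_sorted)
```

Both orderings put AAA Phone Cases first, even though its products sit at the bottom of the result list.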
Name Sort
Brands
• AAA Phone Cases
• Apple
• Incipio
• Otterbox
Count Sort
Brands
• AAA Phone Cases (Count: 4)
• Incipio (Count: 3)
• Otterbox (Count: 2)
• Apple (Count: 1)
JSON Facets
• We can sort on a value associated with a facet
• Values must be written to an indexed field
• Let’s add a static score to the mix and sort on that!
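A minimal sketch of what such a request could look like with Solr's JSON Facet API, built as a Python dictionary. The field names `brand` and `static_score`, the label `top_score`, and the choice of `max` as the per-bucket aggregate are all assumptions for illustration; the slides do not say which fields or aggregate were used.

```python
import json

# Hypothetical schema: "brand" holds the facet value and "static_score" is an
# indexed numeric field carrying the static score of each product.
request_body = {
    "query": "iPhone 6 Cases",
    "limit": 10,
    "facet": {
        "brands": {
            "type": "terms",
            "field": "brand",
            # Sort buckets by the aggregate defined in the nested "facet" block.
            "sort": "top_score desc",
            # Assumed aggregate: the highest static score within each brand bucket.
            "facet": {"top_score": "max(static_score)"},
        }
    },
}

# The serialized body would be POSTed to the collection's query handler.
print(json.dumps(request_body, indent=2))
```

The terms facet sorts its buckets by `top_score`, which is computed per bucket from the indexed field rather than from the query-time relevancy score.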
Search results for iPhone 6 Case
1. Otterbox iPhone Case, Brand: Otterbox, Score: 30
2. Otterbox iPhone Case, Brand: Otterbox, Score: 30
3. Generic iPhone Case, Brand: Incipio, Score: 20
4. Generic iPhone Case, Brand: Incipio, Score: 20
5. Generic iPhone Case, Brand: Incipio, Score: 20
6. iPhone 6s + Case, Brand: Apple, Score: 100
7. Off-Brand iPhone Case, Brand: AAA Phone Cases, Score: 1
8. Off-Brand iPhone Case, Brand: AAA Phone Cases, Score: 1
9. Off-Brand iPhone Case, Brand: AAA Phone Cases, Score: 1
10. Off-Brand iPhone Case, Brand: AAA Phone Cases, Score: 1
Static Score Sort
Brands
• Apple (Score: 100)
• Otterbox (Score: 30)
• Incipio (Score: 20)
• AAA Phone Cases (Score: 1)
Results – Built-In Sorting Methods
Methods compared: Name, Count, Static Score
Criteria:
• Top facets are relevant to query
• Top facet order reflects relevancy
• Easy to maintain over time
• Acceptable latency in production
Overview
• Introduction To Facets
• Built-In Facet Sorting Methods
• Relevancy-Based Facet Sorting Methods
Score Sort
• Try sorting on score:
• msg: "undefined field: "score"", org.apache.solr.common.SolrException: undefined field: "score" at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1231)
• Not supported out of the box
• How could we add support for this?
Custom Collector Logic
• Could be implemented via a custom collector
• Would alter select facets
• Would require extra effort when performing Solr upgrades
• Could have a negative performance impact
• Might need additional logic to support grouping/collapsing
API Wrapper
• Run an API wrapper around Solr
• Re-sort facets in wrapper
• Easy to add custom business rules
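One way the re-sorting step inside such a wrapper might look is sketched below. It assumes the wrapper already holds the returned documents with their relevancy scores, and that a facet value is ranked by the best score among its documents. The function name, the document layout, and the scores are all hypothetical; the slides do not specify how the wrapper aggregates scores.

```python
def resort_facets_by_score(docs, facet_field):
    """Order facet values by the best result score seen for each value."""
    best = {}
    for doc in docs:
        value = doc[facet_field]
        best[value] = max(best.get(value, float("-inf")), doc["score"])
    # Highest score first; ties broken alphabetically (an assumption).
    return sorted(best, key=lambda v: (-best[v], v))


# Made-up relevancy scores that follow the rank order of the example results.
docs = [
    {"brand": "Otterbox", "score": 9.1},
    {"brand": "Otterbox", "score": 8.7},
    {"brand": "Incipio", "score": 7.9},
    {"brand": "Incipio", "score": 7.5},
    {"brand": "Incipio", "score": 7.2},
    {"brand": "Apple", "score": 6.8},
    {"brand": "AAA Phone Cases", "score": 3.1},
    {"brand": "AAA Phone Cases", "score": 3.0},
    {"brand": "AAA Phone Cases", "score": 2.9},
    {"brand": "AAA Phone Cases", "score": 2.8},
]

facets = resort_facets_by_score(docs, "brand")
print(facets)
```

Because the ordering happens outside Solr, a business rule such as pinning or demoting a brand could be applied to the returned list before it is sent to the client.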
Blended Approach
• Use both result scores and user data
• Use machine learning to blend the scores together
Otterbox: 30 user clicks
Incipio: 50 user clicks
Apple: 10 user clicks
AAA Phone Cases: 1 user click
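To make the idea of blending concrete, here is a small sketch that combines a per-brand result score with the click counts above. It uses a fixed-weight linear blend as a stand-in for the machine-learned model the slides mention; in practice the weights would be learned. The result scores, the weights, and the function names are assumptions for illustration.

```python
# Hypothetical best result score per brand (not from the slides).
result_scores = {
    "Otterbox": 9.1,
    "Incipio": 7.9,
    "Apple": 6.8,
    "AAA Phone Cases": 3.1,
}

# User click counts per brand, taken from the example above.
user_clicks = {
    "Otterbox": 30,
    "Incipio": 50,
    "Apple": 10,
    "AAA Phone Cases": 1,
}


def normalize(values):
    """Scale values into [0, 1] by dividing by the largest one."""
    top = max(values.values(), default=0)
    if top == 0:
        return {k: 0.0 for k in values}
    return {k: v / top for k, v in values.items()}


def blend(scores, clicks, w_score=0.5, w_clicks=0.5):
    """Fixed-weight linear blend; a learned model would replace these weights."""
    s = normalize(scores)
    c = normalize(clicks)
    return {k: w_score * s[k] + w_clicks * c.get(k, 0.0) for k in s}


blended = blend(result_scores, user_clicks)
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)
```

With equal weights, Incipio's high click count lifts it above Otterbox, while AAA Phone Cases stays last on both signals.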
Impact of API Wrapper
• Coverage of significant user queries
• Can be used with grouping
• Most calculations are done offline
• No major impact on search latency
• 99% response time impact of less than 5 ms
Results – Relevancy-Based Sorting Methods
Methods compared: Score – Custom Collector, Score – API Wrapper, Blended
Criteria:
• Top facets are relevant to query
• Top facet order reflects relevancy
• Easy to maintain over time
• Acceptable latency in production
Conclusions
• Built-in facet sorting methods are not always optimal for relevancy
• Sorting facets based on result score can improve relevancy
• Integrating external signals (such as user data) makes the solution more robust
We’re Hiring!
• Search Hackers
• Data Scientists
• NLP Gurus
• Machine Learning Hobbyists
• Deep Learning Knights
• Apache Committers
Please visit rakuten.careers