See conference video - http://www.lucidimagination.com/devzone/events/conferences/revolution/2011 In this talk we present three use cases of Solr in the travel industry. First, we describe how we implemented faceted navigation for hotel shopping. Then we explain how we implemented destination search features such as auto-complete and misspelling correction. Lastly, we show how we integrated Solr to provide a better experience for mobile users.
Using Solr in Online Travel to Improve User Experience
Sudhakar Karegowdra, Esteban Donato
Travelocity, May 25, 2011
{sudhakar.karegowdra, esteban.donato}@travelocity.com
What We Will Cover
Travelocity
Speakers Background
Merchandising & Solr
• Challenges
• Solution
• Sizing and performance data
• Take Away
Location Resolution & Solr
• Challenges
• Solution
• Sizing and performance data
• Take Away
Q&A
First online travel agency (OTA), launched in 1996
Grown to 3,000 employees and is one of the largest travel agencies worldwide
Headquartered in Dallas/Fort Worth, with satellite offices in San Francisco, New York, London, Singapore, Bangalore, and Buenos Aires, to name a few
In 2004, the Roaming Gnome became the centerpiece of marketing efforts and has become an international pop icon
Owned by Sabre Holdings; sister companies include Travelocity Business, IgoUgo.com, lastminute.com, and Zuji, among others
Speakers Background
Esteban Donato
• Lead Architect, Travelocity.com
My experience
– 10+ years
– Solr: 2 years
– Analyzing Mahout and Carrot2 for a document clustering engine
Topic: Location Resolution
Sudhakar Karegowdra
• Principal Architect, Travelocity.com
My experience
– 13+ years
– Solr/Lucene: 3 years
– Implementing Hadoop, Pig, and Hive for the data warehouse
Topic: Merchandising
Merchandising By Sudhakar Karegowdra
The Challenge
Market Drivers
• Build landing pages with faceted navigation
• Enable content segmentation and delivery
• Support rollout of promotions
• Roll up data to a higher level
E.g., "all 5-star hotels in California" should bring together the 5-star hotels from SFO, LAX, SAN, etc.
• Faster time to market for new ideas
• Rapidly scale to accommodate global brands with disparate data sources
The Challenge
Traditional database approach
• Higher time to market
• Specialized skill set needed to design and optimize database structures and queries
• Aggregating data and changing structures is quite complex
• Building faceted navigation needs complex logic, leading to high maintenance cost
Solution - Overview
Data from various sources is aggregated and ingested into Solr
• Core per locale and product type
A wrapper service combines some data across product cores and manages configuration rules
Solr's built-in search and faceting power the navigation
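As a rough illustration of how a wrapper service might address a core per locale and product type and ask Solr for facets, here is a minimal sketch. The host, the core naming scheme (`hotel_en_US`), and the field names (`state`, `star_rating`, `amenity`, `city`) are assumptions for illustration; only the request parameters (`q`, `fq`, `rows`, `facet`, `facet.field`, `facet.mincount`) are standard Solr parameters.

```python
from urllib.parse import urlencode

# Hypothetical host; the talk does not show the real deployment.
SOLR_BASE = "http://localhost:8983/solr"


def core_name(product: str, locale: str) -> str:
    """Build a core name such as 'hotel_en_US' (the naming scheme is an assumption)."""
    return f"{product}_{locale}"


def facet_query_url(product, locale, filters, facet_fields, rows=10):
    """Return a /select URL that filters on `filters` and facets on `facet_fields`."""
    params = [("q", "*:*"), ("rows", str(rows)), ("facet", "true"), ("facet.mincount", "1")]
    # One fq per constraint, so each filter can be cached independently by Solr.
    params += [("fq", f'{field}:"{value}"') for field, value in filters.items()]
    params += [("facet.field", field) for field in facet_fields]
    return f"{SOLR_BASE}/{core_name(product, locale)}/select?{urlencode(params)}"


# Example: 5-star hotels in California, with facet counts by amenity and city.
url = facet_query_url(
    "hotel", "en_US",
    {"state": "California", "star_rating": "5"},
    ["amenity", "city"],
)
print(url)
```

Filtering on a state-level field rather than on individual airports is one way the "roll up to a higher level" requirement could be met; whether the original schema carried such a field is not stated.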
Solution – Architecture View
[Architecture diagram. Components shown: Oracle, Offer Management Tool, ETL, Solr Master (Multi Core), Solr Slaves (Multi Core), Services/Business Logic, UI Widgets, Mobile. Content types shown: Deals, Products, …]
Solution - Achievements
Millions of unique long-tail landing pages, e.g.,
http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green
Faster search across products, e.g., beach deals under $500
Segmented content delivery through tagging
Scaled well to distribute the content to different brands, partners, and advertisers
Opened up other innovative applications: Deals on Map, Deals on Mobile, Wizards, etc.
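The example landing-page URL suggests that facet selections are encoded in the page slug. The sketch below shows one way such a slug could be turned into Solr filter queries. The convention that `_` separates the destination part from each facet tag is inferred from that single URL, and the `tag` field name is hypothetical; neither is confirmed by the talk.

```python
def parse_landing_slug(slug: str) -> dict:
    """Split a landing-page slug into a destination part and facet tags.

    Assumes '_' separates the destination from each facet tag; this
    convention is inferred from one example URL, not documented.
    """
    destination, *tags = slug.split("_")
    return {"destination": destination, "tags": tags}


def tags_to_filter_queries(tags, field="tag"):
    """Turn facet tags into Solr fq clauses on a hypothetical `tag` field."""
    return [f'{field}:"{t}"' for t in tags]


parsed = parse_landing_slug(
    "hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green"
)
print(parsed["destination"])
print(tags_to_filter_queries(parsed["tags"]))
```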
Solution – Road Ahead
Migration to Solr 3.1
• Geospatial search
• CSV output format
Query boosting by search pattern
Near-real-time updates
Deal and user behavior mining in Hadoop – MapReduce and Solr to serve the content
Move slaves to the cloud
Sizing & Performance
Index stats
• Number of cores: 25
• Number of documents: ~1 million records
Response
• Requests: 70 TPS
• Average response time: 0.005 seconds (5 ms)
Software versions
• Solr 1.4.0
– filterCache size: 30000
• Tomcat 5.5.9
• JDK 1.6
Take Away
Semi Structured Storage in Solr helps aggregate disparate sources easily
Remember Dynamic fields
Multiple Cores to manage multiple locale data
Solr is a great enabler of “Innovations”
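To make the dynamic-fields point concrete, the sketch below maps records from differently shaped sources onto Solr documents by appending a type suffix to each field name. The suffixes (`_b`, `_i`, `_f`, `_s`) follow the convention of Solr's example schema; the schema actually used for merchandising is not shown in the talk, so treat them, and the sample records, as assumptions.

```python
def to_solr_doc(record: dict, id_field: str = "id") -> dict:
    """Map an arbitrary source record to a Solr document using dynamic-field suffixes."""
    doc = {"id": str(record[id_field])}
    for key, value in record.items():
        if key == id_field:
            continue
        if isinstance(value, bool):       # test bool first: bool is a subclass of int
            suffix = "_b"
        elif isinstance(value, int):
            suffix = "_i"
        elif isinstance(value, float):
            suffix = "_f"
        else:                             # anything else is indexed as a string
            suffix, value = "_s", str(value)
        doc[key + suffix] = value
    return doc


# Two made-up records with different shapes end up in the same index
# without any schema change.
hotel = to_solr_doc({"id": 4980, "name": "Example Hotel", "stars": 5, "pool": True})
deal = to_solr_doc({"id": "d-17", "title": "Beach deal", "price": 499.0})
print(hotel)
print(deal)
```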
Location Resolution By Esteban Donato
The Challenge
How to develop a global location resolution service?
Flexible to change
General enough to cover everyone's needs
Multi-language
Performance and scalability
Configurable by site
Architecture of the solution
[Architecture diagram. Components shown: Location DB, Management Tool, Batch Job, Solr Master, Solr Slave, Auto-complete, Resolution]
Remote streaming indexing, CSV format
Master/slave architecture
Multi-core: each core represents a language
SolrJ client, binary format
Solr response cache
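A batch job that indexes through remote streaming in CSV format could build its requests roughly as follows. This is a sketch under assumptions: the master host, the `locations_<language>` core names, and the export paths are hypothetical. The `/update/csv` handler and the `stream.file`, `stream.contentType`, and `commit` parameters are standard in Solr, and remote streaming has to be enabled in `solrconfig.xml` for `stream.file` to be accepted.

```python
from urllib.parse import urlencode

MASTER = "http://solr-master:8983/solr"   # hypothetical master host


def csv_update_url(language: str, csv_path: str) -> str:
    """URL asking the master core for `language` to load a CSV file from its own disk."""
    params = {
        "stream.file": csv_path,                            # Solr reads the file itself
        "stream.contentType": "text/plain;charset=utf-8",
        "commit": "true",                                   # make the new data searchable after loading
    }
    return f"{MASTER}/locations_{language}/update/csv?{urlencode(params)}"


# One request per language core.
urls = [csv_update_url(lang, f"/data/export/locations_{lang}.csv") for lang in ("en", "es")]
for u in urls:
    print(u)
```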
Auto-complete
The system has to suggest options as users type their desired location
Examples: “san” => San Francisco, “veg” => Las Vegas
Relevancy: not all locations are equally important. “par” => “Paris, France”; “Parana, Argentina”
Users can search by various fields: location code, location name, city code, city name, state/province code, state/province name, country code, country name.
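The behavior described above (match any word of a location by prefix, then order by importance) can be sketched in memory as follows. This is not the Solr implementation from the talk; the location list and the rank values are made up, and the tokenization only loosely mirrors the analyzer shown on the schema slide.

```python
import re

# Toy data: (display name, rank). Names and ranks are made up for illustration.
LOCATIONS = [
    ("Paris, France", 95),
    ("Parana, Argentina", 40),
    ("San Francisco, CA, USA", 90),
    ("Las Vegas, NV, USA", 88),
]


def tokens(text: str) -> list:
    """Lowercase, drop commas and periods, and split on separators."""
    cleaned = text.lower().replace(",", "").replace(".", "")
    return [t for t in re.split(r"[/\-\t ]+", cleaned) if t]


def suggest(prefix: str, limit: int = 5) -> list:
    """Return locations having any token that starts with `prefix`, best rank first."""
    p = prefix.lower().strip()
    hits = [(rank, name) for name, rank in LOCATIONS
            if any(t.startswith(p) for t in tokens(name))]
    return [name for rank, name in sorted(hits, key=lambda h: -h[0])][:limit]


print(suggest("par"))   # ['Paris, France', 'Parana, Argentina']
print(suggest("veg"))   # ['Las Vegas, NV, USA']
```

Matching on tokens rather than on the whole string is what lets "veg" find Las Vegas; sorting by rank is what puts Paris ahead of Parana.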
Solr schema
<dynamicField name="RANK*" type="int" required="false" indexed="true" stored="true" />
<field name="GLS_FULL_SEARCH" type="glsSearchField" required="false" indexed="true" stored="false" multiValued="true" />
<fieldType name="glsSearchField" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.PatternTokenizerFactory" pattern="[/\-\t ]+" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.TrimFilterFactory" />
<filter class="solr.ISOLatin1AccentFilterFactory" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
<filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
</analyzer>
</fieldType>
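To see what this analyzer chain does to an input string, the following Python sketch emulates it approximately. It is not Solr code: accent folding is approximated with Unicode decomposition instead of the ISO Latin-1 mapping, `RemoveDuplicatesTokenFilterFactory` is omitted because it only drops identical tokens at the same position, and empty tokens are discarded for readability.

```python
import re
import unicodedata


def fold_accents(token: str) -> str:
    """Approximate ISOLatin1AccentFilter: decompose and drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def analyze(text: str) -> list:
    """Rough emulation of the glsSearchField analyzer chain."""
    out = []
    for tok in re.split(r"[/\-\t ]+", text):   # PatternTokenizerFactory
        tok = tok.lower().strip()              # LowerCaseFilter, TrimFilter
        tok = fold_accents(tok)                # ISOLatin1AccentFilter (approximation)
        tok = re.sub(r"[,.]", "", tok)         # PatternReplaceFilter
        if tok:                                # skip tokens that end up empty
            out.append(tok)
    return out


print(analyze("São Paulo, SP/Brazil"))   # ['sao', 'paulo', 'sp', 'brazil']
```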
Resolution
The system has to resolve the location requested by the user.
Handles aliases: Big Apple => New York
Handles ambiguities
Handles misspellings: Lomdon => London
• NGramDistance algorithm
• How to combine distance with relevancy
• Errors suggesting the correct location when the input is a prefix: Lond => London
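The two difficulties named above (combining string distance with relevancy, and prefixes resolving to the wrong location) can be illustrated with a small sketch. It uses a Dice coefficient over character bigrams, which is a simplified stand-in and not Lucene's `NGramDistance`; the candidate list, the 0-100 rank scale, and the `weight` parameter are all assumptions.

```python
from collections import Counter


def bigrams(word: str) -> Counter:
    """Multiset of character bigrams, padded so the first and last letters count."""
    padded = f"^{word.lower()}$"
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def similarity(a: str, b: str) -> float:
    """Dice coefficient over bigrams: 1.0 for identical strings, 0.0 for no overlap."""
    ga, gb = bigrams(a), bigrams(b)
    shared = sum((ga & gb).values())
    return 2 * shared / (sum(ga.values()) + sum(gb.values()))


def resolve(query, candidates, weight=0.8):
    """Pick the candidate maximizing weight*similarity + (1-weight)*rank/100."""
    def score(item):
        name, rank = item
        return weight * similarity(query, name) + (1 - weight) * rank / 100
    return max(candidates, key=score)[0]


CANDIDATES = [("London", 95), ("Londrina", 30), ("Lomond", 5)]   # made-up ranks, 0-100

print(resolve("Lomdon", CANDIDATES))             # misspelling
print(resolve("Lond", CANDIDATES))               # prefix, similarity blended with rank
print(resolve("Lond", CANDIDATES, weight=1.0))   # prefix, similarity only
```

In this toy data the similarity-only call resolves "Lond" to "Lomond", because the short input shares all of its bigrams with that candidate. Blending in rank steers the result back to "London". This is one possible mitigation, not necessarily the one the authors chose.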
Spellchecker configuration
<fieldType name="spellcheckType" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.TrimFilterFactory" />
<filter class="solr.ISOLatin1AccentFilterFactory" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
<filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
</analyzer>
</fieldType>
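A client asking Solr's spellcheck component for corrections might build its request as sketched below. The `/spell` handler path and the `locations_en` core name are assumptions (the talk does not show the request handler configuration); `spellcheck`, `spellcheck.q`, and `spellcheck.count` are standard spellcheck parameters.

```python
from urllib.parse import urlencode


def spellcheck_url(base: str, core: str, user_input: str, count: int = 5) -> str:
    """URL for a spellcheck lookup; handler path and core name are assumptions."""
    params = {
        "q": user_input,
        "spellcheck": "true",
        "spellcheck.q": user_input,        # with KeywordTokenizer the whole input is one term
        "spellcheck.count": str(count),    # maximum number of suggestions to return
        "rows": "0",                       # only suggestions are needed, not documents
    }
    return f"{base}/{core}/spell?{urlencode(params)}"


url = spellcheck_url("http://localhost:8983/solr", "locations_en", "Lomdon")
print(url)
```

Because the field type uses `KeywordTokenizerFactory`, a multi-word location name is indexed as a single term, so suggestions are made for the whole name and not word by word.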
Sizing & Performance
4 cores with ~500,000 documents indexed each
Response times
• Auto-complete: 15 ms, 20 TPS
• Resolution: 10 ms, 2 TPS
Cache configuration
• queryResultCache: maxSize=1024
• documentCache: maxSize=1024
• fieldValueCache and filterCache disabled
Wrap Up
Performance always as a top priority
Develop simple but robust services
Provide a simple API
Q&A
Contact
Esteban Donato
• Esteban.donato@travelocity.com
• Twitter: @eddonato
Sudhakar Karegowdra
• Sudhakar.karegowdra@travelocity.com
• Twitter: @skaregowdra
https://www.facebook.com/travelocity
Twitter: @travelocity and @RoamingGnome