lucenerevolution
DESCRIPTION
Presented by John Marc Imbrescia, Senior Software Engineer, Etsy.com. Etsy recently chose to bring our location services in house. We used the open source GeoNames data set and built the tools we needed to let members select their location, show translations of place names, and feed data into our search database for local, regional, and country-based searches. This talk will cover the implementation details and the decisions we made along the way: how we mapped places from our old data set to the GeoNames data; the internal tools we built, including a Solr core for place name autosuggest; and the modifications to our Listings Search and Shop Search cores, along with the different ways we use location-based search around the site, both distance-based and region-based using GeoNames hierarchy data. There will also be a discussion of our choice to release some of the tools we built for this project as open source, and of the decisions behind the non-search (display, etc.) elements of the project, the tools we chose for them, and why.
Internalizing location services with GeoNames
John Marc Imbrescia
Senior Software Engineer - Etsy
Wednesday, May 1, 13
Internalizing location services with GeoNames
May 2 2013
The world’s online handmade marketplace.
What is Etsy.com?
What is Etsy.com?
• 20 million unique items
• 18 million daily item searches
• 800,000 sellers
• 28 million unique views per month
• Developer blog: codeascraft.etsy.com
• 450 worldwide employees
Our Problem
Location names were only in English
• Search based on English names
• Display and search needed to be i18n friendly
• API limits and speed concerns meant we needed a new solution
What do we use Location for?
More than just search
• Display
• Local Search
• No Mapping
• No Bounding boxes
What do we use Location for?
Item Search
What do we use Location for?
Location Display
How did this use to work?
• Yahoo API
• Every lookup was an API call
• Stored user input and API response
• Searched based on text match of API response
• Not radius using lat/lon
• No way to internationalize
What services did Etsy need to internalize location services?
• Lookup - Autosuggest
• Update - Scripts to refresh data
• Display - Built into the PHP stack
• Search - Existing, modified for new pattern
What we have now
• GeoNames as a data source
• Feeds “geonamessuggest” Solr core
• SQLite database for place name lookup
• GeoName IDs used for local search
• Leverages GeoName hierarchy data
• Built-in internationalization
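The SQLite place name lookup could look roughly like the sketch below. The slides do not show the real table layout, so the `place_names` table, its columns, the `display_name` helper, and the numeric IDs are all hypothetical placeholders; only the idea (look up a localized name by GeoName ID, falling back to English) comes from the talk.

```python
import sqlite3

# Hypothetical schema and placeholder IDs: the talk does not show the real layout.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE place_names (
        geoname_id INTEGER NOT NULL,
        lang       TEXT    NOT NULL,
        name       TEXT    NOT NULL,
        PRIMARY KEY (geoname_id, lang)
    );
    """
)
conn.executemany(
    "INSERT INTO place_names VALUES (?, ?, ?)",
    [
        (10, "en", "Berlin"),
        (20, "en", "Germany"),
        (20, "de", "Deutschland"),
        (20, "fr", "Allemagne"),
    ],
)


def display_name(conn, geoname_id, lang, fallback="en"):
    """Return the name in `lang`, else in `fallback`, else None."""
    row = conn.execute(
        "SELECT name FROM place_names "
        "WHERE geoname_id = ? AND lang IN (?, ?) "
        # Rows in the requested language sort first (comparison yields 1 or 0).
        "ORDER BY lang = ? DESC LIMIT 1",
        (geoname_id, lang, fallback, lang),
    ).fetchone()
    return row[0] if row else None
```

Because the lookup is a local file read rather than a remote API call, it can run on every page render; that is presumably why the database was pushed out to the webservers.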
How did we get here?
• Mapped old locations to GeoNames
• Added GeoName ID hierarchy to listing search
• Pushed out SQLite database to webservers
• Slowly transitioned lookup and search services
• Did side-by-side testing to look for anomalies
What are the data types?
GeoNames
Schemas
GeoNames
• 775k entries
• 1.4m alternate spellings
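The schema slide itself did not survive extraction. As a rough illustration, the public GeoNames "geoname" dump is a tab-separated file with 19 columns, and a row can be parsed as sketched below. The `Place` type, the `parse_geoname_line` helper, and the sample row are made up for this example and are not Etsy's actual loader or real GeoNames data.

```python
from typing import NamedTuple, Tuple

# Column order of the public GeoNames "geoname" dump (tab-separated).
COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude", "longitude",
    "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


class Place(NamedTuple):
    """The subset of fields this sketch cares about (hypothetical type)."""
    geoname_id: int
    name: str
    alternate_names: Tuple[str, ...]
    latitude: float
    longitude: float
    country_code: str
    population: int


def parse_geoname_line(line):
    """Parse one tab-separated dump line into a Place."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    alternates = row["alternatenames"]
    return Place(
        geoname_id=int(row["geonameid"]),
        name=row["name"],
        alternate_names=tuple(alternates.split(",")) if alternates else (),
        latitude=float(row["latitude"]),
        longitude=float(row["longitude"]),
        country_code=row["country_code"],
        population=int(row["population"] or 0),
    )


# A made-up row for illustration only; not real GeoNames data.
SAMPLE = "\t".join([
    "1000001", "Exampleville", "Exampleville", "Exampleton,Ville d'Exemple",
    "40.5", "-73.25", "P", "PPL", "US", "", "NY", "", "", "",
    "12345", "", "20", "America/New_York", "2013-01-01",
])
place = parse_geoname_line(SAMPLE)
```

GeoNames also distributes alternate names as a separate, language-tagged file, which is the likelier source for the localized spellings counted on this slide.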
Geonamessuggest
Our autosuggest for place names
• Localized
• GeoIP
• Population
Geonamessuggest
Schema
Distance and population come first
Sort function
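The actual sort function was on a slide image that is not recoverable here. The sketch below only illustrates the stated idea: rank autosuggest candidates so that places near the user's GeoIP position and places with large populations come first. The scoring formula, its 1000 km scale, and all names are assumptions for illustration, not the function Etsy used in Solr.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def suggestion_score(place, user_lat, user_lon):
    """Higher is better. Hypothetical formula: log-scaled population,
    damped by the distance from the user's (GeoIP-derived) position."""
    distance = haversine_km(user_lat, user_lon, place["lat"], place["lon"])
    return math.log10(place["population"] + 1) / (1.0 + distance / 1000.0)


def rank(candidates, user_lat, user_lon):
    """Sort candidate places from best to worst suggestion."""
    return sorted(
        candidates,
        key=lambda p: suggestion_score(p, user_lat, user_lon),
        reverse=True,
    )


# Approximate coordinates and populations, for illustration only.
CANDIDATES = [
    {"name": "Paris, France", "lat": 48.86, "lon": 2.35, "population": 2_100_000},
    {"name": "Paris, Texas", "lat": 33.66, "lon": -95.55, "population": 25_000},
]
near_dallas = rank(CANDIDATES, 32.78, -96.80)
near_london = rank(CANDIDATES, 51.50, -0.13)
```

With these inputs a user near Dallas sees the Texas town first, while a user near London sees the French capital first, which is the behaviour the slide title describes.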
GeonameId Hierarchy
Local listing search
• Each listing gets a hierarchy of geonameids
• Local search is a filter on this ID
• Fast & Reliable
• Enables powerful functionality
• Kept old data fields
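One way to picture the hierarchy: each listing stores its own place ID plus every ancestor ID, so a search for a city, region, or country is the same membership test. The sketch below uses a hypothetical parent map with placeholder IDs; the helper names are not from the talk.

```python
# Hypothetical parent map (child geonameid -> parent geonameid), placeholder IDs.
PARENT = {
    10: 20,  # city    -> region
    11: 20,  # city    -> region
    12: 21,  # city    -> other region
    20: 30,  # region  -> country
    21: 30,  # region  -> country
}


def ancestry(geoname_id, parent):
    """Return [geoname_id, its parent, grandparent, ...] up to the root."""
    chain, seen = [], set()
    current = geoname_id
    while current is not None and current not in seen:  # guard against cycles
        chain.append(current)
        seen.add(current)
        current = parent.get(current)
    return chain


def index_listing(listing_id, geoname_id, parent):
    """Build the document that would be indexed for one listing."""
    return {"listing_id": listing_id, "geoname_ids": ancestry(geoname_id, parent)}


def local_search(listings, place_id):
    """Filter listings to those located in `place_id` or anywhere below it."""
    return [doc["listing_id"] for doc in listings if place_id in doc["geoname_ids"]]


LISTINGS = [
    index_listing("A", 10, PARENT),
    index_listing("B", 11, PARENT),
    index_listing("C", 12, PARENT),
]
```

Searching on region `20` returns listings A and B, and searching on country `30` returns all three, without any distance math or bounding boxes at query time.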
GeonameId Collection
Local listing search
• Each listing gets a hierarchy of geonameids
• Local search is a filter query on this ID
• Fast & Reliable
• Enables powerful functionality
• Kept old data fields
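In Solr terms, a filter query is passed with the standard `fq` parameter alongside the main `q` query. A minimal sketch of building such a request's query string follows; the multi-valued field name `geoname_ids` and the helper name are assumptions, since the listing schema is not shown in the slides.

```python
from urllib.parse import urlencode


def local_search_params(keywords, geoname_id, field="geoname_ids"):
    """Build a Solr query string: keyword search restricted to one place ID.

    `field` is a hypothetical multi-valued field holding each listing's
    geonameid hierarchy.
    """
    return urlencode(
        {
            "q": keywords,
            "fq": f"{field}:{int(geoname_id)}",  # filter only, no effect on scoring
            "wt": "json",
        }
    )


query_string = local_search_params("ceramic mug", 20)
```

Since `fq` clauses do not contribute to relevance scoring and Solr caches their results separately from the main query, repeated searches within the same place can reuse the cached filter, which fits the "Fast & Reliable" bullet.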
CONTACT
John Marc Imbrescia