SVC101 Building Search into Your App - AWS re:Invent 2012

DESCRIPTION

Amazon CloudSearch is a fully managed search service in the cloud that lets customers easily integrate fast, highly scalable search functionality into their applications. In this session, we cover the basics of search and search engines, take an introductory look at CloudSearch, and then dive deep into how to build a CloudSearch-based web application.

Search experience = user retention and revenue

[Architecture diagram: DNS / Load Balancing and AWS Query in front of three services, each behind access control]

• Search Service (API, Console): Search Documents

• Document Service (API, Command Line Tools, Console): Add Documents, Update Documents, Delete Documents

• Config Service (API, Command Line Tools, Console): Create Domains, Configure Domains, Delete Domains

[Scaling diagram: a Search Domain drawn as a grid of search instances, each holding one copy of one index partition (Index Partition 1..n, Copy 1..n). The two inputs shown are DATA (document quantity and size) and TRAFFIC (search request volume and complexity).]
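The grid of partitions and copies can be read as simple arithmetic: more data means more index partitions, more traffic means more copies of each partition. The sketch below only illustrates that relationship. CloudSearch makes these scaling decisions itself, and the per-partition and per-copy capacities used here are made-up numbers, not service limits.

```python
import math


def plan_domain(total_docs, docs_per_partition, peak_qps, qps_per_copy):
    """Return (partitions, copies, instances) for a hypothetical search domain.

    docs_per_partition and qps_per_copy are illustrative capacities,
    not values published by the service.
    """
    # Data volume decides how many index partitions are needed.
    partitions = max(1, math.ceil(total_docs / docs_per_partition))
    # Query traffic decides how many copies of every partition are needed.
    copies = max(1, math.ceil(peak_qps / qps_per_copy))
    # Each search instance holds one copy of one partition.
    return partitions, copies, partitions * copies


if __name__ == "__main__":
    print(plan_domain(1_000_000, 400_000, 250, 100))
```

With these assumed capacities, one million documents at 250 queries per second would need three partitions and three copies, or nine instances.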

• The Challenge

• The Data: The Million Song Data Set

http://labrosa.ee.columbia.edu/millionsong/

• The Application

Field name        Description
artist_mbid       The musicbrainz.org ID
artist_name       Name of the artist
audio_md5         Hash code of the audio
danceability      According to The Echo Nest
duration          In seconds
loudness          General loudness of the track
song_hottnesss    According to Echo Nest
title             Song title
year              Song year

• Create an Amazon CloudSearch domain

• Identify use case and supporting data

• Upload data

• Configure the domain

• Improve document ranking

• Integrate with the front end

• Keep documents up-to-date

Artist Name

Song Title

Familiarity

Year

Genre

Artist

Year

Title

Artist Name

Genre

Artist Familiarity

Year

• Create an Amazon CloudSearch domain

• Identify use case and supporting data

• Prepare and upload data

• Configure the domain

• Improve document ranking

• Integrate with the front end

• Keep documents up-to-date

Million Song Dataset → SDF Batches → Amazon CloudSearch

SDF Batches

[
  {"type": "add",
   "id": "soaczam12ab0181559",
   "version": 5,
   "lang": "en",
   "fields": {
     "title": "Ruby Tuesday",
     "artist_name": "The Rolling Stones",
     "year": "1967",
     "artist_familiarity": 864830,
     "genre": ["alternative", "ambient", "dance",
               "electronic", "pop", "r&b", "reggae"]
   }
  },
  …
]
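A batch like the one above can be generated from song records with a few lines of code. The following is a minimal sketch, not the tooling used in the session: the helper names (`make_add`, `make_delete`, `to_sdf_batch`) are hypothetical, the delete operation follows the 2011-02-01 SDF format as I understand it, and the lowercasing of IDs is an assumption based on the lowercase IDs shown on the slide.

```python
import json


def make_add(doc_id, version, fields, lang="en"):
    """Build one SDF 'add' operation (also used to update a document)."""
    return {
        "type": "add",
        "id": doc_id.lower(),  # assumption: IDs are stored lowercase
        "version": version,    # a higher version supersedes an older one
        "lang": lang,
        "fields": fields,
    }


def make_delete(doc_id, version):
    """Build one SDF 'delete' operation."""
    return {"type": "delete", "id": doc_id.lower(), "version": version}


def to_sdf_batch(operations):
    """Serialize a list of operations as a JSON SDF batch."""
    return json.dumps(operations, ensure_ascii=False)


song = {
    "title": "Ruby Tuesday",
    "artist_name": "The Rolling Stones",
    "year": "1967",
    "artist_familiarity": 864830,
    "genre": ["alternative", "ambient", "dance",
              "electronic", "pop", "r&b", "reggae"],
}
batch = to_sdf_batch([
    make_add("SOACZAM12AB0181559", 5, song),
    make_delete("sontsst12cf5f88b42", 6),  # hypothetical delete, for illustration
])

if __name__ == "__main__":
    print(batch)
```

The resulting string is what would be uploaded to the domain's document service; the upload step itself is left out of this sketch.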

• Text fields for matching user terms

• Result enabled to retrieve source data

• Literal fields for faceting

• Facet enabled to retrieve facet counts

• Search enabled for narrowing

• Integer fields for ranking, narrowing

PHP Integration

// Build the search request; URL-encode user input so spaces and "&" are safe.
$results = file_get_contents(
    "http://search-mn-songs-5bbplyghbb5tk257rsb7iamlsy." .
    "us-east-1.cloudsearch.amazonaws.com" .
    "/2011-02-01/search?q=" . urlencode($keyword) .
    "&return-fields=title,artist_name,year&" .
    "facet=artist_name,year_facet,genre&" .
    "rank=-" . urlencode($rank));

// Decode the JSON response into a PHP object.
$resultsObj = json_decode($results);
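For comparison, the same request URL can be assembled in Python. This is a sketch under assumptions: the endpoint and parameters are copied from the slide, while the function name `build_search_url` and its arguments are hypothetical. It only builds the URL and sends nothing, and it encodes the keyword so that input containing spaces or "&" cannot break the query string.

```python
from urllib.parse import urlencode

# Search endpoint as shown on the slide.
SEARCH_ENDPOINT = (
    "http://search-mn-songs-5bbplyghbb5tk257rsb7iamlsy."
    "us-east-1.cloudsearch.amazonaws.com/2011-02-01/search"
)


def build_search_url(keyword, rank="text_relevance", descending=True):
    """Return the search URL for a keyword, ranked by the given field."""
    params = {
        "q": keyword,
        "return-fields": "title,artist_name,year",
        "facet": "artist_name,year_facet,genre",
        # A leading "-" requests descending order, as in the PHP example.
        "rank": ("-" if descending else "") + rank,
    }
    # safe="," keeps the comma-separated field lists readable.
    return SEARCH_ENDPOINT + "?" + urlencode(params, safe=",")


if __name__ == "__main__":
    print(build_search_url("rolling stone"))
```

The returned URL can be fetched with any HTTP client and the body decoded as JSON.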

Simple Search Result

{"rank": "-text_relevance",
 "match-expr": "(label 'rolling stone')",
 "hits": { "found": 204, "start": 0,
   "hit": [ { "id": "sontsst12cf5f88b42" },
            { "id": "sopvopr12ab017f082" },
            { "id": "sorzrpw12ac468a13b" },
            ...
   ] },
 ...
}

Search Results With Return Values

"hit":
  [ { "id": "sontsst12cf5f88b42",
      "data": {
        "artist_familiarity": [ "925048" ],
        "artist_name": [ "The Rolling Stones" ],
        "text_relevance": [ "326" ],
        "title": [ "Heart Of Stone" ],
        "year": [ "1964" ]
      }
    },
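Because every returned field arrives as a list of strings, a front end usually flattens each hit before rendering it. The sketch below shows one way to do that, assuming the response shape shown on these slides. `extract_hits` is a hypothetical helper name, and the sample response is trimmed from the slide data.

```python
import json


def extract_hits(response_text):
    """Return (total_found, rows) from a search response body.

    Each row is a dict holding the document id plus the first value of
    every returned field. Hits without a "data" section yield only the id.
    """
    hits = json.loads(response_text)["hits"]
    rows = []
    for hit in hits.get("hit", []):
        row = {"id": hit["id"]}
        for name, values in hit.get("data", {}).items():
            # Return values are lists; keep the first entry for display.
            row[name] = values[0] if values else None
        rows.append(row)
    return hits["found"], rows


sample = """
{"rank": "-text_relevance",
 "hits": {"found": 204, "start": 0,
   "hit": [{"id": "sontsst12cf5f88b42",
            "data": {"artist_name": ["The Rolling Stones"],
                     "title": ["Heart Of Stone"],
                     "year": ["1964"]}},
           {"id": "sopvopr12ab017f082"}]}}
"""

if __name__ == "__main__":
    found, rows = extract_hits(sample)
    print(found, rows)
```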

Facets In Search Results

{…"hits": { … },
 "facets": {
   "genre": {
     "constraints": [
       { "value": "pop", "count": 126 },
       { "value": "rock", "count": 125 },
       { "value": "alternative", "count": 109 },
       { "value": "electronic", "count": 106 },
       { "value": "jazz", "count": 58 }, ...
     ] } }
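Facet counts like these typically feed a sidebar of "genre (count)" links. As a rough sketch, assuming the response has already been decoded into a dictionary with the shape shown above, the counts for one field can be pulled out as follows. `facet_counts` is a hypothetical helper, and the sample data is copied from the slide.

```python
def facet_counts(response, field):
    """Return [(value, count), ...] for one facet field, in response order."""
    constraints = (
        response.get("facets", {}).get(field, {}).get("constraints", [])
    )
    return [(c["value"], c["count"]) for c in constraints]


# Decoded response fragment, using the values from the slide.
response = {
    "facets": {
        "genre": {
            "constraints": [
                {"value": "pop", "count": 126},
                {"value": "rock", "count": 125},
                {"value": "alternative", "count": 109},
                {"value": "electronic", "count": 106},
                {"value": "jazz", "count": 58},
            ]
        }
    }
}

if __name__ == "__main__":
    for value, count in facet_counts(response, "genre"):
        print(f"{value} ({count})")
```

A field that was not requested as a facet simply yields an empty list, so the front end can skip that sidebar section.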

26ms

Get Started Now, Free Trial

We are sincerely eager to hear your feedback on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance.
