DynamoDB and Amazon CloudSearch

This talk covers how Amazon CloudSearch and Amazon DynamoDB can be used together, combining DynamoDB's throughput and durability with CloudSearch's rich, powerful search.

© 2012 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc.

Amazon CloudSearch Meetup San Francisco

March 28, 2013

Welcome

•  Housekeeping
•  Slides will be posted
•  Upcoming Events
   -  April 3, 2013. Webinar 10:00am – 11:00am. "Building Location-Based Search" https://www2.gotomeeting.com/register/264455602
   -  April 30, 2013. AWS Summit, San Francisco. http://aws.amazon.com/aws-summit-2013/san-francisco/

Agenda

6:30-7:30 Networking (over Pizza/Beer/Drinks)

7:00-7:30 Using CloudSearch with DynamoDB – Jon Handler, Amazon CloudSearch Solution Architect

7:30-8:00 EDU 2.0 User Talk - Graham Glass, Founder

8:00-8:30 Getting More from Your Search Queries: How to Tune Search Requests – Tom Hill, Amazon CloudSearch Solution Architect

8:30-8:45 SnapGuide User Talk - Sam Kimbrel, Senior Software Engineer

Using CloudSearch with DynamoDB

Jon Handler, Amazon CloudSearch Solution Architect

Do You Want Search With That?

•  Amazon DynamoDB
   -  High throughput, provisioned latency
   -  NoSQL model
•  Amazon CloudSearch
   -  Rich query API: free text, fielded, faceting and more
   -  User-customizable ranking

Amazon CloudSearch Overview

[Diagram labels: DNS / Load Balancing; AWS Query; Search API, Doc Svc API, Config API; Console; Command Line Tools; Search Service, Document Service, Config Service; Search Domain]

Automatic Scaling

[Diagram: a grid of search instances, each holding one copy of one index partition (Index Partition 1 to n, Copy 1 to n). One axis is labeled DATA (Document Quantity and Size), the other TRAFFIC (Search Request Volume and Complexity).]

The Big Picture

Working With DynamoDB and CloudSearch

•  Create an Amazon CloudSearch domain
•  Upload data from DynamoDB to CloudSearch
•  Configure: item attributes map to index fields
•  Keep CloudSearch and DynamoDB in sync

Create An Amazon CloudSearch Domain

Text fields for matching user terms

Result enabled to retrieve source data

Literal fields for Faceting

Facet enabled to retrieve facet counts

Search enabled for narrowing

Integer fields for ranking, narrowing
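
The three field types captioned above can be summarized for the movie data as a small plan. This is only an illustrative sketch: `FIELD_PLAN`, its option names (`search`, `facet`, `result`), and `fields_with` are hypothetical names, not a CloudSearch API, and the assignment of attributes to types is assumed from the captions.

```python
# Hypothetical field plan for the movie example; not a CloudSearch API call.
FIELD_PLAN = {
    # Text fields match user terms; result-enabled so source data comes back.
    "title":    {"type": "text", "result": True},
    "director": {"type": "text", "result": True},
    "actor":    {"type": "text", "result": True},
    # Literal field: facet-enabled for counts, search-enabled for narrowing.
    "genre":    {"type": "literal", "facet": True, "search": True},
    # Integer field for ranking and narrowing.
    "year":     {"type": "uint"},
}


def fields_with(option, plan=FIELD_PLAN):
    """Return the sorted names of fields that have the given option enabled."""
    return sorted(name for name, cfg in plan.items() if cfg.get(option))


facet_fields = fields_with("facet")
result_fields = fields_with("result")
```

Writing the plan down before configuring the domain makes it easy to check that every DynamoDB attribute you care about has a matching index field.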

Configure the Domain

Amazon DynamoDB Data

Primary Key: id = "tt0076759"
Attributes: title = "Star Wars", director = "Lucas, George", genre = {"Action","Adventure","Fantasy","Sci-Fi"}, actor = {"Ford, Harrison","Fisher, Carrie","Hamill, Mark","Jones, James Earl","Guinness, Alec","Johnston, Joe","Mayhew, Peter","Cushing, Peter","Prowse, David","Daniels, Anthony"}, year = 1977

Primary Key: id = "tt1411664"
Attributes: title = "Born to Be a Star", director = "Brady, Tom", genre = {"Comedy"}, actor = {"Ricci, Christina","Swardson, Nick","Dorff, Stephen","Johnson, Don","Bain, Robin","Herrmann, Edward","Goodman, Dana","Giangrande, Meredith","Dawn, Nadia","Locke, Tembi","Herschman, Adam"}, year = 2011

CloudSearch SDF

    [{"type": "add",
      "id": "tt0076759",
      "version": 1,
      "lang": "en",
      "fields": {
        "title": "Star Wars",
        "director": "Lucas, George",
        "genre": ["Action","Adventure","Fantasy","Sci-Fi"],
        "actor": ["Ford, Harrison","Fisher, Carrie",
                  "Hamill, Mark","Jones, James Earl",
                  "Guinness, Alec", ...],
        "year": 1977
      }
     }, ...]
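
The mapping from a DynamoDB item to an SDF "add" operation can be sketched in a few lines. This is a minimal illustration, assuming the item is available as a plain Python dict with set-valued attributes as Python sets; `item_to_sdf_add` and its parameters are hypothetical names, not part of any SDK.

```python
import json


def item_to_sdf_add(item, version, id_attr="id", lang="en"):
    """Build one SDF "add" operation from a DynamoDB-style item (a plain dict)."""
    fields = {}
    for key, value in item.items():
        if key == id_attr:
            continue  # the primary key becomes the document id, not a field
        # Set attributes become JSON arrays (sorted so the output is stable).
        if isinstance(value, (set, frozenset)):
            fields[key] = sorted(value)
        else:
            fields[key] = value
    return {"type": "add", "id": item[id_attr], "version": version,
            "lang": lang, "fields": fields}


item = {"id": "tt0076759", "title": "Star Wars", "director": "Lucas, George",
        "genre": {"Action", "Adventure", "Fantasy", "Sci-Fi"}, "year": 1977}

# An SDF batch is a JSON array of such operations.
batch = json.dumps([item_to_sdf_add(item, version=1)])
```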

Uploading

•  CloudSearch's batch update strategy
•  Table scan: one row is one document
•  Batch multiple rows together and send
•  Alternate strategy using Hive/S3/command line tools (CLTs)
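
The "batch multiple rows together" step can be sketched independently of any SDK. The generator below groups SDF operations into JSON batches that stay under a byte budget; `sdf_batches` and the 4 MB default are assumptions chosen for illustration, not values taken from the CloudSearch documentation.

```python
import json


def sdf_batches(docs, max_bytes=4 * 1024 * 1024):
    """Yield JSON-encoded SDF batches, each at most max_bytes long.

    A single document larger than max_bytes is still emitted, alone.
    """
    batch, size = [], 2  # 2 bytes for the enclosing "[" and "]"
    for doc in docs:
        encoded = json.dumps(doc)
        doc_bytes = len(encoded.encode("utf-8"))
        # One extra byte for the comma separating this document from the last.
        if batch and size + 1 + doc_bytes > max_bytes:
            yield "[" + ",".join(batch) + "]"
            batch, size = [], 2
        size += doc_bytes + (1 if batch else 0)
        batch.append(encoded)
    if batch:
        yield "[" + ",".join(batch) + "]"  # flush the final partial batch


docs = [{"type": "add", "id": "d%d" % i, "version": 1, "lang": "en",
         "fields": {"title": "x" * 50}} for i in range(10)]
batches = list(sdf_batches(docs, max_bytes=300))
```

Flushing the remainder after the loop matters: without it, the last rows of the table scan would never be sent.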

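Sending to CloudSearch: Main Loop

    import time

    # Flush threshold; leaves headroom under CloudSearch's 5 MB batch limit.
    FOUR_MEGABYTES = 4 * 1024 * 1024

    def _cur_timestamp():
        # Document version: seconds since the epoch, so later updates win.
        return int(time.time())

    def _dynamo_to_cloudsearch(table, domain):
        doc_service = domain.get_document_service()
        pending = 0
        for table_item in table.scan():
            docid = table_item['docid']
            version = _cur_timestamp()
            doc_service.add(docid, version, _get_fields(table_item))
            pending += 1
            if len(doc_service.get_sdf()) >= FOUR_MEGABYTES:
                resp = doc_service.commit()
                doc_service.clear_sdf()  # start the next batch empty
                pending = 0
        if pending:
            resp = doc_service.commit()  # send the final partial batch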

Sending To CloudSearch: Fields

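    def _get_fields(table_item):
        '''Return a dict with key:value for all
        of the fields in the table'''
        attrs = {}
        for key, value in table_item.items():
            # Set-valued attributes become lists; scalars pass through as-is
            # (iterating a string would split it into characters).
            if isinstance(value, (set, frozenset, list, tuple)):
                attrs[key] = [item for item in value]
            else:
                attrs[key] = value
        return attrs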

Limits

•  ID sizing
   -  DynamoDB: primary key 2K, range key 1K
   -  CloudSearch: document id 128 bytes
•  Items
   -  DynamoDB: unlimited attributes, up to 64K of data
   -  CloudSearch: unlimited data, up to 200 fields and 100 field values
   -  Attribute/field naming conventions
•  Numerics
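
These limits can be checked before a document is sent. The sketch below uses only the numbers on this slide; `check_document` is a hypothetical helper, and reading "100 field values" as a per-field cap is an assumption.

```python
MAX_DOC_ID_BYTES = 128       # CloudSearch document id limit (from the slide)
MAX_FIELDS = 200             # fields per document (from the slide)
MAX_VALUES_PER_FIELD = 100   # assumed to be a per-field cap


def check_document(doc_id, fields):
    """Return a list of problems; the list is empty if the document fits."""
    problems = []
    if len(doc_id.encode("utf-8")) > MAX_DOC_ID_BYTES:
        problems.append("document id longer than %d bytes" % MAX_DOC_ID_BYTES)
    if len(fields) > MAX_FIELDS:
        problems.append("more than %d fields" % MAX_FIELDS)
    for name, value in fields.items():
        is_multi = isinstance(value, (list, tuple, set, frozenset))
        if is_multi and len(value) > MAX_VALUES_PER_FIELD:
            problems.append("field %r has more than %d values"
                            % (name, MAX_VALUES_PER_FIELD))
    return problems


ok = check_document("tt0076759", {"title": "Star Wars", "genre": ["Sci-Fi"]})
too_long = check_document("x" * 129, {})
```

Since a DynamoDB key can be far longer than 128 bytes, a check like this flags items whose key cannot be used directly as a document id.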

Keeping CloudSearch Up-To-Date
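
This slide's details did not survive extraction, so the following is only one plausible approach, not necessarily the one presented: mirror each DynamoDB put or delete as an SDF operation whose version is a timestamp, as in the main loop above. `mirror_write` and its parameters are hypothetical names.

```python
import time


def mirror_write(item, deleted=False, id_attr="id", version=None):
    """Build the SDF operation that mirrors one DynamoDB put or delete."""
    if version is None:
        version = int(time.time())  # later writes get higher versions
    if deleted:
        return {"type": "delete", "id": item[id_attr], "version": version}
    fields = {}
    for key, value in item.items():
        if key == id_attr:
            continue  # the key is the document id, not a field
        # Set attributes become sorted lists so they serialize as JSON arrays.
        fields[key] = sorted(value) if isinstance(value, (set, frozenset)) else value
    return {"type": "add", "id": item[id_attr], "version": version,
            "lang": "en", "fields": fields}


put_op = mirror_write({"id": "tt0076759", "title": "Star Wars"}, version=10)
del_op = mirror_write({"id": "tt0076759"}, deleted=True, version=11)
```

Because an update only takes effect when its version is higher than the stored one, the delete above (version 11) supersedes the earlier add (version 10).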

Conclusion

•  Combine Amazon CloudSearch and Amazon DynamoDB
   -  DynamoDB: throughput, durability
   -  CloudSearch: rich, powerful search
•  Just a few lines of code

Resources

•  Amazon CloudSearch Overview Page: http://aws.amazon.com/cloudsearch/
   -  FAQs
   -  Community Forum
   -  Documentation & Getting Started Tutorial (IMDb)
•  Demos and Tutorials
   -  What Is Amazon CloudSearch
   -  Introducing Amazon CloudSearch (Features)
   -  Building a Search Application Using Amazon CloudSearch

Q&A

We're Hiring!

Thank You

Jon Handler / handler@amazon.com
