© 2012 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc. Amazon CloudSearch Meetup San Francisco March 28, 2013

DynamoDB and Amazon CloudSearch

DESCRIPTION

This talk covers how Amazon CloudSearch and Amazon DynamoDB can be used together: DynamoDB provides throughput and durability, and CloudSearch provides rich, powerful search.

Page 1: DynamoDB and Amazon Cloudsearch

Amazon CloudSearch Meetup San Francisco

March 28, 2013

Page 2: DynamoDB and Amazon Cloudsearch

Welcome

•  Housekeeping
•  Slides will be posted
•  Upcoming Events
    •  April 3, 2013. Webinar 10:00am – 11:00am. "Building Location-Based Search" https://www2.gotomeeting.com/register/264455602
    •  April 30, 2013. AWS Summit, San Francisco. http://aws.amazon.com/aws-summit-2013/san-francisco/

Page 3: DynamoDB and Amazon Cloudsearch

Agenda

6:30-7:30 Networking (over Pizza/Beer/Drinks)

7:00-7:30 Using CloudSearch with DynamoDB – Jon Handler, Amazon CloudSearch Solution Architect

7:30-8:00 EDU 2.0 User Talk - Graham Glass, Founder

8:00-8:30 Getting More from Your Search Queries: How to Tune Search Requests – Tom Hill, Amazon CloudSearch Solution Architect

8:30-8:45 SnapGuide User Talk - Sam Kimbrel, Senior Software Engineer

Page 4: DynamoDB and Amazon Cloudsearch

Using CloudSearch with DynamoDB

Jon Handler, Amazon CloudSearch Solution Architect

Page 5: DynamoDB and Amazon Cloudsearch

Do You Want Search With That?

•  Amazon DynamoDB
    •  High throughput, provisioned latency
    •  NoSQL model

•  Amazon CloudSearch
    •  Rich query API – free text, fielded, faceting and more
    •  User-customizable ranking

Page 6: DynamoDB and Amazon Cloudsearch

Amazon CloudSearch Overview

[Architecture diagram: a search domain behind DNS / load balancing, made up of a search service, a document service, and a config service. Labels on the diagram: Search API, Doc Svc API, Config API, AWS Query, Console, Command Line Tools.]

Page 7: DynamoDB and Amazon Cloudsearch

Automatic Scaling

[Diagram: a grid of search instances, each holding one copy of one index partition (Index Partition 1 through n, Copy 1 through n). One axis is labeled DATA (document quantity and size), the other TRAFFIC (search request volume and complexity).]

Page 8: DynamoDB and Amazon Cloudsearch

The Big Picture

Page 9: DynamoDB and Amazon Cloudsearch

Working With DynamoDB and CloudSearch

•  Create an Amazon CloudSearch domain
•  Upload data from DynamoDB to CloudSearch
•  Configure: item attributes map to index fields
•  Keep CloudSearch and DynamoDB in sync

Page 10: DynamoDB and Amazon Cloudsearch

Create An Amazon CloudSearch Domain

Page 11: DynamoDB and Amazon Cloudsearch

Page 12: DynamoDB and Amazon Cloudsearch

Text fields for matching user terms

Result enabled to retrieve source data

Page 13: DynamoDB and Amazon Cloudsearch

Literal fields for Faceting

Facet enabled to retrieve facet counts

Search enabled for narrowing

Page 14: DynamoDB and Amazon Cloudsearch

Integer fields for ranking, narrowing
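The three field types on the preceding slides (text, literal, integer) can be illustrated with a small sketch. The mapping rules below are assumptions chosen for illustration, and `suggest_field_type` is a hypothetical helper, not part of any CloudSearch API; the actual choice of type per field is a design decision made when configuring the domain.

```python
# Hypothetical helper: suggest a CloudSearch index field type for a
# DynamoDB attribute value. The rules are illustrative assumptions.
def suggest_field_type(value):
    if isinstance(value, int):
        return "uint"      # integer fields: ranking and range narrowing
    if isinstance(value, (set, frozenset, list)):
        return "literal"   # multi-valued exact-match fields: faceting
    if isinstance(value, str):
        return "text"      # free-text fields: matching user terms
    raise TypeError("unsupported attribute type: %r" % type(value))

# A cut-down version of the movie item used in this talk
item = {"title": "Star Wars", "genre": {"Action", "Sci-Fi"}, "year": 1977}
field_types = {name: suggest_field_type(v) for name, v in item.items()}
print(field_types)
```

A rule of thumb like this is only a starting point: a single-valued string such as a director name could reasonably be a text field, a literal field, or both, depending on how it will be searched and faceted.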

Page 15: DynamoDB and Amazon Cloudsearch

Configure the Domain

Page 16: DynamoDB and Amazon Cloudsearch

Amazon DynamoDB Data

Primary Key          Attributes

id = "tt0076759"     title = "Star Wars",
                     director = "Lucas, George",
                     genre = {"Action","Adventure","Fantasy","Sci-Fi"},
                     actor = {"Ford, Harrison","Fisher, Carrie","Hamill, Mark",
                              "Jones, James Earl","Guinness, Alec","Johnston, Joe",
                              "Mayhew, Peter","Cushing, Peter","Prowse, David",
                              "Daniels, Anthony"},
                     year = 1977

id = "tt1411664"     title = "Born to Be a Star",
                     director = "Brady, Tom",
                     genre = {"Comedy"},
                     actor = {"Ricci, Christina","Swardson, Nick","Dorff, Stephen",
                              "Johnson, Don","Bain, Robin","Herrmann, Edward",
                              "Goodman, Dana","Giangrande, Meredith","Dawn, Nadia",
                              "Locke, Tembi","Herschman, Adam"},
                     year = 2011

Page 17: DynamoDB and Amazon Cloudsearch

CloudSearch SDF

    [{"type": "add",
      "id": "tt0076759",
      "version": 1,
      "lang": "en",
      "fields": {
        "title": "Star Wars",
        "director": "Lucas, George",
        "genre": ["Action","Adventure","Fantasy","Sci-Fi"],
        "actor": ["Ford, Harrison","Fisher, Carrie",
                  "Hamill, Mark","Jones, James Earl",
                  "Guinness, Alec", ...],
        "year": 1977
      }
     }, ...]
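The step from a DynamoDB item to an SDF "add" operation can be sketched with the standard library alone. This is a minimal illustration, assuming the item is available as a plain dict and that the key attribute is named `id` as on the data slide; `to_sdf_add` is a hypothetical name, not a library function.

```python
import json

def to_sdf_add(item, version, id_attr="id", lang="en"):
    """Build one SDF "add" operation from a dict-like table item."""
    fields = {}
    for name, value in item.items():
        if name == id_attr:
            continue  # the key becomes the document id, not a field
        # Sets become sorted lists so the output is deterministic
        if isinstance(value, (set, frozenset)):
            fields[name] = sorted(value)
        else:
            fields[name] = value
    return {"type": "add", "id": item[id_attr], "version": version,
            "lang": lang, "fields": fields}

item = {"id": "tt0076759", "title": "Star Wars",
        "genre": {"Sci-Fi", "Action"}, "year": 1977}
batch = [to_sdf_add(item, version=1)]
sdf = json.dumps(batch)  # a JSON array of operations, as on the slide
print(sdf)
```

Set-valued attributes map naturally to multi-valued fields (JSON arrays), while strings and numbers pass through unchanged.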

Page 18: DynamoDB and Amazon Cloudsearch

Uploading

•  CloudSearch's batch update strategy
•  Table scan: one row is one document
•  Batch multiple rows together and send

•  Alternate strategy using Hive, S3, and the command line tools (CLTs)
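The "batch multiple rows together and send" step can be sketched independently of any client library. The generator below groups documents so that each batch's JSON encoding stays within a byte budget; `batch_by_size` and the 200-byte budget in the usage lines are hypothetical and chosen only to keep the example small.

```python
import json

def batch_by_size(docs, max_bytes):
    """Yield lists of documents whose JSON encoding fits in max_bytes.

    A single document larger than max_bytes is still yielded, alone.
    """
    batch, size = [], 2  # 2 bytes for the enclosing "[" and "]"
    for doc in docs:
        # Encoded size of the document plus a ", " separator
        doc_size = len(json.dumps(doc).encode("utf-8")) + 2
        if batch and size + doc_size > max_bytes:
            yield batch
            batch, size = [], 2
        batch.append(doc)
        size += doc_size
    if batch:
        yield batch  # the final, partially filled batch

docs = [{"type": "add", "id": "doc%d" % i, "version": 1, "lang": "en",
         "fields": {"title": "Movie %d" % i}} for i in range(10)]
batches = list(batch_by_size(docs, 200))
print([len(b) for b in batches])
```

Tracking the size incrementally avoids re-serializing the whole pending batch after every added document.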

Page 19: DynamoDB and Amazon Cloudsearch

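Sending to CloudSearch: Main Loop

    # _cur_timestamp() and FOUR_MEGABYTES are defined elsewhere
    def _dynamo_to_cloudsearch(table, domain):
        doc_service = domain.get_document_service()
        pending = 0
        for table_item in table.scan():
            docid = table_item['docid']
            version = _cur_timestamp()
            doc_service.add(docid, version, _get_fields(table_item))
            pending += 1
            # Send the batch once the pending SDF reaches the size threshold
            if len(doc_service.get_sdf()) >= FOUR_MEGABYTES:
                doc_service.commit()
                doc_service.clear_sdf()  # start the next batch empty
                pending = 0
        # Send whatever is left after the scan finishes
        if pending:
            doc_service.commit()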

Page 20: DynamoDB and Amazon Cloudsearch

Sending To CloudSearch: Fields

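    def _get_fields(table_item):
        '''Return a dict with key:value for all
        of the fields in the table'''
        attrs = {}
        for key, value in table_item.items():
            # Set attributes become lists (multi-valued fields);
            # strings and numbers pass through unchanged
            if isinstance(value, (set, frozenset)):
                attrs[key] = list(value)
            else:
                attrs[key] = value
        return attrs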

Page 21: DynamoDB and Amazon Cloudsearch

Limits

•  ID sizing
    •  Dynamo: primary key 2K, range key 1K
    •  CloudSearch: document id 128 bytes

•  Items
    •  Dynamo: unlimited attributes, up to 64K of data
    •  CloudSearch: unlimited data, up to 200 fields and 100 field values
    •  Attribute/field naming conventions

•  Numerics
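Because the two services have different limits, it can help to check each item before adding it to a batch. The sketch below uses only the CloudSearch numbers from this slide and reads them as 200 fields per document and 100 values per field, which is an interpretation of the slide; `check_limits` and the constant names are hypothetical.

```python
MAX_DOC_ID_BYTES = 128   # CloudSearch document id limit from the slide
MAX_FIELDS = 200         # assumed to mean fields per document
MAX_FIELD_VALUES = 100   # assumed to mean values per field

def check_limits(doc_id, fields):
    """Return a list of problems; an empty list means the item fits."""
    problems = []
    if len(doc_id.encode("utf-8")) > MAX_DOC_ID_BYTES:
        problems.append("document id exceeds %d bytes" % MAX_DOC_ID_BYTES)
    if len(fields) > MAX_FIELDS:
        problems.append("more than %d fields" % MAX_FIELDS)
    for name, value in fields.items():
        multi = isinstance(value, (list, set, frozenset))
        if multi and len(value) > MAX_FIELD_VALUES:
            problems.append("field %r has more than %d values"
                            % (name, MAX_FIELD_VALUES))
    return problems

ok = check_limits("tt0076759", {"title": "Star Wars", "year": 1977})
# A 2K DynamoDB key is legal there but too long for a document id
too_long = check_limits("k" * 2048,
                        {"actor": ["a%d" % i for i in range(101)]})
print(ok, too_long)
```

The ID mismatch is the one most likely to matter in practice: a key that is valid in DynamoDB may need to be shortened or hashed before it can serve as a document id.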

Page 22: DynamoDB and Amazon Cloudsearch

Keeping CloudSearch Up-To-Date
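The slide text does not spell out the mechanism, so the following is only one possible approach, sketched under stated assumptions: every write or delete applied to DynamoDB is mirrored as an SDF operation whose version is a timestamp, in the same spirit as `_cur_timestamp()` in the main loop. `sdf_for_change` is a hypothetical helper.

```python
import time

def sdf_for_change(doc_id, fields=None, version=None):
    """Return an SDF "add" for a put, or a "delete" when fields is None."""
    if version is None:
        version = int(time.time())  # newer changes get higher versions
    if fields is None:
        return {"type": "delete", "id": doc_id, "version": version}
    return {"type": "add", "id": doc_id, "version": version,
            "lang": "en", "fields": fields}

# Explicit versions keep the example deterministic
put_op = sdf_for_change("tt0076759", {"title": "Star Wars"}, version=100)
del_op = sdf_for_change("tt0076759", version=101)
print(put_op["type"], del_op["type"])
```

Using an increasing version means a delayed or replayed operation carries an older version than the change that superseded it, so the two can be told apart.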

Page 23: DynamoDB and Amazon Cloudsearch

Conclusion

•  Combine Amazon CloudSearch and Amazon DynamoDB
    •  DynamoDB – throughput, durability
    •  CloudSearch – rich, powerful search

•  Just a few lines of code

Page 24: DynamoDB and Amazon Cloudsearch

Resources

•  Amazon CloudSearch Overview Page: http://aws.amazon.com/cloudsearch/
    •  FAQs
    •  Community Forum
    •  Documentation & Getting Started Tutorial (IMDb)

•  Demos and Tutorials
    •  What Is Amazon CloudSearch
    •  Introducing Amazon CloudSearch (Features)
    •  Building a Search Application Using Amazon CloudSearch

Page 25: DynamoDB and Amazon Cloudsearch

Q&A

Page 26: DynamoDB and Amazon Cloudsearch

We're Hiring!

Page 27: DynamoDB and Amazon Cloudsearch

Thank You

Jon Handler / [email protected]