Design for Scale - Building Real Time, High Performing Marketing Technology presented with Bizo

DynamoDB, presented by David Pearson from AWS.

Bizo Business Audience Marketing success story on AWS, by Alex Boisvert, Director of Engineering, Bizo.

In today's world, consumer habits change fast and marketing decisions need to be made within seconds, not days. Delivering engaging advertising experiences requires real-time, high-performing architectures that give digital advertisers the ability to measure and improve the performance of their campaigns and tie them more closely to corporate goals. The insights gleaned from the massive amounts of data collected can then be used to dynamically adjust media spend and creative execution for optimal performance. The AWS Cloud enables you to deliver marketing content and advertisements with the levels of availability, performance, and personalization that your customers expect. Plus, AWS lowers your costs. Join us to learn how big data and low-latency, high-performing architectures are changing the game for digital advertising.

October 3, 2013

Design for Scale - Building Real Time, High Performing Marketing Technology

David Pearson - Business Development

Amazon RDS

Amazon DynamoDB Amazon Redshift

Amazon ElastiCache

Compute Storage

AWS Global Infrastructure

Database

Application Services

Deployment & Administration

Networking

AWS Database Services

Scalable High Performance Application Storage in the Cloud

[Diagram: EFFORT to provision, manage, and scale. Differentiated?]

AWS Big Data Services

• Amazon S3: Object Storage
• Elastic MapReduce: Batch Processing
• DynamoDB: Real-Time Transactions
• Redshift: Online Analysis and Reporting

Amazon DynamoDB

NoSQL Database

Predictable performance

Seamless & massive scalability

Fully managed; zero admin

Amazon DynamoDB

Amazon’s Path to DynamoDB

RDBMS DynamoDB

Amazon DynamoDB

DEVS

OPS

USERS

Fast Application Development

Time to Build New Applications

• Flexible data models
• Simple API
• High-scale queries
• Laptop development

Amazon DynamoDB

DEVS

OPS

USERS

Latest News… DynamoDB Local

• Disconnected development

• Full API support

• Download from http://aws.amazon.com/dynamodb/resources/#testing

Amazon DynamoDB

DEVS

OPS

USERS

Admin-Free (at any scale)

request-based capacity provisioning model

Provisioned Throughput

Throughput is declared and updated via the API or the console

CreateTable (foo, reads/sec = 100, writes/sec = 150)

UpdateTable (foo, reads/sec=10000, writes/sec=4500)

DynamoDB handles the rest

Capacity is reserved and available when needed

Scaling-up triggers repartitioning and reallocation

No impact to performance or availability
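The two pseudo-calls above can be written out as request bodies. This is a minimal sketch, assuming the JSON parameter names of the DynamoDB API (`TableName`, `ProvisionedThroughput`, `ReadCapacityUnits`, `WriteCapacityUnits`). The helper name `throughput_request` is hypothetical, and nothing is sent to AWS.

```python
def throughput_request(table_name, reads_per_sec, writes_per_sec):
    """Build the throughput part of a CreateTable or UpdateTable request body.

    Hypothetical helper for illustration; it only builds a dict.
    """
    if reads_per_sec < 1 or writes_per_sec < 1:
        raise ValueError("capacity units must be positive")
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": int(reads_per_sec),
            "WriteCapacityUnits": int(writes_per_sec),
        },
    }


# CreateTable (foo, reads/sec = 100, writes/sec = 150)
create_foo = throughput_request("foo", 100, 150)

# UpdateTable (foo, reads/sec = 10000, writes/sec = 4500)
update_foo = throughput_request("foo", 10000, 4500)
```

A real CreateTable request also needs a key schema and attribute definitions, which are left out here to keep the focus on throughput.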

Amazon DynamoDB

DEVS

OPS

USERS

Durable Low Latency

WRITES
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)

READS
• Strongly or eventually consistent
• No latency trade-off

Average < 3 ms, TP90 < 4.5 ms (server-side latency across all APIs)
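The choice between strongly and eventually consistent reads is made per request. A minimal sketch, assuming the `ConsistentRead` flag of the DynamoDB GetItem API. The table name, key attribute, and helper name are illustrative assumptions, and the function only builds the request body.

```python
def get_item_request(table_name, userid, strongly_consistent=False):
    """Build a GetItem request body (hypothetical helper, no network call)."""
    return {
        "TableName": table_name,
        # Key attributes are typed: "S" marks a string value.
        "Key": {"userid": {"S": userid}},
        # False asks for an eventually consistent read, True for a strongly consistent one.
        "ConsistentRead": strongly_consistent,
    }


eventual = get_item_request("User_Profile", "user-123")
strong = get_item_request("User_Profile", "user-123", strongly_consistent=True)
```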

AD SERVING

[Diagram: a visitor's page sends an ad request to Ad Servers on EC2, which query the Profiles Database in DynamoDB and return an ad url.]

1. Visitor loads a web page

2. Web page issues a request to ad servers on EC2

3. Query to DynamoDB returns the ad to display

4. Link is returned to visitor
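Steps 2 to 4 can be sketched as a single lookup. This is a toy illustration, assuming an in-memory dict stands in for the DynamoDB profiles table. The URLs, the `handle_ad_request` name, and the default-ad fallback are all hypothetical.

```python
# In-memory stand-in for the profiles table (assumption for illustration).
PROFILES = {
    "visitor-1": {"ad_url": "https://cdn.example.com/ads/banner-a.png"},
}
DEFAULT_AD_URL = "https://cdn.example.com/ads/default.png"


def handle_ad_request(visitor_id, profiles=PROFILES):
    """Look up the visitor's profile and return the link to the ad to display."""
    profile = profiles.get(visitor_id)
    if profile is None:
        # Unknown visitor: fall back to a default ad (an assumed policy).
        return DEFAULT_AD_URL
    return profile["ad_url"]


known = handle_ad_request("visitor-1")
unknown = handle_ad_request("visitor-2")
```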

Real Time Bidding

[Diagram: the RTB platform sends a bid request to the Bidder (Ad Servers on EC2), which reads Ads and Profiles from DynamoDB and returns a bid response. Other labels: Queues and Buffer.]

Time budget shown on the slide: 20 ms, 20 ms, 20 ms, 40 ms. Segments:

• Request network transit
• Response network transit
• Decision on best ad and bid price, based on optimization that needs multiple data look-ups
• Contingency time buffer
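The four durations on the Real Time Bidding slide (20 ms, 20 ms, 20 ms, 40 ms) add up to 100 ms. A minimal sketch of a deadline check, assuming that total is the end-to-end budget for a bid. The function names are hypothetical, and timestamps are passed in as seconds so the logic stays deterministic.

```python
BUDGET_SEGMENTS_MS = [20, 20, 20, 40]  # durations shown on the slide
TOTAL_BUDGET_MS = sum(BUDGET_SEGMENTS_MS)  # 100


def remaining_ms(started_at, now, total_ms=TOTAL_BUDGET_MS):
    """Milliseconds left in the budget; started_at and now are in seconds."""
    elapsed_ms = (now - started_at) * 1000.0
    return max(0.0, total_ms - elapsed_ms)


def should_bid(started_at, now, min_needed_ms):
    """Bid only if enough time is left to send the response."""
    return remaining_ms(started_at, now) >= min_needed_ms
```

A bidder could call `should_bid` after its data look-ups and skip the auction when too little time is left for the response to arrive.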

Optimize for scale, elasticity, and availability

• Multi-AZ: maintain EC2 capacity in multiple availability zones

• Auto Scaling: scale EC2 capacity to automatically manage variations in workload

• Elastic Load Balancing: automatically distribute incoming traffic across multiple EC2 instances

[Diagram: visitor -> Elastic Load Balancing -> Ad Servers on EC2 (MAZ) -> Profiles Database in DynamoDB; ad request in, ad url out.]

1. Ad files are downloaded from CloudFront

2. Impressions captured into logs on S3

[Diagram: CloudFront delivers the advertisement from Static Repository Files on Amazon S3; impression logs are stored on Amazon S3.]

Click-through requests are captured via EC2 into log files and persisted on S3.

[Diagram: click through requests -> Elastic Load Balancing -> Click-through Servers on EC2 (MAZ) -> click through log files]

Analysis

[Diagram: unstructured log files -> ETL on Amazon EMR -> Redshift. Other labels: new bids, updated profiles, new requests.]

Amazon Redshift

Business Analytics using Redshift

Bid Optimization: drive qualified users to advertisers’ sites
• Ad server logs
• 3rd party data
• Bid history
• User history

Cost Optimization: optimize return on advertising expenditure
• Impressions
• 3rd party data
• User history
• Enrichment

Optimizing the Data Tier

[Diagram: cookies flow into DynamoDB; labels: writes, reads]

PutItem: insert new cookies into table

CreateTable
{ …
  "ProvisionedThroughput": {
    "ReadCapacityUnits": 100,
    "WriteCapacityUnits": 10000
  },
  "TableName": "User_Cookies_0"
}

User_Cookies_0: hash=userid, range=timestamp, <cookie payload>
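A cookie write against this schema could look like the following. This is a minimal sketch, assuming the typed attribute format of the DynamoDB PutItem API (`S` for strings, `N` for numbers sent as strings). The attribute name `payload` and the helper name are hypothetical, and the function only builds the request body.

```python
def put_cookie_request(table_name, userid, timestamp, payload):
    """Build a PutItem request body for one cookie (hypothetical helper)."""
    return {
        "TableName": table_name,
        "Item": {
            "userid": {"S": userid},             # hash key
            "timestamp": {"N": str(timestamp)},  # range key; numbers travel as strings
            "payload": {"S": payload},           # assumed name for the cookie payload
        },
    }


request = put_cookie_request("User_Cookies_0", "user-123", 1380787200, "segment=b2b")
```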

GetItem: look up profile table, return action (url)

[Diagram labels: DynamoDB, cookies, url]

User_Profile: hash=userid, <profile data>
User_Cookies_0: hash=userid, range=timestamp, <cookie payload>

Time Series Data

CreateTable: new cookie ingest table
PutItem: insert new cookies into new table

[Diagram labels: DynamoDB, cookies, url]

User_Cookies_0: hash=userid, range=timestamp, <cookie payload>
User_Cookies_1: hash=userid, range=timestamp, <cookie payload>
User_Profile: hash=userid, <profile data>
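The time-series pattern sends new cookies to a fresh table for each period (`User_Cookies_0`, then `User_Cookies_1`, and so on). A minimal sketch of choosing the ingest table for a given write time. The weekly period, the start date, and the function name are assumptions, since the slides do not say how often tables rotate.

```python
from datetime import datetime, timedelta, timezone


def cookie_table_for(when, first_period_start, period=timedelta(days=7)):
    """Return the name of the ingest table that covers the time `when`."""
    if when < first_period_start:
        raise ValueError("time is before the first ingest period")
    index = (when - first_period_start) // period  # whole periods elapsed
    return f"User_Cookies_{index}"


start = datetime(2013, 10, 1, tzinfo=timezone.utc)
first = cookie_table_for(datetime(2013, 10, 3, tzinfo=timezone.utc), start)
second = cookie_table_for(datetime(2013, 10, 8, tzinfo=timezone.utc), start)
```

Writers call this on every put, so they switch to the new table as soon as a period boundary passes, provided the table was created ahead of time.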

UpdateTable: prepare data for direct load into Redshift

[Diagram labels: DynamoDB, Redshift, cookies, writes, reads, url]

User_Cookies_0: hash=userid, range=timestamp, <cookie payload>
User_Cookies_1: hash=userid, range=timestamp, <cookie payload>
User_Profile: hash=userid, <profile data>

[Diagram labels: DynamoDB, Redshift, cookies, url]

DynamoDB tables:
User_Cookies_0: hash=userid, range=timestamp, <cookie payload>
User_Profile: hash=userid, <profile data>

Redshift tables:
COPY cookie_staging: userid, timestamp, …
insert new entries
user_history: userid, timestamp, …
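The load step can be sketched as two SQL statements: a Redshift `COPY` that reads straight from a DynamoDB table, then an insert of staged rows not yet in `user_history`. This is a hedged sketch. The choice of `User_Cookies_0` as the source, the credentials placeholder, the default `READRATIO`, and the use of (userid, timestamp) to decide what counts as a new entry are all assumptions. The helpers only build strings from trusted identifiers.

```python
def copy_from_dynamodb_sql(staging_table, dynamodb_table, credentials, read_ratio=50):
    """Build a Redshift COPY statement that loads from a DynamoDB table.

    read_ratio is the share of the table's provisioned read capacity to use.
    """
    return (
        f"COPY {staging_table} "
        f"FROM 'dynamodb://{dynamodb_table}' "
        f"CREDENTIALS '{credentials}' "
        f"READRATIO {int(read_ratio)};"
    )


def insert_new_entries_sql(staging_table="cookie_staging", history_table="user_history"):
    """Build an INSERT that copies staged rows not already present in history."""
    return (
        f'INSERT INTO {history_table} '
        f'SELECT s.* FROM {staging_table} s '
        f'LEFT JOIN {history_table} h '
        f'ON s.userid = h.userid AND s."timestamp" = h."timestamp" '
        f'WHERE h.userid IS NULL;'
    )


copy_sql = copy_from_dynamodb_sql("cookie_staging", "User_Cookies_0", "<aws-auth-args>")
insert_sql = insert_new_entries_sql()
```

The column name `timestamp` is double-quoted in the join in case the database treats it as a reserved word.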

Redshift tables:
user_history: userid, timestamp, …
cookie_staging: userid, timestamp, …

query: build profile
updated profiles
Conditional PutItem (insert new / update existing items)
DeleteTable

[Diagram labels: DynamoDB, cookies, url]

DynamoDB tables:
User_Cookies_1: hash=userid, range=timestamp, <cookie payload>
User_Profile: hash=userid, <profile data>
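The Conditional PutItem step can be sketched as a request that inserts a profile when none exists and overwrites it only when the stored copy is older. This is an assumption-heavy sketch. The attribute names `profile_data` and `updated_at` are hypothetical, the condition is one possible policy, and it uses the `ConditionExpression` form of the DynamoDB API, which is newer than the API in use when this talk was given.

```python
def conditional_profile_put(userid, profile_data, updated_at, table_name="User_Profile"):
    """Build a conditional PutItem request body (hypothetical helper, no network call)."""
    return {
        "TableName": table_name,
        "Item": {
            "userid": {"S": userid},
            "profile_data": {"S": profile_data},
            "updated_at": {"N": str(updated_at)},
        },
        # Write if the item is new, or if the stored profile is older than this one.
        "ConditionExpression": "attribute_not_exists(userid) OR updated_at < :new_ts",
        "ExpressionAttributeValues": {":new_ts": {"N": str(updated_at)}},
    }


put_request = conditional_profile_put("user-123", "segment=b2b", 1380787200)
```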

Four Drivers of DynamoDB Adoption

• Scalable
• Easy To Use
• (Durably) Fast
• Inexpensive

Resources

• Best Practices: http://aws.amazon.com/dynamodb/resources/

Questions

• David Pearson, pearsond@amazon.com

© 2013 Bizo, Inc

The Future of Digital Advertising with Cloud Computing

October 3rd, 2013

Marketing Automation

Nurture Anonymous Site Visitors
• 90% of site traffic doesn’t convert
• Marketing automation rules to optimize display ads

Extend Email Nurturing to Display Advertising
• 70% of emails are not opened
• Automatically coordinate display and email messaging
• Feedback Loop: metrics on performance, lift, ROI

Integrations
• Eloqua (now a subsidiary of Oracle) + Other Platforms

Targeting API

Dynamic Targets
• Business Professionals (Cookies + User Ids)
• Companies (by name or domain)
• IPs (single or range)

Attribution / Analytics
• Impressions, Clicks, Conversions, New Visitors, etc.

Key Metrics (beta)
• 300M cookies and user-ids (e.g., sha1 email hashes)
• 30M company records -> 200M employee cookies
• 100M Company -> IP range mappings
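A user-id of the "sha1 email hash" kind can be derived as below. A minimal sketch: the trim-and-lowercase normalization is an assumption, since the slides do not say how addresses are normalized before hashing.

```python
import hashlib


def email_hash(email):
    """Return the hex SHA-1 digest of a normalized email address."""
    normalized = email.strip().lower()  # assumed normalization
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


user_id = email_hash(" Jane.Doe@Example.com ")
```

Normalizing first means the same address maps to the same id regardless of case or stray whitespace.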

Environment

Project / Team
• 8 Months, 1 - 3 Engineers, ≈ 18 Man-Months
• Automated Deployment & Monitoring (“DevOps”)

Language
• Java / Scala (JVM)

Infrastructure
• EC2, ELB, Auto Scaling
• MySQL (RDS), DynamoDB, S3
• SQS, EMR, CloudWatch, etc.

Architecture

SQL vs NoSQL: Different Starting Points

SQL vs NoSQL: How Much Do You Need?

???

SQL vs NoSQL: False Dilemma?

NewSQL?

Database “Sweet Spots”

                 MySQL/RDS (SQL)                   DynamoDB (NoSQL)
Access Pattern   Small/Medium Batches              Per-Record; Random
Scaling          Moderate Growth                   Near Real-Time
Sharding         Bring-Your-Own                    Built-In; Transparent
Operations       Self-managed; Some Labor Costs    Fully Managed

Challenges

Common: Data Modeling/Indexing for Performance

MySQL/RDS
• Sharding
• Rebalancing
• Online Schema Migration

DynamoDB (NoSQL)
• Understanding Performance/Provisioned Capacity Model (Throttling / Non-Uniform Access Pattern)
• Expiring Data
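One common way to cope with throttling is to retry with exponential backoff and jitter. This is a generic sketch rather than Bizo's implementation: `ThrottledError` is a hypothetical stand-in for whatever error a client raises when provisioned throughput is exceeded, and the delay values are illustrative.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throughput-exceeded error."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      sleep=time.sleep, rng=random.random):
    """Call operation(), retrying on ThrottledError with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait a random share of an exponentially growing delay.
            sleep(rng() * base_delay * (2 ** attempt))
```

Backoff smooths out short bursts. It does not fix a hot key, where a non-uniform access pattern concentrates traffic on one partition and needs a data-model change.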
