© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. David Stein, Business Development EBS November 30, 2016 Case Study: How Zendesk and Videology Modernized Their Big Data Platforms on Amazon EBS STG311

AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Page 1: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

David Stein, Business Development EBS

November 30, 2016

Case Study: How Zendesk and Videology Modernized Their Big Data Platforms on Amazon EBS

STG311

Page 2: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

What to Expect from the Session

• How to architect big data processing platforms that scale to meet growing demand while improving performance, availability, and cost with Amazon EBS

• Learn about the new ST1 and SC1 Throughput Optimized EBS volumes designed for big data workloads

• Overview of how Zendesk runs a large ELK (Elasticsearch, Logstash, Kibana) stack on Amazon EC2 and EBS for their cloud-based customer support platform

• Overview of how Videology runs a Hadoop architecture on EC2 and EBS to ingest, process, and analyze logs for their converged advertising solution

Page 3: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

AWS storage is a platform

File: Amazon EFS

Block: Amazon EBS, Amazon EC2 Instance Store

Object: Amazon S3, Amazon Glacier

Data Transfer: AWS Direct Connect, ISV Connectors, Amazon Kinesis Firehose, AWS Storage Gateway, Amazon S3 Transfer Acceleration, AWS Snowball, Amazon CloudFront, Internet/VPN

Page 4: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

EBS volume types

Hard disk drive (HDD)

Solid state drive (SSD)

Page 5: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

EBS volume types

SSD: General Purpose SSD (gp2), Provisioned IOPS SSD (io1)

HDD: Throughput Optimized HDD (st1), Cold HDD (sc1)

Page 6: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

EBS volume types: throughput

Throughput Optimized HDD (st1)

Baseline: 40 MB/s per TB up to 500 MB/s

Burst: 250 MB/s per TB up to 500 MB/s

Capacity: 500 GB to 16 TB

Ideal for large-block, high-throughput sequential workloads

Page 7: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

EBS volume types: throughput

Cold HDD (sc1)

Baseline: 12 MB/s per TB up to 192 MB/s

Burst: 80 MB/s per TB up to 250 MB/s

Capacity: 500 GB to 16 TB

Ideal for sequential throughput workloads such as logging and backup
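
The per-TB rates and caps on the st1 and sc1 slides can be turned into a quick sizing check. The following is a minimal sketch, not an AWS API: the function name `throughput_mb_s` and the `VOLUME_SPECS` table are hypothetical, and the numbers are only the ones quoted on the slides (the 2016 limits).

```python
# Rates and caps copied from the st1 and sc1 slides (MB/s per TB, MB/s).
# VOLUME_SPECS and throughput_mb_s are illustrative names, not an AWS API.
VOLUME_SPECS = {
    "st1": {"baseline_per_tb": 40, "baseline_cap": 500,
            "burst_per_tb": 250, "burst_cap": 500},
    "sc1": {"baseline_per_tb": 12, "baseline_cap": 192,
            "burst_per_tb": 80, "burst_cap": 250},
}

MIN_SIZE_TB = 0.5   # 500 GB, the smallest capacity on the slides
MAX_SIZE_TB = 16.0  # 16 TB, the largest capacity on the slides


def throughput_mb_s(volume_type, size_tb):
    """Return (baseline, burst) throughput in MB/s for one volume."""
    if not MIN_SIZE_TB <= size_tb <= MAX_SIZE_TB:
        raise ValueError("size must be between 0.5 TB and 16 TB")
    spec = VOLUME_SPECS[volume_type]
    # Throughput scales linearly with size until it reaches the cap.
    baseline = min(spec["baseline_per_tb"] * size_tb, spec["baseline_cap"])
    burst = min(spec["burst_per_tb"] * size_tb, spec["burst_cap"])
    return baseline, burst


if __name__ == "__main__":
    for vol in ("st1", "sc1"):
        for size in (1, 2, 16):
            print(vol, size, "TB ->", throughput_mb_s(vol, size))
```

With these figures, an st1 volume already reaches its 500 MB/s burst cap at 2 TB, while baseline throughput keeps growing with size until 12.5 TB.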

Page 8: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Kyle House, David Bernstein, Zendesk

November 30, 2016

Case Study: How Zendesk Modernized

Their Big Data Platforms on Amazon EBS

Inside Our New ELK Deployment

Page 9: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Zendesk builds software for better customer relationships. It empowers organizations to improve customer engagement and better understand their customers. More than 87,000 paid customer accounts in over 150 countries and territories use Zendesk products. Based in San Francisco.

Page 10: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

What to Expect from the Session

• Discuss storage redesign, utilizing new Amazon EBS volumes

• Talk through design choices

• Explain benefits of new storage model

• Cost benefits of “rightsizing” storage

Page 11: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

ELK at Zendesk

Distributed database

Log ingestion/parsing

Beautiful visualizations

Page 15: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

The Problem

- Operational headaches

- Encryption

- Data retention

- Cost too high

Page 18: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

The Investigation

- User access patterns

- Performance requirements

- New EBS volume types

Page 21: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

The Proposal

- Full usage of EBS with new volume types

- Create a tiered storage model

- Optimize instance types; decouple instances from storage

Page 22: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Tiered storage

Hot (0-7 days): General Purpose SSD (gp2)

Warm (8-30 days): Throughput Optimized HDD (st1)

Cold (31-60 days): Cold HDD (sc1)
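
The age boundaries on this slide map directly to a lookup. The sketch below is illustrative only: `TIERS` and `tier_for_age` are hypothetical names, and the slides do not say how data was actually moved between tiers or what happens after day 60, so returning `None` past the last tier is an assumption.

```python
# Age ranges (in days, inclusive) and volume types copied from the slide.
TIERS = [
    (0, 7, "hot", "gp2"),
    (8, 30, "warm", "st1"),
    (31, 60, "cold", "sc1"),
]


def tier_for_age(age_days):
    """Return (tier, volume_type) for data of the given age, or None.

    None means the data is older than the last tier on the slide;
    treating that as "outside retention" is an assumption.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    for first_day, last_day, tier, volume_type in TIERS:
        if first_day <= age_days <= last_day:
            return tier, volume_type
    return None


if __name__ == "__main__":
    for age in (0, 7, 8, 30, 31, 60, 61):
        print(age, "->", tier_for_age(age))
```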

Page 32: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Topology

VPN gateway

Proxy

Bastion

Two Availability Zones, each with:

• 3 x m4.large esclient/esmaster: gp2 roots

• 8 x c4.large logindexers: gp2 roots

• 10 x r3.2xlarge esdata: gp2 roots + 11G (hot), st1 35G (warm), sc1 80G (cold)

Page 34: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Sparkleformation

Page 38: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

The Result

- Reduced operating costs by 50%

- Increased data retention 3x

- Predictable scaling model

  • Storage allocation detached from instance count

- Increased data transport reliability

- Reduced operational overhead

- Increased cluster stability

Page 39: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

49% Reduction 79% Reduction

Page 40: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Recommendations

- Identify data usage model before you build

- Find places where performance matters, and where cost can be optimized

- Reduce over-provisioned storage/IOPS

- Utilize AWS managed services whenever possible

Page 41: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Thank you!

Up next in this session:

Videology

Page 42: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Videology

Paul Frederiksen – Principal DevOps Engineer

David Ortiz – Senior Software Engineer

Videology Big Data Team

November 30, 2016

Page 43: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

On the Rocky Road to EBS

Videology’s Journey to EBS-backed Big Data

Page 44: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

What to Expect from the Session

• Intro to Videology

• Challenges

• Road to EBS-backed cluster

• Happy engineers

Page 45: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Videology overview

Founded: 2007 by Scott Ferber, co-founder of Advertising.com, which sold to AOL Time Warner in 2004 for $497 million

Corporate Headquarters: New York, NY

Operations:
• Operating in 28 global markets
• Key offices: New York, Baltimore, Toronto, London, Singapore & Sydney

Employees: Approximately 380

Investors: NEA, Comcast Ventures, Harbourvest, Catalyst Investors, Pinnacle Ventures, Valhalla Venture

Customers: 4,500 active users including brand marketers, agencies, trading desks, media companies, MVPDs

Ecosystem Integrations: Open platform with 2200+ ecosystem integrations, including 1000+ media companies, 40 data providers, all major 3rd party verification providers, and dozens of technology partners across the media ecosystem

Recent Client Wins: [logos not reproduced]

Videology provides a converged advertising solution that is screen-agnostic, ensuring unduplicated reach with the right frequency cadence to achieve guaranteed results.


Page 46: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Industry accolades…

“Videology was named Best Digital Video Ad Platform by Cynopsis Media at their 2015 Model D Awards.”

“Videology was able to show that their platform drove brand lift that was on average 6X higher than Nielsen's norms.”

“Videology has the most sophisticated media optimizer to analyze the right allocation of TV and online video to optimize reach and campaign cost.”

Page 47: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Hadoop overview

NameNode

ResourceManager

Gateway

DataNode

NodeManager

Page 48: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Where does big data processing fit in?

Page 49: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Original production

Instance Type   Qty  Role              vCPU  RAM (GB)  Storage (GB)
m3.xlarge         1  Jumpbox              4        15            80
m3.xlarge         1  Cloudera Manager     4        15            80
m3.2xlarge        2  NN/RM                8        30           160
cc2.8xlarge       1  Service Master      32        60         3,200
cc2.8xlarge      30  Worker              32        60         3,200

Page 50: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

I’ve got 99 problems and Hadoop is a few of them

Reliability

Scalability

Distcp

CPU to Memory Ratio

Page 51: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Timeline: 2015Q2, 2016Q3, 2016Q4 and beyond

• Engaged Cloudera for EBS support

• Gave up on EBS and tested D2s

• New EBS to the rescue!

• Take advantage of new hardware

Page 52: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

CC2.8XL: Old. Not enough disk. Expensive.

M4.10XL: Nirvana

D2.8XL: Lots of disk! Not enough memory. Expensive.

Page 53: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

D2.8xl prototype

Instance Type   Qty  Role              vCPU  RAM (GB)  Storage (GB)
r3.large          1  Jumpbox              2     15.25            32
r3.large          1  Cloudera Manager     2     15.25            32
r3.xlarge         2  NN/RM                8        30           160
r3.2xlarge        2  Service Master       8        61           160
d2.8xl           10  Worker              36       244        48,000

Page 54: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

M4.10xlarge w/ sc1 prototype

Instance Type   Qty  Role              vCPU  RAM (GB)  Storage (GB)
r3.large          1  Jumpbox              2     15.25            32
r3.large          1  Cloudera Manager     2     15.25            32
r3.xlarge         2  NN/RM                8        30           160
r3.2xlarge        2  Service Master       8        61           160
m4.10xlarge      18  Worker              40       160         4,000

Page 55: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

M4.10xlarge w/ st1 prototype

Instance Type   Qty  Role              vCPU  RAM (GB)  Storage (GB)
r3.large          1  Jumpbox              2     15.25            32
r3.large          1  Cloudera Manager     2     15.25            32
r3.xlarge         2  NN/RM                8        30           160
r3.2xlarge        2  Service Master       8        61           160
m4.10xlarge      18  Worker              40       160         8,000
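
To compare the worker fleets in the four configurations above, the Worker rows can be multiplied out. This is a rough sketch under one stated assumption: the Storage (GB) column is read as per-instance storage, which the slides do not say explicitly. The `WorkerFleet` class and its labels are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerFleet:
    label: str
    instance_type: str
    qty: int
    vcpu: int        # per instance, from the slide tables
    ram_gb: float    # per instance, from the slide tables
    storage_gb: int  # ASSUMPTION: the Storage (GB) column is per instance

    @property
    def total_vcpu(self):
        return self.qty * self.vcpu

    @property
    def total_ram_gb(self):
        return self.qty * self.ram_gb

    @property
    def total_storage_tb(self):
        return self.qty * self.storage_gb / 1000

    @property
    def ram_per_vcpu(self):
        # GB of RAM per vCPU, the "CPU to Memory Ratio" concern from the talk
        return self.ram_gb / self.vcpu


# Worker rows copied from the four tables in this section.
FLEETS = [
    WorkerFleet("original", "cc2.8xlarge", 30, 32, 60, 3200),
    WorkerFleet("d2 prototype", "d2.8xl", 10, 36, 244, 48000),
    WorkerFleet("m4 + sc1", "m4.10xlarge", 18, 40, 160, 4000),
    WorkerFleet("m4 + st1", "m4.10xlarge", 18, 40, 160, 8000),
]

if __name__ == "__main__":
    for f in FLEETS:
        print(f"{f.label:13} {f.total_vcpu:5} vCPU  {f.total_ram_gb:7} GB RAM  "
              f"{f.total_storage_tb:6} TB  {f.ram_per_vcpu:.2f} GB/vCPU")
```

Under that per-instance reading, the original cc2.8xlarge workers total 96 TB at under 2 GB of RAM per vCPU, while the m4.10xlarge with st1 prototype totals 144 TB at 4 GB per vCPU.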

Page 56: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Problems no more!

• No more rebuilding Nodes

• 1 critical incident since switch vs. 5 in the year prior to release

• Get to play with kids instead of babysitting cluster

Page 57: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Engineering benefits - capacity

No longer restricted by memory, we now have resources to pursue other tools to improve our reliability and speed:

• Spark

• HBase

• Flafka

• Offloading processing from Amazon Redshift to CDH

More resilient to log volume increases

Can expand storage as requirements change

Page 58: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Financial benefits

[Chart: Total Cost and Cost by Utilization, Cc2 vs. M4; axis from $0.00 to $30,000.00]

[Chart: Cost to Process 1000 Requests, Cc2 vs. M4; axis from $0.00 to $0.10]

Page 59: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Thank you!

Questions?

Page 60: AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

Remember to complete your evaluations!