© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Guy Farber, AWS Business Development 5/19/2015 Getting Started: Storage with Amazon S3 and Amazon Glacier

AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier


Page 1: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Guy Farber, AWS Business Development

5/19/2015

Getting Started: Storage with

Amazon S3 and Amazon Glacier

Page 2: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Agenda

• AWS Storage Options

• S3 - Scalable object storage

• Glacier - inexpensive archive storage

• Data ingest options

• Main use cases

• Q&A

Page 3: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

AWS Global Infrastructure

11 Regions

28 Availability Zones

52 Edge locations

Control your geographic locality

for performance and compliance

Page 4: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

AWS Storage Choices

• Amazon S3: durable object storage for all types of data

• Amazon EBS: block storage for use with Amazon EC2

• Amazon EFS: file storage for use with Amazon EC2

• Amazon Glacier: archival storage for infrequently accessed data

Economics

• Pay as you go

• No upfront investment

• No commitment

• No risky capacity planning

Easy to Use

• Self service administration

• SDKs for simple integration

Reduce risk

• Durable and secure

• Avoid risks of physical media handling

Agility, Scale

• Reduce time to market

• Focus on your business, not your infrastructure

Page 5: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Amazon S3

Highly durable object storage for all types of data

Internet-scale storage

Grow without limits

Benefit from AWS’s

massive security

investments

Built-in redundancy

Designed for

99.999999999%

durability and 99.99%

availability

Low price per GB

per month

No commitment

No up-front cost
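
To put the design targets above in perspective, the following Python sketch converts them into expected figures. It is a back-of-the-envelope illustration only: it assumes the 99.999999999% durability figure can be read as an annual per-object survival probability and the 99.99% availability figure as a fraction of a 365-day year. The function names are invented for this example.

```python
from fractions import Fraction

# Design targets quoted on the slide, kept as exact fractions to avoid
# floating-point noise at eleven decimal places.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)   # 99.999999999%
AVAILABILITY = Fraction(9_999, 10_000)                   # 99.99%


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost in one year (assumed annual figure)."""
    return object_count * (1 - DURABILITY)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_objects_lost_per_year(object_count)


def max_unavailable_minutes_per_year() -> Fraction:
    """Minutes per 365-day year that fall outside the availability target."""
    return (1 - AVAILABILITY) * 365 * 24 * 60


if __name__ == "__main__":
    print(float(years_per_lost_object(10_000)))        # 10000000.0
    print(float(max_unavailable_minutes_per_year()))   # 52.56
```

Under these assumptions, storing 10,000 objects corresponds to losing one object roughly every 10 million years on average, and the availability target allows for about 53 minutes of unavailability per year.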

Page 6: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 Key Features

Data management

• Cost monitoring and controls

• Lifecycle management

Ease of use

• Programmatic access using AWS SDKs

• REST APIs

• Management Console, AWS CLI

Event notifications

• Delivered using SQS, SNS, or Lambda

• Enable you to trigger workflows, alerts, or other processing

Data protection

• Versioning

• Cross-region replication

Security

• Multi-factor authentication delete

• Flexible access control mechanisms

• Time-limited access to objects

• Access logs

• Multiple client-side and server-side encryption options

Page 7: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier


102% year-over-year increase in

data transfer to and from S3

(Q4 2014 vs Q4 2013, not including Amazon use)

S3 usage

Page 8: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 scalability: buckets and objects

Page 9: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 website: static content

Page 10: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

1 PB raw storage

800 TB usable storage

600 TB allocated storage

400 TB application data

S3 capacity pricing—pay only for what you use!

Amazon S3
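
The capacity funnel on this slide can be expressed as simple arithmetic. The sketch below is illustrative only: it assumes 1 PB = 1,000 TB, and the dictionary and function names are invented for the example.

```python
# Capacity funnel from the slide, in TB. Assumes 1 PB = 1,000 TB.
ON_PREM_FUNNEL_TB = {
    "raw": 1000,
    "usable": 800,
    "allocated": 600,
    "application data": 400,
}


def utilization(funnel: dict) -> float:
    """Fraction of purchased raw capacity that holds application data."""
    return funnel["application data"] / funnel["raw"]


def raw_tb_per_data_tb(funnel: dict) -> float:
    """Raw TB purchased for every TB of application data stored."""
    return funnel["raw"] / funnel["application data"]


if __name__ == "__main__":
    print(f"{utilization(ON_PREM_FUNNEL_TB):.0%}")   # 40%
    print(raw_tb_per_data_tb(ON_PREM_FUNNEL_TB))      # 2.5
```

In this reading, only 40% of the purchased raw capacity ends up holding application data, whereas S3 bills for the 400 TB actually stored.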

Page 11: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 continuous cost reduction

Available through 11 regions globally

Priced at per GB-month rates

8 price reductions since launch

51% average S3 capacity fee

reduction on 4/1/2014

TCO: comparing on-premises to S3

• Can be challenging for some

customers

• We can help!

Page 12: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Reduced redundancy option: 99.99% durability, saves ~20%

Page 13: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Amazon Glacier

Archival storage for infrequently accessed data

Amazon Glacier

is optimized for

infrequent retrieval

Stop managing

physical media

Even lower cost than

Amazon S3;

Same high durability

3-5 hour retrieval latency

5% free tier on retrievals

$0.01 per GB/month

$123 per TB/year

Replace tape libraries, VTLs
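
The two prices on this slide are the same rate in different units. The sketch below shows the conversion, assuming a binary terabyte (1 TB = 1,024 GB) and 12 billing months per year. The constant and function names are invented for the example, and the rate is the 2015 figure from the slide, not current pricing.

```python
from decimal import Decimal

PRICE_PER_GB_MONTH = Decimal("0.01")   # rate quoted on the slide
GB_PER_TB = 1024                       # assumption: binary terabyte
MONTHS_PER_YEAR = 12


def price_per_tb_year(price_per_gb_month: Decimal) -> Decimal:
    """Convert a per-GB-month rate into a per-TB-year rate."""
    return price_per_gb_month * GB_PER_TB * MONTHS_PER_YEAR


if __name__ == "__main__":
    print(price_per_tb_year(PRICE_PER_GB_MONTH))   # 122.88
```

The result, $122.88, rounds to the $123 per TB/year shown above.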

Page 14: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Glacier – 3 ways to ingest data

•Direct Glacier API/SDK

• Direct access to Glacier for deep archives

•S3 lifecycle integration

• Move older data to less expensive archive

tier

•Third party tools and gateways

• Integrate existing backup and archive

applications using an IT-friendly interface

Page 15: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Optimize your storage spending by tiering on AWS

Use Amazon Glacier

for lowest-cost,

durable cold storage

of archival data

Use Amazon S3

for reliable, durable

primary storage

Use Amazon S3 Reduced

Redundancy Storage

for secondary backups

at a lower cost

RRS

Page 16: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 lifecycle policies →

Key prefix “logs/”

Transition objects to Glacier 30 days after creation

Delete 365 days after creation date

<LifecycleConfiguration>
  <Rule>
    <ID>archive-in-30-days</ID>
    <Prefix>logs/</Prefix>
    <Status>Enabled</Status>
    <Transition>
      <Days>30</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
    <Expiration>
      <Days>365</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
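
As a rough illustration of what the rule above means for an individual object, the Python sketch below parses the same configuration and reports the state an object would be in at a given age. It is a simplified local model, not how S3 itself evaluates lifecycle rules (S3 applies its own scheduling and rounding), and the `object_state` function is hypothetical.

```python
import xml.etree.ElementTree as ET

LIFECYCLE_XML = """
<LifecycleConfiguration>
  <Rule>
    <ID>archive-in-30-days</ID>
    <Prefix>logs/</Prefix>
    <Status>Enabled</Status>
    <Transition>
      <Days>30</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
    <Expiration>
      <Days>365</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
"""


def object_state(key: str, age_days: int, config_xml: str = LIFECYCLE_XML) -> str:
    """Return 'STANDARD', a transition storage class, or 'EXPIRED'."""
    root = ET.fromstring(config_xml)
    state = "STANDARD"
    for rule in root.findall("Rule"):
        # Skip disabled rules and rules whose prefix does not match the key.
        if rule.findtext("Status") != "Enabled":
            continue
        if not key.startswith(rule.findtext("Prefix", default="")):
            continue
        expire_days = rule.findtext("Expiration/Days")
        if expire_days is not None and age_days >= int(expire_days):
            return "EXPIRED"
        transition_days = rule.findtext("Transition/Days")
        if transition_days is not None and age_days >= int(transition_days):
            state = rule.findtext("Transition/StorageClass")
    return state


if __name__ == "__main__":
    for age in (10, 30, 365):
        print(age, object_state("logs/app.log", age))
```

An object under `logs/` stays in standard storage until day 30, is reported as `GLACIER` from day 30, and as `EXPIRED` from day 365. Keys outside the prefix are never affected.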

Page 17: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Amazon S3 – advanced

features

Page 18: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

• Preserve, retrieve, and restore every version

of every object stored in your bucket

• S3 automatically adds new versions and

preserves deleted objects with delete markers

• Easily control the number of versions kept by

using lifecycle expiration policies

• Easy to turn on in the AWS Management

Console

S3 versioning

Key = photo.gif

ID = 121212

Key = photo.gif

ID = 111111

Versioning

Enabled

PUT Key = photo.gif
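
The versioning behavior described on this slide can be sketched with a small in-memory model. This is a toy illustration, not the S3 API: the `VersionedBucket` class, its methods, and the sequential version IDs are all invented for the example.

```python
import itertools
from typing import Optional

DELETE_MARKER = object()  # stand-in for an S3 delete marker


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket."""

    def __init__(self) -> None:
        self._versions = {}              # key -> list of (version_id, body)
        self._ids = itertools.count(1)   # hypothetical ID scheme

    def _add(self, key, body) -> str:
        version_id = str(next(self._ids))
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def put(self, key: str, body: bytes) -> str:
        # Every PUT adds a new version; older versions are preserved.
        return self._add(key, body)

    def delete(self, key: str) -> str:
        # A plain delete only places a delete marker on top of the stack.
        return self._add(key, DELETE_MARKER)

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        versions = self._versions.get(key, [])
        if version_id is None:
            # Latest version; None if the key is missing or "deleted".
            if not versions or versions[-1][1] is DELETE_MARKER:
                return None
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not DELETE_MARKER:
                return body
        return None


if __name__ == "__main__":
    bucket = VersionedBucket()
    first = bucket.put("photo.gif", b"old")
    bucket.put("photo.gif", b"new")
    bucket.delete("photo.gif")
    print(bucket.get("photo.gif"))          # None
    print(bucket.get("photo.gif", first))   # b'old'
```

After the delete, a plain read finds nothing, but the earlier version is still retrievable by its ID, which is the protection against accidental deletes and overwrites that versioning provides.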

Page 19: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 cross-region replication

Automated, fast, and reliable asynchronous replication of data across AWS regions

Source

(Virginia)

Destination

(Oregon)

• Only replicates new PUTs. Once

S3 is configured, all new uploads

into a source bucket will be

replicated

• Entire bucket or prefix based

• 1:1 replication between any 2

regions

• Versioning required

Use cases:

• Compliance: store data hundreds of miles apart

• Lower latency: distribute data to regional customers

• Security: create remote replicas managed by separate AWS accounts
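
The "only replicates new PUTs" and "entire bucket or prefix based" points can be illustrated with a toy model. The class below is hypothetical and copies objects immediately, whereas real replication is asynchronous. It only shows which objects end up in the destination.

```python
class ReplicatedBucket:
    """Toy model: a source bucket with an optional replication rule."""

    def __init__(self) -> None:
        self.objects = {}         # source bucket contents
        self.replica = {}         # destination bucket contents
        self.rule_prefix = None   # None means replication is not configured

    def enable_replication(self, prefix: str = "") -> None:
        # An empty prefix replicates the entire bucket.
        self.rule_prefix = prefix

    def put(self, key: str, body: bytes) -> None:
        self.objects[key] = body
        if self.rule_prefix is not None and key.startswith(self.rule_prefix):
            # Real replication is asynchronous; this model copies at once.
            self.replica[key] = body


if __name__ == "__main__":
    bucket = ReplicatedBucket()
    bucket.put("logs/old.log", b"1")       # uploaded before configuration
    bucket.enable_replication("logs/")
    bucket.put("logs/new.log", b"2")       # matches the prefix
    bucket.put("images/cat.gif", b"3")     # outside the prefix
    print(sorted(bucket.replica))          # ['logs/new.log']
```

Objects uploaded before replication was configured, or outside the configured prefix, stay only in the source bucket.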

Page 20: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 event notifications

Delivers notifications to Amazon SNS, Amazon SQS, or AWS

Lambda when events occur in S3

S3

Events

SNS topic

SQS queue

Lambda function

Notifications

Foo() {…}
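
As a sketch of what the Lambda function in this diagram might do, the Python handler below extracts the bucket, key, and event name from each record of an S3 notification. The sample payload is trimmed and hypothetical (a real notification carries more fields), and returning a list of tuples is a choice made only for this example.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Return (bucket, key, event name) for each record in an S3 event."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification payload.
        key = unquote_plus(s3["object"]["key"])
        results.append((bucket, key, record.get("eventName")))
    return results


SAMPLE_EVENT = {   # trimmed, hypothetical payload for local testing
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "logs/2015-05-19+app.log"},
            },
        }
    ]
}

if __name__ == "__main__":
    print(handler(SAMPLE_EVENT, None))
```

A real function would replace the `results.append(...)` line with the workflow, alert, or processing step to trigger.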

Page 21: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 virtual private endpoint (VPCE)

Prior to S3 VPCE

• Public IP on EC2 instances and IGW

• Private IP on EC2 instances and NAT

Using S3 VPCE

• Access S3 using the S3 private endpoint (VPCE) without using NAT instances or gateways

• Increased security

Page 22: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 data encryption options

Client-side encryption using AWS SDKs

• You manage the encryption keys and never send them to AWS

Server-side encryption (SSE) with Amazon S3 managed keys

• “Check-the-box” to encrypt your data at rest. Keys are managed by S3

SSE with customer-provided keys

• You manage your encryption keys and provide them for PUTs and GETs

SSE with AWS Key Management Service managed keys

• Keys managed centrally in AWS KMS with permissions and auditing of usage

For more details – watch Encryption and Key Management in AWS:

https://www.youtube.com/watch?v=uhXalpNzPU4
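
At the REST API level, the three server-side options are selected through request headers. The sketch below builds those headers for a PUT request. The `sse_headers` helper and its mode labels (`SSE-S3`, `SSE-KMS`, `SSE-C`) are invented for this illustration; the AWS SDKs normally set these headers for you.

```python
import base64
import hashlib
from typing import Optional


def sse_headers(mode: str, customer_key: Optional[bytes] = None,
                kms_key_id: Optional[str] = None) -> dict:
    """Build the S3 request headers for one server-side encryption option."""
    if mode == "SSE-S3":
        # Keys managed by S3.
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        # Keys managed in AWS KMS; the key ID is optional.
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        # Customer-provided key, sent with every PUT and GET.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        prefix = "x-amz-server-side-encryption-customer-"
        digest = hashlib.md5(customer_key).digest()
        return {
            prefix + "algorithm": "AES256",
            prefix + "key": base64.b64encode(customer_key).decode("ascii"),
            prefix + "key-MD5": base64.b64encode(digest).decode("ascii"),
        }
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(sse_headers("SSE-S3"))
    print(sorted(sse_headers("SSE-C", customer_key=bytes(32))))
```

Client-side encryption does not appear here because the data is encrypted before it is sent, so S3 needs no encryption headers for it.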

Page 23: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

AWS data ingest options

• AWS Import/Export

• Internet/VPN

• AWS Storage Gateway service

• AWS Direct Connect

Page 24: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

S3 and Glacier use cases

Cloud Storage for web applications

Origin store for content distribution

Staging area and persistent store for Big Data analytics

Backup and archive target

Page 25: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Druva InSync SaaS: Endpoint Data Protection

Druva relies on Amazon

EC2, S3 and DynamoDB

for inSync Cloud - a fully

automated, secure,

enterprise backup solution

“Building inSync Cloud on AWS has meant a faster time to market.”

Milind Borate, Druva CTO

http://aws.amazon.com/solutions/case-studies/druva/

Page 26: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

• S3 can be used as durable

origin for global content

distribution

• Provides single origin for

multiple CDNs, such as

Amazon CloudFront

• Data transfer out of S3 into

CloudFront is now free!

• Optimal for serving static

web assets such as images,

videos and HTML

Single origin storage for content distribution

[Diagram: a single Amazon S3 bucket acting as the origin for multiple edge locations]

Page 27: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Amazon CloudFront edge locations

AWS provides full-site,

or media asset, delivery

via a worldwide content

delivery network (CDN)

called Amazon CloudFront.

Page 28: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

SoundCloud—leveraging S3 and Glacier

for audio transcoding

• World’s leading social sound platform

• Audio files must be transcoded and

stored in multiple formats

S3 via

CloudFront

Glacier

Page 29: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Big Data analytics is a rapidly growing use case…

Page 30: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

• Common staging area for Big

Data analytics jobs

• Use distributed cluster solutions (e.g., MapReduce) to run large-scale processing and analysis of data

• Scale compute resources

without depending on storage

• Leverage a highly available

object store that can be easily

shared by multiple instances

within a cluster

S3 for staging and persistently storing Big Data

[Diagram: Amazon Simple Storage Service (S3) supplies input data to an Amazon EMR job flow and receives its output results. The Amazon EMR job flow runs on a cluster of Amazon EC2 instances and reports metrics to Amazon CloudWatch.]

Page 31: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Netflix, a global video delivery provider,

uses S3 as the storage layer for Hadoop-

based Big Data applications

S3 value:

• 11 9’s of durability

• Use Versioning as protection against accidental

deletes and overwrites

• Grew quickly from a few hundred TB to many PBs

• Access the same data in S3 from multiple Hadoop

clusters

• Tight integration with Amazon EMR

S3 for Big Data: Netflix

Page 32: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

• Eliminate over-purchasing

and provisioning with virtually

limitless capacity

• Enable Information Lifecycle

Management with automated

tiering between S3 and

Glacier

• Ideal for regulatory and

compliance cases

[Diagram: objects rawdata1, rawdata2, and rawdata3 in “My S3 bucket” are archived to Amazon Glacier after 30 days and deleted after 7 years]

Backup and Archive

Page 33: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Easy cloud backup with AWS Storage Gateway

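[Diagram: in the customer datacenter, application servers connect over iSCSI to the AWS Storage Gateway VM, which runs on an on-premises host backed by direct attached or storage area network disks. The gateway works with existing applications and stores data in Amazon S3 through the AWS Storage Gateway service.]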

Page 34: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Replace physical tape with AWS Storage

Gateway-VTL

[Diagram: in the customer datacenter, a backup application uses the SCSI tape protocol over iSCSI to reach the virtual tape library software appliance VM, which runs on an on-premises host backed by direct attached or storage area network disks. Through the AWS Storage Gateway service, the virtual tape library is stored in Amazon S3 and the virtual tape shelf in Amazon Glacier.]

Page 35: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

AWS Technology Partners integrate with S3

and Glacier

Page 36: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Spot Trading implements NetApp SteelStore to

optimize backup process

Spot Trading is a

technology-focused

proprietary trading firm

built on applied

technology, using the

latest in innovation to

solve problems in the

financial markets.

NetApp SteelStore + Amazon S3 value:

• 40 hours/month reclaimed by IT team to

focus on new strategies and systems

• Annual archival cost reduced by 96%

• Two-year ROI for SteelStore appliance,

including cloud storage costs

• $500,000 potential cost avoidance by

eliminating a costly SAN upgrade

• Deduplication reduced dataset by 85%

• Restores in minutes (from cache) or 4–5

hours (from Glacier) vs. days with tape

• Encryption in flight and at rest meets data

security requirements

[Diagram: a NetApp SteelStore cloud-integrated storage appliance in the data center, backed by S3 and Glacier in AWS]

Page 37: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

What’s next?

Getting started with S3 and Glacier:

http://aws.amazon.com/s3/getting-started/

http://aws.amazon.com/glacier/getting-started/

Pricing:

http://aws.amazon.com/s3/pricing/

http://aws.amazon.com/glacier/pricing/

AWS YouTube channel:

https://www.youtube.com/user/AmazonWebServices/playlists

Page 38: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

AWS Summit – Chicago: An exciting, free cloud conference designed to educate and inform new

customers about the AWS platform, best practices and new cloud services.

Details

• July 1, 2015

• Chicago, Illinois

• @ McCormick Place

Featuring

• New product launches

• 36+ sessions, labs, and bootcamps

• Executive and partner networking

Registration is now open

• Come and see what AWS and the cloud can do for you.

Page 39: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

CTA Script

- If you are interested in learning more about how to navigate the cloud to grow

your business - then attend the AWS Summit Chicago, July 1st.

- Register today to learn from technical sessions led by AWS engineers, hear best

practices from AWS customers and partners, and participate in some of the 30+

paid sessions and labs.

- Simply go to

https://aws.amazon.com/summits/chicago/?trkcampaign=summit_chicago_bootcamps&trk=Webinar_slide

to register today.

- Registration is FREE.

TRACKING CODE:

- Listed above.

Page 40: AWS May Webinar Series - Getting Started: Storage with Amazon S3 and Amazon Glacier

Q&A

Learn more at: http://aws.amazon.com/s3/

http://aws.amazon.com/glacier/

[email protected]