We are excited to announce Amazon Glacier, a fully managed archive service in the cloud that lets customers store data in 'cold storage' at an extremely competitive price point. Glacier is built to support the same 11 9s durability as S3. We'll take you through Glacier, how it works, where it sits in the storage spectrum, and our planned integration with S3.
Amazon Web Services Update | London
November 2012
Why AWS for storage & archive?
AWS fundamental services
Storage & archive – examples & patterns
Amazon Glacier
Getting to Glacier…
AWS is used in a variety of ways…
Storage & Archive
Estimates it has saved $500,000 in storage expenditures and cut its disk storage array costs in half
Powers applications that allow customers to access historical stock price information
Digital assets and usage data behind publication sites and mobile applications
Stores its vast repository of music to feed to over 15 million active users
You might be able to:
Business & technical drivers
Reduce costs: slash storage & archive budgets by up to 50%
  Reduce CAPEX while dramatically increasing scalability
  Eliminate the need for secondary sites
Reduce on-premise: eliminate on-premise equipment to manage archives
  Eliminate 30%+ of your storage footprint
  Consolidate on-premise and augment with cloud
Change processes: remove the need to do capacity planning
  Eliminate capacity planning
  Eliminate provisioning for peak demand
Remove aging technologies: eliminate tape for backup and archive
  Remove tape archives
  Cycle out aging disk arrays
AWS fundamental services
Fundamental Storage Options: Elastic Block Store, S3 and Glacier
Simple Storage Service (S3): highly scalable object storage
  1 byte to 5TB in size
  99.999999999% durability
Elastic Block Store (EBS): high performance block storage device
  1GB to 1TB in size
  Mount as drives to instances with snapshot/cloning functionalities
Glacier: long term object archive
  Extremely low cost per gigabyte
  99.999999999% durability
Elastic Block Store: very fast 'instance' disks
Simple Storage Service: fast web object storage
Glacier: slow, rare access
Amazon S3
  Archive: data accessed more than ~10% per month; 11 9s durability
  Backup: snapshots; shorter term data backup with rapid RTO
  DR: rapid RTO; expiration policies
Amazon S3 RRS
  Archive: lower cost when 11 9s not required
  Backup: lower cost
  DR: lower cost
Amazon Glacier
  Archive: long term archiving; infrequent data access (less than ~10% of data per month)
  Backup: use policies to move cold backup data for long term retention
  DR: retain a "write once, read never" copy in case of a worst case scenario
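The roughly 10% per month access figure above can be read as a simple rule of thumb for choosing between S3 and Glacier. The sketch below only illustrates that reading: the class name, method name and the exact threshold constant are assumptions made for the example, not part of any AWS API, and it ignores S3 RRS and any cost modelling.

```java
// Hypothetical helper: names and the hard threshold are illustrative assumptions.
class StorageTierSketch {
    enum Tier { S3, GLACIER }

    // Rule of thumb from the slide: about 10% of the data read per month is the dividing line.
    static final double MONTHLY_ACCESS_THRESHOLD = 0.10;

    // Suggests a tier from how much of the stored data is read back in a typical month.
    static Tier suggestTier(double gbStored, double gbReadPerMonth) {
        if (gbStored <= 0) {
            throw new IllegalArgumentException("gbStored must be positive");
        }
        double fractionRead = gbReadPerMonth / gbStored;
        return fractionRead < MONTHLY_ACCESS_THRESHOLD ? Tier.GLACIER : Tier.S3;
    }

    public static void main(String[] args) {
        System.out.println(suggestTier(1000, 20));   // 2% read per month
        System.out.println(suggestTier(1000, 300));  // 30% read per month
    }
}
```

In practice the decision would also weigh the retrieval delay and retrieval fees, so treat the threshold as a starting point rather than a rule.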
Use case journey
On-premise: locally accessible file systems; workloads with local data
  High IO performance, high network performance
On-instance: EC2 based applications; DR deployments
  High IO performance, Provisioned IOPS, Backup & Restore
Object level: data distribution; durable media storage
  Good performance, high durability, scalability
Long term: system images; database backups; data archives
  Very low price, high durability, slow access
1. Getting data into the cloud
Direct Connect, Import/Export and Storage Gateway
AWS Direct Connect: dedicated bandwidth between your site and AWS
AWS Storage Gateway: shrink-wrapped gateway for volume synchronization
AWS Import/Export: physical transfer of media into and out of AWS
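Whether a network link or physical media is the better route usually comes down to how long the transfer would take. The arithmetic can be sketched as below; the class name, the utilization parameter and the example figures are assumptions chosen for illustration, not AWS guidance.

```java
// Back-of-the-envelope estimate; all names and figures here are illustrative assumptions.
class TransferTimeSketch {
    // Days needed to move `terabytes` (decimal TB) over a link of `megabitsPerSecond`,
    // assuming only a fraction `utilization` (0 to 1) of the link is usable for the transfer.
    static double transferDays(double terabytes, double megabitsPerSecond, double utilization) {
        if (terabytes < 0 || megabitsPerSecond <= 0 || utilization <= 0 || utilization > 1) {
            throw new IllegalArgumentException("invalid transfer parameters");
        }
        double bits = terabytes * 1e12 * 8;
        double usableBitsPerSecond = megabitsPerSecond * 1e6 * utilization;
        return bits / usableBitsPerSecond / 86_400.0;
    }

    public static void main(String[] args) {
        // 100 TB over a 1 Gbps link at 80% utilization
        System.out.printf("%.1f days%n", transferDays(100, 1000, 0.8));
    }
}
```

If the estimate runs to weeks, shipping media with Import/Export may be worth comparing against the network options.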
2. Disks and data
Curiosity
The mars.jpl.nasa.gov website is based on the open-source
Content Management System (CMS) Railo, running on
Amazon EC2
Shared storage for Railo is provided by Amazon EC2
instances running Gluster on a pool of Amazon Elastic Block
Store (EBS) volumes for consistently high performance
disk I/O.
3. Database as a service
Database services: RDS, SimpleDB, DynamoDB
Relational Database Service (RDS): fully managed database (MySQL, Oracle, MSSQL)
DynamoDB: NoSQL, schemaless, provisioned throughput database
SimpleDB: schemaless, smaller datasets
4. Object serving and storage
Web accessible S3 storage…
You put it in S3; AWS stores it with 99.999999999% durability
Highly scalable web access to objects
Multiple redundant copies in a region
“Spotify needed a storage solution that could scale very quickly without incurring long lead times for upgrades. This led us to cloud storage, and in that market, Amazon Simple Storage Service (Amazon S3) is the most mature large-scale product. Amazon S3 gives us confidence in our ability to expand storage quickly while also providing high data durability.”
Emil Fredriksson, Operations Director
5. Cold storage & archiving
What we heard from you
You love Amazon S3 for its simplicity, security, durability, and performance.
You wanted a highly secure, extremely durable, and extremely cost-effective option for archiving data for years.
The need…
Reliable and cheap storage of data
Data with long
retention periods
Multi-PB, infrequently
accessed data sets
spectrumdata.com.au
Our goals with Glacier…
Redefine data archiving and backup, for as little as $0.01 per gigabyte per month:
  no upfront payments
  a very low price for storage
  ability to scale up and down as needed
Replace physical media for archiving:
  an easy to use storage service that is infinitely scalable
  a secure service for important data assets
  designed for an annual average 99.999999999% durability per saved object
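To get a feel for what 99.999999999% annual durability means at scale, it can be read as an average loss probability of 1e-11 per object per year. The sketch below works through that arithmetic; treating the design target as an independent per-object probability is a simplifying assumption, and the class and method names are made up for the example.

```java
// Illustrative arithmetic only; the independence assumption is a simplification.
class DurabilitySketch {
    // 99.999999999% durability read as an average annual loss probability per object.
    static final double ANNUAL_LOSS_PROBABILITY = 1e-11;

    // Expected number of objects lost per year out of `objectCount` stored objects.
    static double expectedAnnualLosses(long objectCount) {
        return objectCount * ANNUAL_LOSS_PROBABILITY;
    }

    // Average number of years between single-object losses.
    static double averageYearsPerLoss(long objectCount) {
        return 1.0 / expectedAnnualLosses(objectCount);
    }

    public static void main(String[] args) {
        long objects = 10_000_000L;
        System.out.printf("Expected losses per year: %.6f%n", expectedAnnualLosses(objects));
        System.out.printf("Average years per lost object: %.0f%n", averageYearsPerLoss(objects));
    }
}
```

Under these assumptions, ten million stored archives would lose on average about one archive every 10,000 years.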
The solution…
Reliable and cheap storage of data
Same storage durability mechanisms as S3
Trade-off on retrieval time
3-5 hour retrieval time
We assume you won't access often
Offsite archive
Glacier allows you to cost-effectively and securely store enterprise data offsite, making it simple, inexpensive and safe to retain archived data for as long as desired. Common use cases include enterprise data, media assets, and research and scientific data.
Digital preservation
Libraries, historical societies, non-profit organizations and governments are increasing their efforts to preserve valuable but aging digital content such as websites, software source code, video games, user-generated content and other digital artifacts.
Tape replacement
Amazon Glacier is cost competitive, even at scale, and eliminates pain points like capacity planning, capital budgeting and investments, media formats, hardware refreshes, and off-site storage costs, shipping and retrieving.
Good reasons to replace off-site tape archives
100% restore success rate – no broken or missing tapes
No lost tapes and improved security posture
No device or media admin or handling
No capacity planning
Pay as you go
No need for recurrent and risky data migrations
S3        Glacier
Bucket    Vault
Object    Archive
Create vault supported via console
What is an archive?
Any object, such as a photo, video, document or
compressed collection
It is a base unit of storage in Amazon Glacier
Upload an archive in a single request
For large archives use multipart upload API
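For a multipart upload, the archive is split into byte ranges that are uploaded part by part. The sketch below only computes those ranges and does not call the Glacier API. It assumes part sizes of 1 MiB multiplied by a power of two, which is my understanding of the service's constraint and worth checking against the current developer guide; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Range arithmetic only; no AWS calls. Names are illustrative assumptions.
class MultipartRangesSketch {
    static final long MIB = 1024L * 1024L;

    // Assumed constraint: part size is 1 MiB times a power of two.
    static boolean isPowerOfTwoMiB(long partSize) {
        return partSize >= MIB && partSize % MIB == 0 && Long.bitCount(partSize / MIB) == 1;
    }

    // Returns inclusive [start, end] byte ranges that cover the whole archive.
    static List<long[]> partRanges(long archiveSize, long partSize) {
        if (archiveSize <= 0 || !isPowerOfTwoMiB(partSize)) {
            throw new IllegalArgumentException("invalid archive or part size");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < archiveSize; start += partSize) {
            long end = Math.min(start + partSize, archiveSize) - 1;
            ranges.add(new long[] { start, end });
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 10 MiB + 5 byte archive in 4 MiB parts: two full parts and a shorter last part
        for (long[] r : partRanges(10 * MIB + 5, 4 * MIB)) {
            System.out.println("bytes " + r[0] + "-" + r[1]);
        }
    }
}
```

Only the last part may be shorter than the chosen part size; every other range has exactly `partSize` bytes.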
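Java
import java.io.File;
import com.amazonaws.services.glacier.AmazonGlacierClient;
import com.amazonaws.services.glacier.transfer.ArchiveTransferManager;
import com.amazonaws.services.glacier.transfer.UploadResult;

// Glacier client, created from your API credentials (keys)
AmazonGlacierClient client = new AmazonGlacierClient(credentials);
// Region endpoint
client.setEndpoint("https://glacier.us-east-1.amazonaws.com/");
// Transfer manager
ArchiveTransferManager atm = new ArchiveTransferManager(client, credentials);
// Vault name, archive description, and the file to upload
UploadResult result = atm.upload(vaultName, "MyArchive", new File(archiveToUpload));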
.NET
// Transfer manager for the region endpoint
var manager = new ArchiveTransferManager(Amazon.RegionEndpoint.USEast1);
// Vault name, archive description, and the file to upload
string archiveId = manager.Upload(vaultName, "MyArchive", archiveToUpload).ArchiveId;
Retrieval
S3: synchronous, immediate
Glacier: asynchronous, 3-5 hours
Retrieval
1. Initiate a retrieval job
2. After the job completes, download the
bytes
Java: initiate job
// Describe the retrieval: which archive, and the job type
JobParameters jobParameters = new JobParameters()
    .withArchiveId("*** provide an archive id ***")
    .withDescription("archive retrieval")
    .withType("archive-retrieval");
// Initiate the job through the Glacier client
InitiateJobResult initiateJobResult =
    client.initiateJob(new InitiateJobRequest()
        .withJobParameters(jobParameters)
        .withVaultName(vaultName));
// JobID to track
String jobId = initiateJobResult.getJobId();
Track job
Using JobID
1. SNS topic notification
2. Call describeJob
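The second tracking option, polling with the job ID, can be sketched without the SDK as a simple loop. In the sketch below, `JobStatusSource` is a hypothetical stand-in for a call such as describeJob, and `completesAfter` is a fake used only so the example runs on its own. With jobs that take hours, a real poll interval would be minutes rather than milliseconds, and an SNS notification avoids polling altogether.

```java
// Polling pattern only; JobStatusSource is a hypothetical stand-in, not an AWS SDK type.
class JobPollingSketch {
    interface JobStatusSource {
        boolean isCompleted(String jobId);
    }

    // Polls until the job completes or maxPolls is reached.
    // Returns the number of polls used, or -1 if the job did not complete in time.
    static int waitForJob(JobStatusSource source, String jobId, long pollIntervalMillis, int maxPolls) {
        for (int attempt = 1; attempt <= maxPolls; attempt++) {
            if (source.isCompleted(jobId)) {
                return attempt;
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return -1;
            }
        }
        return -1;
    }

    // Fake status source for the example: reports completion on the given poll.
    static JobStatusSource completesAfter(int polls) {
        int[] calls = {0};
        return jobId -> ++calls[0] >= polls;
    }

    public static void main(String[] args) {
        int polls = waitForJob(completesAfter(3), "example-job-id", 10L, 10);
        System.out.println(polls > 0 ? "Job completed after " + polls + " polls" : "Gave up waiting");
    }
}
```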
After 3-5 hours:
Java: download job
// Glacier client, created from your API credentials (keys)
AmazonGlacierClient client = new AmazonGlacierClient(credentials);
// Region endpoint
client.setEndpoint("https://glacier.us-east-1.amazonaws.com/");
// Transfer manager
ArchiveTransferManager atm = new ArchiveTransferManager(client, credentials);
// Vault name, archive id, and the download path
atm.download(vaultName, archiveId, new File(downloadFilePath));
.NET: download job
var manager = new ArchiveTransferManager(Amazon.RegionEndpoint.USEast1);
var options = new DownloadOptions();
// Report progress while the archive streams down
options.StreamTransferProgress += ArchiveDownloadHighLevel.progress;
manager.Download(vaultName, archiveId, downloadFilePath, options);

static int currentPercentage = -1;
static void progress(object sender, StreamTransferProgressArgs args)
{
    // Print only when the percentage changes
    if (args.PercentDone != currentPercentage)
    {
        currentPercentage = args.PercentDone;
        Console.WriteLine("Downloaded {0}%", args.PercentDone);
    }
}
“Every day our genome sequencers produce
terabytes of data. As our company moves into
the clinical space, we face a legal
requirement to archive patient data for years
that would drastically raise the cost of
storage.
Thanks to Amazon Glacier’s secure and
scalable solution, we will be able to provide
cost-effective, long-term storage and thereby
eliminate a barrier to providing whole genome
sequencing for medical treatment of cancer
and other genetic diseases.”
Keith Raffel, Senior Vice President and Chief Commercial Officer, Complete Genomics
“An organization like ours thinks in centuries when it comes to content retention, and long term preservation of our Master Archives is a
critical part of our mission here at NYPR.
Storing these core assets on traditional media such as local disk and off-site tape exposes us to
corruption and even outright-loss of data. We are excited to move our archives to Amazon
Glacier, which will be a better long-term solution.”
Steve Shultis, CTO, New York Public Radio
Use Glacier through S3 APIs
Policy based tiered storage
Desktop clients
S3 integration coming soon
Pricing
Storage: from $0.01 per GB per month
Data In: free
Data Out: tiered (1st GB free)
Retrievals: free up to 5% of average monthly storage, then tiered fees
We anticipate that archives will be accessed infrequently
Storage is cheap; the trade-off is in retrieval pricing
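The storage price and the free retrieval allowance combine into a quick monthly estimate. The sketch below uses the "as little as $0.01 per GB per month" figure and the 5% allowance from the slides; the tiered retrieval and data-out fees are not modelled because the slides do not give them, and how the allowance is metered within a month is also left out. Class and method names are hypothetical.

```java
// Rough estimate from the slide figures; actual prices vary and tiered fees are not modelled.
class GlacierPricingSketch {
    static final double STORAGE_PRICE_PER_GB_MONTH = 0.01; // "as little as" figure from the slides
    static final double FREE_RETRIEVAL_FRACTION = 0.05;    // 5% of average monthly storage

    static double monthlyStorageCost(double storedGb) {
        return storedGb * STORAGE_PRICE_PER_GB_MONTH;
    }

    static double freeRetrievalAllowanceGb(double averageMonthlyStorageGb) {
        return averageMonthlyStorageGb * FREE_RETRIEVAL_FRACTION;
    }

    // Gigabytes of a planned retrieval that would fall outside the free allowance.
    static double billableRetrievalGb(double averageMonthlyStorageGb, double retrievedGb) {
        return Math.max(0.0, retrievedGb - freeRetrievalAllowanceGb(averageMonthlyStorageGb));
    }

    public static void main(String[] args) {
        double storedGb = 50_000; // 50 TB archive
        System.out.printf("Storage: $%.2f per month%n", monthlyStorageCost(storedGb));
        System.out.printf("Free retrieval allowance: %.0f GB%n", freeRetrievalAllowanceGb(storedGb));
        System.out.printf("Billable GB when retrieving 4000 GB: %.0f%n", billableRetrievalGb(storedGb, 4000));
    }
}
```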
Benefits of Amazon Glacier
Low cost: as little as $0.01/GB/month with no up-front capital commitments.
Secure: secure and durable technology platform with industry-recognized certifications and audits.
Durable: average annual durability of 99.999999999% per archive.
Simple: eliminate hardware, software, and capacity planning.
Use multiple services: easily leverage other AWS services once your data is in the AWS cloud.
Flexible: add any amount of data, quickly. Easily expire and delete without handling media.
http://aws.amazon.com/glacier/