The first step in a successful cloud-based media workflow is getting the content transferred and stored. From there you can achieve massive efficiencies for downstream processing and delivery via content access instead of content transfer. In this session you'll learn about best practices for ingesting content to the cloud; relevant AWS partners within the media ecosystem; how to use storage tiers based on the business value of your assets; how to eliminate tape, tape museums, and tech refresh within your long-term archive strategy; and ultimately how to remonetize archived assets.
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Media Content Ingest, Storage, and Archiving with AWS – MED301
John Downey, Amazon Web Services, Business Development Manager - Storage
November 13, 2013
Agenda – Content Ingest, Storage, and Archiving
• AWS components
– Ingest
– Storage
– Archive
• Partner components
– Ingest
– Storage
– Archive
• TCO / ROI considerations
AWS Global Infrastructure
9 Regions
25 Availability Zones
46 Edge locations
AWS Regions and Availability Zones
Customer decides where applications and data reside
• Asia Pacific (Tokyo)
• US West (Oregon)
• EU (Ireland)
• US East (N. Virginia)
• US West (N. California)
• Asia Pacific (Singapore)
• AWS GovCloud (US)
• South America (Sao Paulo)
• Asia Pacific (Sydney)
AWS Ingest Options
• AWS Direct Connect: Dedicated bandwidth between your site and AWS
• AWS Storage Gateway: On-premises storage federation with Amazon S3 and Amazon Glacier
• AWS Import/Export: Physical transfer of media into and out of AWS
AWS Ingest Options – One Common Theme: Parallel Uploads
1. Multipart upload
2. Request rate optimization
3. TCP window scaling
4. TCP selective acknowledgement
AWS has customers that ingest roughly 1 PB per day
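The parallel-upload theme above can be sketched as follows. This is a minimal illustration, not the AWS SDK: the 8 MiB part size, the function names, and the `upload_part` placeholder (which only reports how many bytes a part would carry) are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

PART_SIZE = 8 * 1024 * 1024  # hypothetical part size: 8 MiB


def part_ranges(total_size, part_size=PART_SIZE):
    """Return (part_number, start, end_exclusive) tuples covering the object."""
    ranges = []
    start = 0
    number = 1
    while start < total_size:
        end = min(start + part_size, total_size)
        ranges.append((number, start, end))
        start = end
        number += 1
    return ranges


def upload_part(part):
    """Placeholder for a real per-part upload call; returns (number, bytes)."""
    number, start, end = part
    return number, end - start


def parallel_upload(total_size, workers=4):
    """Send all parts concurrently and return the total number of bytes sent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(upload_part, part_ranges(total_size)))
    return sum(size for _, size in results)
```

In a real workflow the placeholder would be swapped for the storage service's own multipart call, and the worker count tuned to the available bandwidth.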
AWS Ingest Options – AWS Direct Connect
• Reduces costs for bandwidth-heavy workloads
• Private connectivity to AWS
– Physical connection: 1 Gbps or 10 Gbps port
– Logical connections (802.1q VLANs)
Public: To AWS cloud (Amazon EC2, Amazon S3, etc.)
Private: To VPCs
• Consistent network performance
• Compatible with all AWS services
AWS Ingest Options – AWS Direct Connect
Cost
• 1 Gbps port = $0.30/hour
• 10 Gbps port = $2.25/hour
• Data transfer IN = $0
• Data transfer OUT = $0.02 to $0.11 per GB, depending on location
– Can be a significant savings vs. Internet bandwidth out
Locations
• CoreSite 32 Avenue of the Americas, NY
• CoreSite One Wilshire & 900 North Alameda, LA
• Equinix DC1 – DC6 & DC10 - DC11, Ashburn, VA
• Equinix SV1 & SV5, San Jose, CA
• Equinix SE2 & SE3, Seattle, WA
• Equinix SG2, Singapore
• Equinix SY3, Sydney
• Equinix TY2, Tokyo
• Eircom, Clonshaugh
• TelecityGroup Docklands, London
• Terremark NAP do Brasil, Sao Paulo
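As a rough worked example of the Direct Connect prices quoted on this slide, the sketch below estimates a monthly bill. The 730-hour month and the function name are assumptions for illustration; the outbound rate must be supplied by the caller because the slide gives only a range.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month

# Hourly port prices as quoted on the slide (2013).
PORT_RATE_PER_HOUR = {"1Gbps": 0.30, "10Gbps": 2.25}


def direct_connect_monthly_cost(port, gb_out, out_rate_per_gb):
    """Port-hours plus outbound transfer; inbound transfer is free."""
    port_cost = PORT_RATE_PER_HOUR[port] * HOURS_PER_MONTH
    transfer_cost = gb_out * out_rate_per_gb
    return round(port_cost + transfer_cost, 2)


# One 1 Gbps port with 10,000 GB out at the low end of the range ($0.02/GB):
# 0.30 * 730 + 10,000 * 0.02 = 219 + 200 = 419 dollars.
example = direct_connect_monthly_cost("1Gbps", 10_000, 0.02)
```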
AWS Ingest Options – AWS Import/Export
• Rapidly move data into and out of AWS
• Portable storage device shipment to AWS
– eSATA
– USB 2.0 and 3.0 (including USB flash drives)
– 2.5-inch and 3.5-inch internal SATA hard drives
• Supports
– Amazon Elastic Block Store (EBS)
– Amazon Simple Storage Service (S3)
– Amazon Glacier
• Use cases
– Initial content migration
– Content distribution via portable devices
– Disaster recovery
AWS Ingest Options – AWS Import/Export
• Cost
– $80 per storage device handled
– $2.49 per data loading hour
– Standard pricing for Amazon S3, Amazon EBS, and Amazon Glacier
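To weigh shipping a device against sending the same data over the network, a back-of-the-envelope comparison along the following lines can help. The function names, the decimal-terabyte convention, and the example figures (two devices, 20 loading hours, a 100 Mbps link) are assumptions, not values from the talk; only the $80 and $2.49 fees come from the slide.

```python
DEVICE_FEE = 80.00        # per storage device handled (from the slide)
LOADING_HOUR_FEE = 2.49   # per data loading hour (from the slide)


def import_export_fee(devices, loading_hours):
    """Handling and loading fees only; storage is billed at standard rates."""
    return round(devices * DEVICE_FEE + loading_hours * LOADING_HOUR_FEE, 2)


def network_transfer_days(terabytes, link_mbps):
    """Days needed to push the data over a fully utilized link (1 TB = 1e12 bytes)."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 86400


# Hypothetical migration: 2 devices needing 20 loading hours in total.
fee = import_export_fee(2, 20)
# The same 10 TB over a 100 Mbps link takes a little over 9 days.
days = network_transfer_days(10, 100)
```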
AWS Ingest Options – AWS Storage Gateway
• On-premises, virtual iSCSI storage appliance
• Local cache enables low-latency access to data
– Gateway-stored volumes
– Gateway-cached volumes
• Copies data in the form of Amazon EBS snapshots to Amazon S3
• Leverage Amazon S3 server-side encryption
• Recent patch results in up to 5 TB of throughput per day
• Recover to Amazon EBS / Amazon EC2
AWS Ingest Options – AWS Storage Gateway
• Cost (N. Virginia; varies per region)
– Gateway pricing: $125 per activated gateway per month
– Volume pricing: $0.095 per GB per month of data stored
– Snapshot pricing: $0.095 per GB per month of data stored
– Tiered data transfer pricing model: free inbound; $0.12 down to $0.05 per GB outbound, depending on tier
AWS Ingest Options – Gateway-Virtual Tape Library (Gateway VTL)
• On-premises, virtual tape library storage appliance
• 10 virtual tape drives / 1500 virtual tape slots
• 150 TB local cache
– VTL (virtual tape library): restore in seconds
– VTS (virtual tape shelf): 24-hour retrieval
• Encryption in transit and at rest
• Gateway VTL-AMI
• In the lab we achieved 55 MB/s upload throughput and a 90 MB/s iSCSI ingest rate per gateway
AWS Ingest Options – Gateway-Virtual Tape Library (Gateway VTL)
• Cost (N. Virginia; varies per region)
– Gateway pricing: $125 per activated gateway per month
– Virtual tape shelf storage: $0.01 per GB per month of data stored
– Virtual tape library storage: $0.095 per GB per month of data stored
– Retrieval from virtual tape shelf: $0.30 per GB
– Virtual tape deletes: free
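The Gateway VTL price list above combines into a monthly estimate roughly as sketched here. The function name and the example volumes are hypothetical; the rates are the N. Virginia figures quoted on the slide.

```python
GATEWAY_FEE = 125.00      # per activated gateway per month
SHELF_RATE = 0.01         # per GB-month on the virtual tape shelf
LIBRARY_RATE = 0.095      # per GB-month in the virtual tape library
RETRIEVAL_RATE = 0.30     # per GB retrieved from the shelf


def vtl_monthly_cost(gateways, shelf_gb, library_gb, retrieved_gb):
    """Sum the four billable components; virtual tape deletes are free."""
    total = (
        gateways * GATEWAY_FEE
        + shelf_gb * SHELF_RATE
        + library_gb * LIBRARY_RATE
        + retrieved_gb * RETRIEVAL_RATE
    )
    return round(total, 2)


# Hypothetical month: 1 gateway, 100,000 GB shelved, 5,000 GB in the
# library, 1,000 GB retrieved: 125 + 1,000 + 475 + 300 = 1,900 dollars.
example = vtl_monthly_cost(1, 100_000, 5_000, 1_000)
```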
AWS Storage and Archive Options
• Amazon Simple Storage Service (S3): Highly scalable object storage
– 1 byte to 5 TB in size
– 99.999999999% durability
• Amazon Elastic Block Store (EBS): High-performance block storage device
– 1 GB to 1 TB in size
– Mount as drives to instances with snapshot/cloning functionalities
• Amazon Glacier: Long-term object archive
– Extremely low cost per gigabyte
– 99.999999999% durability
AWS Storage Options – Amazon Elastic Block Store (EBS)
• High I/O block storage for Amazon EC2
• Predictably scale to 1000s of IOPS per Amazon EC2 instance
• Automatic replication within the Availability Zone
• 10x more reliable than commodity disk drives
• Point-in-time snapshots
– Amazon S3 durability (11 nines)
• Point-in-time snapshots across regions
• Amazon CloudWatch
– Exposes Amazon EBS performance metrics
AWS Storage Options – Amazon Elastic Block Store (EBS)
Costs (US East)
• Amazon EBS standard volumes
– $0.10 per GB-month of provisioned storage
– $0.10 per 1 million I/O requests
• Amazon EBS provisioned IOPS volumes
– $0.125 per GB-month of provisioned storage
– $0.10 per provisioned IOPS-month
• Amazon EBS snapshots to Amazon S3
– $0.095 per GB-month of data stored
AWS Storage Options – Amazon Simple Storage Service (S3)
• Synchronous in and synchronous out object storage
• Designed for 99.999999999% durability
• Authentication mechanisms ensure data is kept secure
• Multiple encryption options
– Amazon server-side encryption
• Standard storage
• Reduced redundancy storage (RRS)
AWS Storage Options – Amazon S3: Over 2 Trillion Total Objects
AWS Storage Options – Amazon Simple Storage Service (S3)
Costs (US East)
Tier                    Standard Storage    Reduced Redundancy Storage
First 1 TB / month      $0.095 per GB       $0.076 per GB
Next 49 TB / month      $0.080 per GB       $0.064 per GB
Next 450 TB / month     $0.070 per GB       $0.056 per GB
Next 500 TB / month     $0.065 per GB       $0.052 per GB
Next 4000 TB / month    $0.060 per GB       $0.048 per GB
Over 5000 TB / month    $0.055 per GB       $0.037 per GB
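Because the S3 prices above are tiered, the monthly bill is a sum over tiers rather than a single multiplication. The sketch below shows one way to compute it; the function name and the 1 TB = 1024 GB conversion are assumptions made for the example.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

# (tier size in TB, standard $/GB, reduced redundancy $/GB) from the table.
# A size of None marks the final, open-ended tier.
TIERS = [
    (1, 0.095, 0.076),
    (49, 0.080, 0.064),
    (450, 0.070, 0.056),
    (500, 0.065, 0.052),
    (4000, 0.060, 0.048),
    (None, 0.055, 0.037),
]


def s3_monthly_cost(stored_gb, reduced_redundancy=False):
    """Walk the tiers, charging each slice of storage at its own rate."""
    remaining = stored_gb
    total = 0.0
    for tier_tb, standard_rate, rrs_rate in TIERS:
        rate = rrs_rate if reduced_redundancy else standard_rate
        if tier_tb is None:
            tier_gb = remaining
        else:
            tier_gb = min(remaining, tier_tb * GB_PER_TB)
        total += tier_gb * rate
        remaining -= tier_gb
        if remaining <= 0:
            break
    return round(total, 2)
```

For example, 2 TB of standard storage is billed as 1 TB at $0.095 per GB plus 1 TB at $0.080 per GB.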
AWS Archive Options
Amazon Glacier
• $0.01 per GB per month
• Retrievals:
– 5% of monthly average storage (pro-rated daily) free
• Synchronous in
• 3–5 hour asynchronous retrieval
• Designed for 99.999999999% durability
• AES 256 encryption at rest
• Highly scalable
• Reliable
• Authentication mechanisms ensure data is kept secure
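The free retrieval allowance on this slide (5% of monthly average storage, pro-rated daily) works out as in the sketch below. The function name, the 30-day month, and the 300 TB example are assumptions for illustration.

```python
FREE_PERCENT = 5  # share of monthly average storage retrievable for free


def free_retrieval_gb_per_day(average_stored_gb, days_in_month=30):
    """Daily free retrieval allowance, pro-rated evenly across the month."""
    monthly_allowance = average_stored_gb * FREE_PERCENT / 100
    return monthly_allowance / days_in_month


# 300 TB (307,200 GB at 1024 GB per TB) averaged over a 30-day month:
# 5% is 15,360 GB per month, or 512 GB per day.
example = free_retrieval_gb_per_day(300 * 1024, 30)
```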
AWS Archive Options – Object Lifecycle Management: Amazon S3 → Amazon Glacier
• Seamlessly move data from Amazon S3 → Amazon Glacier
• 3–5 hour asynchronous retrieval
• Data lifecycle policies
• Amazon Glacier storage costs $0.01 per GB per month
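A data lifecycle policy of the kind described above can be expressed as a small rule document. The sketch below only builds the dictionary; it is shaped like the `LifecycleConfiguration` argument accepted by boto3's `put_bucket_lifecycle_configuration`, an SDK that postdates this talk. The function name, rule ID, prefix, and 30-day threshold are hypothetical.

```python
def glacier_lifecycle_rule(prefix, days, rule_id="archive-to-glacier"):
    """Build a rule that transitions objects under `prefix` to Glacier
    once they are `days` old. No API call is made here."""
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Hypothetical policy: move everything under "masters/" after 30 days.
config = glacier_lifecycle_rule("masters/", 30)
```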
Partner Ingest Options
• Aspera: Up to 1 Gb/s per instance to AWS
• Signiant: High-speed, network-efficient file transfer, up to 200X faster than FTP with 95+% network efficiency
• CloudBeam: SaaS-based file transfer into and out of AWS
Partner Ingest Options – Aspera On-Demand
• Achieve file transfer speeds that are 1000s of times faster than FTP
• In, out, and across the cloud with enterprise-grade security
• End-to-end security
• Speeds of up to 1 Gbps per AWS instance
• 10 TB per 24 hours
• Scale-out architecture
• Web, mobile, embedded clients
Partner Ingest Options – Attunity CloudBeam
• Simplifies, automates, and accelerates the loading and replication of files from on-premises, heterogeneous sources to and from Amazon S3
• Common use cases:
– Content availability and distribution
– Data analysis (Amazon EMR Hadoop)
– Backup, disaster recovery, and archiving
– Region-to-region replication
Partner Ingest Options – Signiant Media Shuttle
• AWS-based fast file transfer as a service
• 200X faster than FTP
• Separates control layer from the data layer
• Multiple sources and targets including Amazon S3
• Firewall-friendly transfers with autoselecting UDP, TCP, and HTTP transport options
[Diagram labels: NAS (CIFS, NFS); DAS / SAN]
Partner Ingest Options – Cycle Computing DataManager
• Move data from any NAS / file system to Amazon S3 and/or Amazon Glacier
• Clean up expensive, on-premises disk
• Maintain full access to all content
• Reduce or eliminate future data migrations upon hardware refresh
Partner Storage and Archive Solutions
• Avere Systems: Record performance, scale-out, single file system NAS
• Panzura: Cloud-integrated local NAS capabilities for the globally distributed enterprise
Partner Storage and Archive Solutions – Cloud Storage Gateway Solutions
Partner Storage Options – Example: Cloud Storage Gateway – Global Namespace NAS
Partner Storage Options – Avere Systems: Comparing 1,000,000 IOPS Solutions
• Add high-performance, scale-out clustering with any NAS
• Automated tiering
• Separates performance scaling from capacity
• Avere offers the leading $ per IOPS for NAS – $2.3/IOPS
• 80% less total equipment than traditional NAS systems
• Fastest scale-out, single file system (NAS) available
• Linear scaling to millions of operations/sec
• Tens of GB/sec of throughput
[Figure labels: Avere, $2.3 / IOPS; cloud latency, 150 ms]
Partner Storage Options – Avere Systems
• Amazon S3 integration by EOY 2013
• 3-step process:
1. Leverage Avere to accelerate current NAS system
2. Nondisruptive migration to Amazon S3 / Amazon Glacier (FlashMove)
3. Switch mode in Avere to enable primary NAS operations
– Retire older NAS gear
[Diagram: Avere FXT Edge Filer (purpose-built for cloud, enterprise-class scaling, lowest TCO) sits between the compute farm and client workstations and the core filers: legacy NAS on premises and, across the WAN, Amazon S3 and Amazon Glacier on AWS. Legacy NAS is shown as complex, with RAID, volume limits, low utilization, mirror schedules, etc.]
Partner Storage Options – Panzura
• Panzura enables:
• Global sharing: on-premises, hybrid, across AWS regions
• Panzura Amazon Machine Image (AMI)
• Small physical footprint
• Separation of data and metadata
• Data protection
• NAS centralization
• Shift ratio of Opex vs. Capex
TCO: On-Premises Cost Considerations
1. Primary storage hardware (primary / remote site)
2. DR / Remote site storage hardware
3. Raw to utilized storage (both primary and DR)
4. Storage growth (cost of upgrades)
5. Storage management software and 3rd party tools
6. Professional services
7. Hardware maintenance
8. Software maintenance
9. Backup software
10. Backup hardware (primary / remote site)
11. Offsite tape storage / vault
12. Archive software
13. Archive hardware
14. Power
15. Cooling
16. Space
17. Labor
18. Cost of capital
19. Training
20. Asset depreciation
21. Migration
22. Decommission / remove
23. Recycle
Summary
AWS ingest, storage, and archive solutions:
• AWS Import/Export + Amazon S3, Amazon EBS, Amazon Glacier
• AWS Storage Gateway + Amazon S3
• AWS VTL + Amazon S3 + Amazon Glacier
Partner-based ingest solutions:
• Aspera on-demand solution + Amazon S3
• Attunity + Amazon S3
• Signiant Media Shuttle + Amazon S3
• Cycle Computing’s DataManager + Amazon S3 + Amazon Glacier
Partner-based storage / archive solutions:
• Avere Systems + Amazon S3 and Amazon Glacier
• Panzura + Amazon S3 and Amazon Glacier
Glacier at iN DEMAND
Michael Raposa, iN DEMAND
iN DEMAND Intro
• World leader in providing transactional entertainment delivered through television
• Joint venture owned by Comcast, Time Warner, & Cox
• Pay-per-view programming: MLB, NBA, NHL, boxing, MMA, & Howard Stern
The Problem
• 1.5 PB video archive
– World War Z
– Ice Pirates
– Titanic II … Tsunami AND an Iceberg
• Tape storage
– Tape corruption and bit rot
– Lost tapes
– Physical storage: 1.5 PB is a lot of tapes
– Legacy tape formats: LTO-1, 2, 3, 4, 5, etc.
The Problem (cont.)
• Manual asset tracking
– Typical backup system stores file name, date, size
– Important metadata is tracked separately, e.g., bit rate, aspect ratio, closed captioning, dual language, codec
– Inventory issues: What bit rates do we have for Spider Man?
– Multiple storage: “Put it on tape just to be sure”
• Manual archive and restore
– Wait for operator to handle restores; not 24x7
The Problem (cont.)
• Expensive
– Tape operator
– Tape storage
– Yearly tape library maintenance
• Limited scale
– Limited by tape library capacity
– Limited by physical space
The Solution – Mini-DAM
• Limited digital asset management system for Glacier
– Web UI
– Glacier storage
– $50 K
– Hosted at AWS: Amazon EC2, Amazon RDS, Amazon SNS, Amazon SES
– Over 300 TB in Glacier to date
– Adding about 2 TB / day
Tips & Tricks
• Concurrent downloader required
– Users want files FAST!!!
– .NET and Java AWS SDKs have only a single-threaded downloader; max download ca. 160 Mbps
– iN DEMAND wrote a multithreaded downloader
– Added to AWS SDK for Python (boto); max download 600 Mbps
• Per-archive Glacier overhead
– Every Glacier archive has a 32 KB overhead for metadata
– You are charged for this overhead
– For small files that 32 KB starts to add up
– Zip up small files before uploading
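The per-archive overhead point above can be made concrete with a short sketch: it estimates the billed overhead for a given archive count and bundles many small files into a single zip held in memory. The function names and the 1 KB = 1024 bytes convention are assumptions; the sample file names are hypothetical.

```python
import io
import zipfile

OVERHEAD_KB = 32  # per-archive metadata overhead quoted on the slide


def overhead_gb(archive_count):
    """Billed metadata overhead in GB for this many separate archives."""
    return archive_count * OVERHEAD_KB / (1024 * 1024)


def bundle(files):
    """Pack a {name: bytes} mapping into one zip archive held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(files):
            archive.writestr(name, files[name])
    return buffer.getvalue()


# One million small sidecar files uploaded individually carry about
# 30.5 GB of overhead; bundled into one archive they carry 32 KB.
separate = overhead_gb(1_000_000)
blob = bundle({"b.xml": b"<b/>", "a.xml": b"<a/>"})
```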
Tips & Tricks
• Download request timeouts
– 24 hours to download archive
– Queue up requests to ensure that files are downloaded within the 24-hour timeout
• Add the extra encryption to make management happy
– The MPAA loves encryption
– Management loves encryption
– AWS automatically encrypts files at rest in Glacier
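One way to queue retrieval requests so each batch can be downloaded inside the 24-hour window mentioned above is to cap every batch at what the link can move in that time. This is a hypothetical sketch, not iN DEMAND's implementation: the function name, the greedy grouping, and the decimal-gigabyte convention are assumptions, and it presumes a batch is requested only after the previous one has finished downloading.

```python
def batch_retrievals(sizes_gb, rate_mbps, window_hours=24.0):
    """Group archive sizes (GB) into batches that fit the download window."""
    # Mbps -> MB/s -> GB/s -> GB per window (1 GB = 1000 MB).
    capacity_gb = rate_mbps / 8 / 1000 * 3600 * window_hours
    batches = []
    current = []
    used = 0.0
    for size in sizes_gb:
        # Start a new batch when adding this archive would overrun the window.
        if current and used + size > capacity_gb:
            batches.append(current)
            current = []
            used = 0.0
        current.append(size)
        used += size
    if current:
        batches.append(current)
    return batches


# At 600 Mbps roughly 6,480 GB fit in 24 hours, so 4,000 GB goes alone
# and the 3,000 GB and 2,000 GB archives share the next batch.
plan = batch_retrievals([4000, 3000, 2000], 600)
```

An archive larger than the window's capacity still gets a batch of its own, so very large files need a faster link or a separate plan.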
Tips & Tricks
• Checksum files before you upload
– Save MD5 checksum
– Check that file hasn’t already been uploaded to Glacier
– Avoid file duplication
• Track who requests downloads to manage costs
– Fee associated with each download
– Keep employees honest
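The checksum-before-upload tip above might look like the following sketch. The `UploadIndex` class and its in-memory dictionary are hypothetical stand-ins for whatever database the asset manager actually uses; only the use of MD5 comes from the slide.

```python
import hashlib


def md5_hex(chunks):
    """MD5 of a file supplied as an iterable of byte chunks."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


class UploadIndex:
    """Remembers the checksums of files that were already uploaded."""

    def __init__(self):
        self._seen = {}  # checksum -> name of the first file with that content

    def should_upload(self, name, chunks):
        """Return True and record the checksum only for new content."""
        checksum = md5_hex(chunks)
        if checksum in self._seen:
            return False
        self._seen[checksum] = name
        return True


index = UploadIndex()
first = index.should_upload("feature_v1.mpg", [b"same ", b"bytes"])
duplicate = index.should_upload("feature_copy.mpg", [b"same bytes"])
```

Reading the file in chunks keeps memory use flat for large video files, and the digest is the same however the bytes are split.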