© 2011 Accenture. All rights reserved. Accenture, its logo, and Accenture “High performance. Delivered.” are trademarks of Accenture.
Amazon Web Services
In the Cloud Computing Landscape
AS&T - Cloud Computing
Who am I ?
Lode Blomme
Work
• Accenture since August 2011
• Technology Architecture Consultant
Social Media
• Twitter : @lodeblomme
• LinkedIn : http://linkedin.com/in/lodeblomme
Keywords
• architecture – cloud computing – photography – PHP – web 2.0 – web services
Project Context
Company Overview
• Small startup company
• Community website about outdoor navigation
• Web services for other outdoor navigation websites
• Active in Western Europe
Attention Points
• Agility is important
• No large capital for investments
Scalability
• A lot of traffic in summer (avg 25k visits / day)
• A lot less traffic in winter (avg 5k visits / day)
• A lot of traffic during the day
• A lot less traffic during the night
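The seasonal swing can be put in numbers. This is a minimal sketch that uses only the two averages from the slide. The even spread over 24 hours is a simplifying assumption, since the slide notes that daytime traffic is higher than nighttime traffic.

```python
# Average visits per day, taken from the slide.
SUMMER_VISITS_PER_DAY = 25_000
WINTER_VISITS_PER_DAY = 5_000


def seasonal_ratio(summer: int, winter: int) -> float:
    """How many times more traffic summer brings than winter."""
    return summer / winter


def average_per_hour(visits_per_day: int) -> float:
    """Average hourly visits, assuming an even spread over 24 hours."""
    return visits_per_day / 24


ratio = seasonal_ratio(SUMMER_VISITS_PER_DAY, WINTER_VISITS_PER_DAY)
print(f"Summer traffic is {ratio:.0f}x winter traffic")
print(f"Summer average: ~{average_per_hour(SUMMER_VISITS_PER_DAY):.0f} visits/hour")
```

A fivefold difference between seasons is the kind of gap that makes paying only for the capacity in use attractive for a company without large capital.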
TECHNOLOGY FOCUS
Mirror mirror on the wall, what is the best technology of them all ?
Cloud File Storage Comparison
• name: S3
• technology: proprietary
• physical locations: US East, US West, Ireland, Singapore, Tokyo
• name: Cloud Files
• technology: OpenStack
• physical locations: US & UK
Cloud File Storage Pricing
[Charts: AWS S3 vs. Rackspace pricing, one panel for Storage ($ / TB) and one for Data Transfer ($ / TB)]
Cloud Servers Comparison
• name: EC2
• billing: hourly
• stop server: yes (thanks to EBS)
• storage size: independent of machine power (thanks to EBS)
• technology: Xen
• interface: UI or API
• physical locations: US East, US West, Ireland, Singapore, Tokyo
• name: Cloud Servers
• billing: hourly
• stop server: no
• storage size: linked to machine power
• technology: Xen
• interface: UI or API
• physical locations: US & UK
Technologies Used
Amazon S3 (Cloud File Storage)
Amazon EC2 (Cloud Servers) + EBS + Elastic IP
• Ubuntu Linux 8.04 – 11.04
• Apache Web Server 2.2
• Nginx 0.5 – 0.8
• PHP 5.2 – 5.3
• PostgreSQL 8.2 – 8.4
Amazon RDS
• MySQL 5.1 – 5.5
Dedicated Servers
• Same as Amazon EC2
Amazon EC2 Machine Image (AMI)
[Diagram labels: Virtual Host, VM, virtual disk, AMI, HD, S3]
Amazon EC2 Ephemeral Storage
Root disk +
• Micro Instance : none
• Small Instance : 160 GB
• Large Instance : 850 GB
• Extra Large Instance : 1690 GB
• High-Memory Extra Large Instance : 420 GB
• High-Memory Double Extra Large Instance : 850 GB
• High-Memory Quadruple Extra Large Instance : 1690 GB
• High-CPU Medium Instance : 350 GB
• High-CPU Extra Large Instance : 1690 GB
• Cluster Compute Quadruple Extra Large Instance : 1690 GB
• Cluster Compute Eight Extra Large Instance : 3370 GB
• Cluster GPU Quadruple Extra Large Instance : 1690 GB
Height Information Web Service
[Diagram labels: VM, Virtual HD, S3]
Scalable Height Information Web Service
[Diagram labels: S3 and six VMs]
EC2 Elastic Block Store (EBS)
= Virtual disk
~ SAN
• Persistent
• Variable size
• Attach to VM
• Improve performance with RAID
• Performance is not outstanding
EC2 EBS AMI
[Diagram labels: Virtual Host, VM, virtual disk, AMI, HD, EBS]
Static IP
[Diagram labels: client, DNS, server 1, server 2]
EC2 Elastic IP
[Diagram labels: client, DNS, Elastic IP, server 1, server 2]
AMAZON RDS
Amazon Relational Database Service
Easily Launch MySQL & Oracle Databases
Easy Multi-AZ Deployment
Easy Read Replica Creation
Why Amazon RDS
Pros :
• Automatic software upgrades
• Automatic backups
• Create new RDS instance from any point-in-time backup
• Multi-AZ deployment
• Easy read replica creation
Cons :
• More expensive than running MySQL on EC2 yourself
Lessons Learned
Pro
• No traffic cost between S3 and EC2 when in the same region
• High-speed Amazon network when in the same region
• No waiting time for hardware
• Easy to clone an existing running server
• Easy to add/remove storage
• Easy to replace a server without downtime
Con
• Pay extra for support
• Disk I/O is not top-notch (Ephemeral Storage is faster than EBS)
Project Numbers
Peak number of instances :
• 2 RDS MySQL databases
• 3 EC2 instances running Memcached
• 3 EC2 instances running PostgreSQL database
• 10 EC2 instances running Apache & PHP
Storage requirements :
• +/- 75 GB on S3
• +/- 1 TB on EBS
PRIVATE CLOUD
Your own Amazon EC2 and S3
Private Cloud Amazon EC2
Nimbula Director (http://nimbula.com/)
• From the people behind Amazon EC2
• Uses KVM as hypervisor
• Runs on CentOS
Eucalyptus (http://www.eucalyptus.com/)
• Open Source Software
• AWS Interface Compatibility
• Xen and KVM Hypervisor Support
OpenStack Compute (http://www.openstack.org/projects/compute/)
• Open Source Software
Private Cloud Amazon S3
AmpliStor (http://www.amplidata.com/)
• Belgian Company
Gluster (http://www.gluster.org/)
• Acquired by Red Hat
• Runs on CentOS
OpenStack Object Storage (http://www.openstack.org/projects/storage/)
• Open Source Software
Q&A
Or we can go and have a drink …
Amazon CloudWatch
• Monitoring for AWS cloud resources like :
  • EC2 instances
  • EBS volumes
  • Elastic Load Balancers
  • RDS DB instances
  • SQS queues
  • SNS topics
  • Custom metrics generated by a customer’s applications and services
• Programmatically retrieve your monitoring data
• View graphs
• Set alarms
Auto Scaling
• Allows you to scale capacity up or down automatically according to conditions you define.
• Particularly well suited for applications that experience hourly, daily, or weekly variability in usage.
• Enabled by Amazon CloudWatch.
• No additional charge beyond Amazon CloudWatch fees.
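The idea of scaling "according to conditions you define" can be sketched as a simple threshold rule. This is an illustration only and not the Auto Scaling API: `ScalingPolicy`, `desired_capacity`, the CPU thresholds and the instance bounds are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy; field names are illustrative, not AWS API names."""
    min_instances: int
    max_instances: int
    scale_up_cpu: float    # add an instance above this average CPU %
    scale_down_cpu: float  # remove an instance below this average CPU %


def desired_capacity(current: int, avg_cpu: float, policy: ScalingPolicy) -> int:
    """Return the instance count after applying one scaling decision."""
    if avg_cpu > policy.scale_up_cpu:
        current += 1
    elif avg_cpu < policy.scale_down_cpu:
        current -= 1
    # Never leave the configured bounds.
    return max(policy.min_instances, min(policy.max_instances, current))


policy = ScalingPolicy(min_instances=2, max_instances=10,
                       scale_up_cpu=70.0, scale_down_cpu=30.0)
print(desired_capacity(4, 85.0, policy))  # busy afternoon: one instance more
print(desired_capacity(4, 10.0, policy))  # quiet night: one instance fewer
```

In the real service the metric comes from Amazon CloudWatch and the rule is evaluated by AWS. The sketch only shows the shape of the decision.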
Application Deployment
• How to get your application running on newly started VMs?
• The number of servers changes constantly, which makes deploying new versions hard.
• Create a Gold Image with OS and application if your application doesn’t change often.
• Create a system that bootstraps your VM when started. Use the same system for application updates :
  • CloudInit package from Canonical
  • Chef from Opscode
  • Puppet from Puppet Labs
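The bootstrap approach can be sketched as a small generator for a first-boot script. CloudInit can run a shell script passed to the instance as user data. Everything specific here is an assumption for illustration: the function name `build_user_data`, the package list, the paths and the download URL are hypothetical placeholders.

```python
def build_user_data(app_version: str, package_url: str) -> str:
    """Build a shell script for a new VM to run at first boot.

    All paths, package names and the URL layout are hypothetical placeholders.
    """
    lines = [
        "#!/bin/bash",
        "set -e",  # stop at the first failing command
        "apt-get update",
        "apt-get install -y apache2 php5",
        f"wget -O /tmp/app.tar.gz {package_url}/app-{app_version}.tar.gz",
        "mkdir -p /var/www/app",
        "tar -xzf /tmp/app.tar.gz -C /var/www/app",
        "service apache2 restart",
    ]
    return "\n".join(lines) + "\n"


script = build_user_data("1.4.2", "https://example.com/releases")
print(script)
```

Because the version is a parameter, the same generator serves both a freshly started VM and an update of a running one, which matches the advice to use one system for both.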
AWS BEYOND IAAS
SIMPLEDB
Non-relational data store
What is SimpleDB
• Highly available, flexible, and scalable non-relational data store
• Automatically creates multiple geographically distributed copies of each data item
• Change data model on the fly
• Data is automatically indexed
• The Data Model: Domains, Items, Attributes and Values
• Consistency Options: Eventually Consistent Reads or Consistent Reads
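The data model of domains, items, attributes and values can be pictured with a small in-memory stand-in. This is a sketch and not the SimpleDB API: the `Domain` class and its `put` and `select` methods are invented for illustration, as are the route names and attributes.

```python
from collections import defaultdict


class Domain:
    """In-memory stand-in for a SimpleDB domain (not the AWS API)."""

    def __init__(self, name):
        self.name = name
        # item name -> attribute name -> set of values
        self.items = defaultdict(lambda: defaultdict(set))

    def put(self, item, attribute, value):
        # Values are stored as strings; an attribute may hold several of them.
        self.items[item][attribute].add(str(value))

    def select(self, attribute, value):
        """Return the names of items whose attribute contains the value."""
        return sorted(name for name, attrs in self.items.items()
                      if str(value) in attrs.get(attribute, ()))


routes = Domain("routes")
routes.put("route-1", "country", "Belgium")
routes.put("route-1", "activity", "hiking")
routes.put("route-1", "activity", "cycling")  # second value, same attribute
routes.put("route-2", "country", "France")
routes.put("route-2", "difficulty", "hard")   # new attribute, no schema change
print(routes.select("activity", "cycling"))
```

Adding `difficulty` to one item without touching the others is what "change data model on the fly" refers to.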
When to use SimpleDB
• You mainly use index and query functions rather than more complex relational database functions
• You don’t want any administrative burden at all in managing your structured data
• You want a service that scales automatically up or down in response to demand, without user intervention
• You require the highest availability and can’t tolerate downtime for data backup or software maintenance
AMAZON RDS
Amazon Relational Database Service
When to use RDS
• You have existing or new applications, code, or tools that require a relational database
• You want native access to a MySQL or Oracle relational database, but prefer to offload the infrastructure management and database administration to AWS
• You like the flexibility of being able to scale your database compute and storage resources with an API call, and only pay for the infrastructure resources you actually consume
What if SimpleDB and RDS don’t fit?
If you :
• Wish to select from a wide variety of database engines
• Want to exert complete administrative control over your database server
You can always use one of the many relational database AMIs. Or you can start your own VM on EC2 and install your choice of database, the way you want it.
ELASTICACHE
in-memory cache in the cloud
What is ElastiCache
• In-memory cache in the cloud
• Memcached-compatible (an alternative to running Memcached on EC2 yourself)
• Uses Amazon CloudWatch for monitoring
Why ElastiCache
Pros :
• Automatic failure detection and recovery
• No change needed in your application when adding/removing caching nodes
Cons :
• More expensive than running Memcached on EC2 yourself
ELASTIC MAPREDUCE
Process vast amounts of data
What is MapReduce?
MapReduce is a software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers.
Say Again?!?
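void map(String name, String document):
    // name: document name, document: document contents
    for each word w in document:
        EmitIntermediate(w, "1");

void reduce(String word, Iterator partialCounts):
    // word: a word, partialCounts: the partial counts emitted for that word
    int sum = 0;
    for each pc in partialCounts:
        sum += ParseInt(pc);
    Emit(word, AsString(sum));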
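The word-count pseudocode can be tried out on a single machine. This is a minimal sketch in plain Python that mimics the map, group and reduce steps in one process. The helper `run_word_count` and the sample documents are assumptions for illustration; a real framework such as Hadoop distributes these steps across a cluster.

```python
from collections import defaultdict


def map_fn(name, document):
    """Emit an intermediate (word, 1) pair for every word in the document."""
    for word in document.split():
        yield word, 1


def reduce_fn(word, partial_counts):
    """Sum all partial counts emitted for one word."""
    return word, sum(partial_counts)


def run_word_count(documents):
    """Single-process stand-in for the framework's grouping (shuffle) step."""
    grouped = defaultdict(list)
    for name, text in documents.items():
        for word, count in map_fn(name, text):
            grouped[word].append(count)
    return dict(reduce_fn(word, counts) for word, counts in grouped.items())


docs = {"a.txt": "the hill the lake", "b.txt": "the path"}
print(run_word_count(docs))
```

Each call to `map_fn` and `reduce_fn` depends only on its own input, which is why the framework can run many of them in parallel on different machines.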
What, Why and How?
• Software used : Hadoop
• Usage scenarios: web indexing, data mining, log file analysis, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics research.
• Development :
  • SQL-like languages, such as Hive and Pig
  • Java, Ruby, Perl, Python, PHP, R, or C++
• Store input data and application logic in Amazon S3.
• Output data is stored in Amazon S3.
Hadoop Ecosphere
Hive
• Data warehouse system for Hadoop. Easy data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. Query the data using a SQL-like language called HiveQL.
Pig
• Platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
Karmasphere Studio
• Graphical environment to develop, debug, deploy and monitor MapReduce jobs from your desktop directly to Amazon Elastic MapReduce.
AMAZON SQS
Amazon Simple Queue Service
What is Amazon SQS?
• Reliable, highly scalable, hosted queue for storing messages.
• Move data between distributed components without losing messages or requiring each component to be always available.
• Accessible through standards-based SOAP and Query interfaces.
• More info : http://aws.amazon.com/sqs/
Amazon SQS Pros and Cons
Pro
• All messages are stored redundantly across multiple servers and data centers.
• Designed to enable an unlimited number of computers to read and write an unlimited number of messages at any time.
Con
• No guaranteed message order (no FIFO, LIFO or priorities).
• Messages are only available for max. 2 weeks.
• Messages can be delivered more than once.
• Max. 64 KB per message.
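Two of the cons, duplicate delivery and no guaranteed order, are usually handled on the consumer side. This is a minimal sketch of an idempotent consumer, assuming every message carries a unique ID. The class `IdempotentConsumer` and the sample messages are hypothetical, and no SQS API is called.

```python
class IdempotentConsumer:
    """Processes each message ID once, even if it is delivered again."""

    def __init__(self, handler):
        self.handler = handler
        self.seen_ids = set()  # a real system would need durable storage here

    def receive(self, message_id, body):
        if message_id in self.seen_ids:
            return False  # duplicate delivery, skip it
        self.handler(body)
        self.seen_ids.add(message_id)
        return True


processed = []
consumer = IdempotentConsumer(processed.append)

# Messages arrive out of order, and "m2" is delivered twice.
deliveries = [("m2", "resize photo 2"),
              ("m1", "resize photo 1"),
              ("m2", "resize photo 2")]
results = [consumer.receive(mid, body) for mid, body in deliveries]
print(processed)
print(results)
```

The handler should also not depend on arrival order, since this sketch only removes duplicates and does not reorder messages.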
AMAZON SNS
Amazon Simple Notification Service
What is Amazon SNS
• Publish messages from an application and immediately deliver them to subscribers or other applications
• Delivers notifications to clients using a “push” mechanism that eliminates the need to periodically check or “poll” for new information and updates.
• Have messages delivered over clients’ protocol of choice:
  • HTTP / HTTPS
  • Email / Email-JSON
  • Amazon SQS
Amazon SNS Pros and Cons
Pro
• All messages are stored redundantly across multiple servers and data centers.
• Designed to meet the needs of the largest and most demanding applications, allowing applications to publish an unlimited number of messages at any time.
Con
• Messages can be delivered more than once.
• Max. 8 KB per message.
• Limit of 100 topics per AWS account.
Amazon SQS vs SNS
• Both messaging services within AWS
• SQS : used by distributed applications to exchange messages
• SNS : send time-critical messages to multiple subscribers
• SQS : polling model
• SNS : push mechanism
• SQS : send and receive messages without requiring each component to be concurrently available.
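The difference between the polling model and the push mechanism can be shown with two small in-memory classes. This is a sketch and not the AWS API: `Queue` and `Topic` are invented stand-ins, and the in-process queue happens to be ordered, which the real SQS does not guarantee.

```python
from collections import deque


class Queue:
    """SQS-style: messages wait until a consumer polls for them."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        self._messages.append(body)

    def poll(self):
        # Returns None when nothing is waiting.
        return self._messages.popleft() if self._messages else None


class Topic:
    """SNS-style: every subscriber is called as soon as a message is published."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, body):
        for callback in self._subscribers:
            callback(body)


queue = Queue()
queue.send("job 1")          # no consumer needs to be running yet

inbox_a, inbox_b = [], []
topic = Topic()
topic.subscribe(inbox_a.append)
topic.subscribe(inbox_b.append)
topic.subscribe(queue.send)  # a queue can itself subscribe to a topic
topic.publish("alert")

print(queue.poll(), queue.poll(), queue.poll())
print(inbox_a, inbox_b)
```

Subscribing the queue to the topic mirrors the Amazon SQS delivery option of SNS: the notification is pushed once and then waits in the queue until a consumer polls.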
Vendor Lock-in
• No problem when using plain EC2, it’s just a virtual server.
• EC2 additions (Elastic IP, Auto Scaling) tie you to AWS.
• Some of Amazon’s PaaS and SaaS offerings restrict you to using AWS.