Building a big IaaS cloud with Apache CloudStack
David Nalley
PMC Member Apache CloudStack
Member, Apache Software Foundation
Twitter: @ke4qqq
New slides at: http://s.apache.org/bigiaas
#whoami
• Apache Software Foundation Member
• Apache CloudStack PMC Member
• Recovering Sysadmin
• Fedora Project Contributor
• Zenoss contributor
• Employed by Citrix in the Open Source Business Office
My questions for you.
Agenda
Justification - 'so what'
Overview of Apache CloudStack
Break
Design choices
Why use cloud?
From a dev point of view the process looks like:
• Start new project
• File ticket for resources....wait....wait....wait
• Get resources that aren't configured....wait...
• Get network access.....get permission....wait
• Get things done.
Why use cloud?
• What IT Ops provides is not what developers want.
• Does not maximize value for the business
Get rid of the waiting!
● Remove constraints - developers empowered to get things done.
● Agility
● Enforce automated processes instead of manual ones
Orchestration/Automation
● Sysadmins and network admins still do so much manually.
● IaaS does not solve all the problems.
Overview
• CloudStack is an open source Infrastructure-as-a-Service (IaaS) orchestration platform that enables users to build, manage and deploy compute cloud environments.
Overview - GUI
CloudStack offers an administrator's Web interface, used for provisioning and managing the cloud, as well as an end-user's Web interface, used for running VMs and managing VM templates.
Overview - API
The CloudStack Web Services Query HTTP API is loosely based on the REST architecture and allows developers to create new management solutions or integrate existing systems with CloudStack. It can return responses as either XML or JSON.
Documented at:
http://cloudstack.apache.org/docs/api
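As a rough illustration of what calling the query API involves, here is a minimal Python sketch of building a signed request URL. It assumes the signing scheme described in the CloudStack developer documentation (sort the parameters, URL-encode the values, lower-case the string, HMAC-SHA1 it with the secret key, then base64-encode). The endpoint, API key and secret below are placeholders, and the helper names `sign_request` and `build_url` are made up for this example.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params: dict, secret_key: str) -> str:
    """Return a base64-encoded HMAC-SHA1 signature for the given parameters."""
    # Sort by parameter name (case-insensitively) and URL-encode each value.
    items = sorted(params.items(), key=lambda kv: kv[0].lower())
    encoded = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in items)
    # The string that gets signed is the lower-cased query string.
    to_sign = encoded.lower()
    digest = hmac.new(secret_key.encode(), to_sign.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def build_url(endpoint: str, params: dict, secret_key: str) -> str:
    """Append the URL-encoded signature to the query string."""
    items = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in items)
    signature = quote(sign_request(params, secret_key), safe="")
    return f"{endpoint}?{query}&signature={signature}"


# Placeholder endpoint and credentials, for illustration only.
params = {"command": "listUsers", "response": "json", "apiKey": "EXAMPLE_API_KEY"}
url = build_url("http://localhost:8080/client/api", params, "EXAMPLE_SECRET")
print(url)
```

Setting `response=json` selects the JSON response format; leaving it out should give XML. Check the API documentation for your CloudStack version before relying on the encoding details.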
Overview – API - EC2
CloudStack also has a native but separate EC2 API Interface. Documented at:
http://cloudstack.apache.org/docs/api
Overview – API – Google Compute Engine
A few CloudStack developers created a lightweight GCE API translation layer as well. Currently a separate project. Downloadable from:
https://github.com/NOPping/gCloud
VM Provisioning
Select Operating System
• Windows, Linux
Select Compute Offering
• CPU & RAM
Select Data Disk Offering
• Volume Size
Select Network Offering
• Network & Services
Create VM
Dashboard
VM Counts
Public IPs
Networks
Latest Events
Virtual Machine Management
Users
Start
Stop
Restart
Destroy
VM Operations
Console Access
• CPU Utilized
• Network Read
• Network Writes
VM Status
Change Service Offering
2 CPUs
1 GB RAM
20 GB
20 Mbps
4 CPUs
4 GB RAM
200 GB
100 Mbps
Volume Management
Add / Delete Volumes
Volume
Create Templates from Volumes
Volume Template
Schedule Snapshots
Hourly
Daily
Weekly
Monthly
Now
View Snapshot History
12/2/2012 7.30 am
….
2/2/2012 7.30 am
VM 1
Network Management
• Create Networks and attach VMs
• Acquire public IP address for NAT & load balancing
• Control traffic to VM using ingress and egress firewall rules
• Set up rules to load balance traffic between VMs
• Configure multi-tier networks
Management Server
MySQL Cloud_db
Zone
Zone
Zone
Secondary Storage
Management Server Deployments
Infrastructure Resources
Back Up DB
Replication
MySQL DB
Management Server
Management Server
Load Balancer
User API
Admin API
• Primary Storage
– Cluster level storage for VMs
– Connected directly to hosts
– NFS, iSCSI, FC and Local
• Secondary Storage
– Zone level storage for template, ISOs and snapshots
– NFS or Object Store via CloudStack System VM
• Templates and ISOs
– Imported into CloudStack
– Can be private or public
Understanding the Role of Storage and Templates
Zone
Secondary Storage
Pod
Cluster
Host
Host
Primary Storage
Template
1. User Requests Instance
2. Provision Optional Network Services
3. Copy instance template from secondary storage to primary storage on appropriate cluster
4. Create any requested data volumes on primary storage for the cluster
5. Create instance
6. Start instance
Provisioning Process
Zone
Secondary Storage
Pod
Cluster
Host
Host
Primary Storage
VM
Template
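The six provisioning steps can be sketched as a small Python model. This is an illustration of the ordering only, not CloudStack code: the `Zone` and `Cluster` classes and the `provision` function are hypothetical, and the sketch assumes that a template already present on a cluster's primary storage is not copied again.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    # Templates and volumes currently on this cluster's primary storage.
    primary_storage: set = field(default_factory=set)


@dataclass
class Zone:
    name: str
    # Templates held on zone-wide secondary storage.
    secondary_storage: set = field(default_factory=set)


def provision(zone, cluster, template, data_volumes=(), network_services=()):
    """Walk the provisioning steps in order and return a log of actions."""
    # Step 1: the user requests an instance.
    log = [f"request instance from template {template}"]
    # Step 2: optional network services.
    for service in network_services:
        log.append(f"provision network service {service}")
    # Step 3: copy the template from secondary to primary storage.
    if template not in zone.secondary_storage:
        raise LookupError(f"{template} is not on secondary storage")
    if template not in cluster.primary_storage:
        cluster.primary_storage.add(template)
        log.append(f"copy {template} to primary storage on {cluster.name}")
    # Step 4: create any requested data volumes on primary storage.
    for volume in data_volumes:
        cluster.primary_storage.add(volume)
        log.append(f"create data volume {volume}")
    # Steps 5 and 6: create and start the instance.
    log.append("create instance")
    log.append("start instance")
    return log


zone = Zone("zone-1", secondary_storage={"rhel-base"})
cluster = Cluster("cluster-1")
for line in provision(zone, cluster, "rhel-base", data_volumes=["data-1"]):
    print(line)
```

A second call against the same cluster skips the copy step in this model, which is why the first instance built from a template tends to be the slow one.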
System VMs
System VMs optimize and scale the datapath on behalf of CloudStack
Stateless, can be destroyed and recreated from database state
Highly Available
Communicates with Management Server over management network
Usually have 3 interfaces: control, guest and public
System VMs
Virtual Router VM
Provides multiple network services
IPAM (DHCP), DNS, NAT, Source NAT, Firewall, PF, VPN
User-data, Meta-data, SSH keys and password change server
Redundancy via VRRP
MS configures VR over SSH
Proxied via the hypervisor on XS and KVM
System VMs
Console Proxy VM
Provides AJAX-style HTTP-only console viewer
Grabs VNC output from hypervisor
Scales out (more spawned) as load increases
Java-based server communicates with MS over message bus
System VMs
Secondary Storage VM
Provides image (template) management services
Download from HTTP file share or Object Storage
Copy between zones
Scale out to handle multiple NFS mounts/Object Stores
Java-based server communicates with MS over message bus
Networking... is the bane of every cloud operator's existence...
● Advanced
● Basic
● Everything else
Networking - Advanced
● VLANs for isolation
● All VLANs in a Pod trunked to hypervisors
● Each account has a dedicated virtual router
● More services (VPN, Firewall, LB, etc.)
Networking - Basic
● Simple, flat, Layer-2 network
● Bridge-based Layer-3 filtering/firewall
● Massively scalable
Networking – Everything else
● GRE Tunnels
● VMware NSX (née Nicira NVP)
● Midokura Midonet
● Stratosphere
● BigSwitch
● VXLAN (in a release this winter)
● Juniper Contrail (in a release this winter)
Installation
● Add yum/apt repo
● yum -y install cloudstack-management
● cloudstack-setup-databases
● cloudstack-setup-management
● Configure......
Design Choices
Self service
● UI
● API
● Some external tool
API or Command-line
cloudmonkey> deploy virtualmachine serviceofferingid=d8611d07-acf5-4cd4-a630-5c4d937ef043 templateid=081358ff-2427-44f8-adcc-1bb002fab361 zoneid=d06193b2-7980-4ad1-bd8-7b2f2eda63c3
curl 'http://localhost:8096/client/api?command=listUsers'
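The same kind of call can be assembled from a script. The sketch below only builds the URL for the integration port used in the curl example (8096 on localhost); the helper name `integration_url` and the placeholder IDs are assumptions made for this illustration. The integration port takes unsigned requests, so it is typically only reachable from the management server itself.

```python
from urllib.parse import urlencode


def integration_url(command, host="localhost", port=8096, **params):
    """Build a query API URL for the unauthenticated integration port."""
    query = urlencode({"command": command, **params})
    return f"http://{host}:{port}/client/api?{query}"


# Equivalent to the curl example.
print(integration_url("listUsers"))

# A deployment request; the IDs here are placeholders, not real UUIDs.
print(
    integration_url(
        "deployVirtualMachine",
        serviceofferingid="SERVICE-OFFERING-ID",
        templateid="TEMPLATE-ID",
        zoneid="ZONE-ID",
    )
)
```

The resulting string can be passed to curl or any HTTP client.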
Config Management deployment
{
  "name": "hadoop_cluster_a",
  "description": "A small hadoop cluster with hbase",
  "version": "1.0",
  "environment": "production",
  "servers": [
    {
      "name": "zookeeper-a, zookeeper-b, zookeeper-c",
      "description": "Zookeeper nodes",
      "template": "rhel-5.6-base",
      "service": "small",
      "port_rules": "2181",
      "run_list": "role[cluster_a], role[zookeeper_server]",
      "actions": [
        { "knife_ssh": ["role:zookeeper_server", "sudo chef-client"] }
      ]
    },
    {
      "name": "hadoop-master",
      "description": "Hadoop master node",
      "template": "rhel-5.6-base",
      "service": "large",
      "networks": "app-net, storage-net",
      "port_rules": "50070, 50030, 60010",
      "run_list": "role[cluster_a], role[hadoop_master], role[hbase_master]"
    },
    {
      "name": "hadoop-worker-a hadoop-worker-b hadoop-worker-c",
      "description": "Hadoop worker nodes",
      "template": "rhel-5.6-base",
      "service": "medium",
      "port_rules": "50075, 50060, 60030",
      "run_list": "role[cluster_a], role[hadoop_worker], role[hbase_regionserver]",
      "actions": [
        { "knife_ssh": ["role:hadoop_master", "sudo chef-client"] },
        { "http_request": "http://${hadoop-master}:50070/index.jsp" }
      ]
    }
  ]
}
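To show how a tool might consume a definition like this, here is a minimal Python sketch that expands each server entry into one VM per name. It uses a trimmed copy of the definition, and it assumes that names may be separated by commas or whitespace, since both appear in the example. The `expand_servers` helper is hypothetical and not part of any real tool.

```python
import json
import re

# Trimmed copy of the stack definition, keeping only the fields used here.
STACK = json.loads("""
{
  "name": "hadoop_cluster_a",
  "servers": [
    {"name": "zookeeper-a, zookeeper-b, zookeeper-c",
     "template": "rhel-5.6-base", "service": "small"},
    {"name": "hadoop-master",
     "template": "rhel-5.6-base", "service": "large"},
    {"name": "hadoop-worker-a hadoop-worker-b hadoop-worker-c",
     "template": "rhel-5.6-base", "service": "medium"}
  ]
}
""")


def expand_servers(stack):
    """Yield one (name, template, service) tuple per VM to deploy."""
    for server in stack["servers"]:
        # Names may be separated by commas, whitespace, or both.
        for name in re.split(r"[,\s]+", server["name"].strip()):
            yield name, server["template"], server["service"]


for name, template, service in expand_servers(STACK):
    print(f"deploy {name}: template={template} service offering={service}")
```

Each tuple corresponds to one VM deployment request. A real tool would then apply the port rules, run list and follow-up actions for each server.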
Use a tool
Usage
● Jevons Paradox
● Plenty of waste possible as well - will developers always destroy a machine when they are done with it?
● Important to show what projects and groups are consuming resources as well as how they are using those resources
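One simple way to make consumption visible is to roll usage up per project. The Python sketch below works on made-up records (project, VM name, CPU cores, hours running); the record layout and the `core_hours_by_project` helper are assumptions for illustration and do not reflect the format of CloudStack's usage data.

```python
from collections import defaultdict

# Hypothetical usage records: (project, vm name, cpu cores, hours running).
RECORDS = [
    ("web", "web-1", 2, 720),
    ("web", "web-2", 2, 720),
    ("analytics", "hadoop-master", 4, 100),
    ("analytics", "scratch-vm", 4, 700),
]


def core_hours_by_project(records):
    """Sum cores * hours for each project."""
    totals = defaultdict(int)
    for project, _vm, cores, hours in records:
        totals[project] += cores * hours
    return dict(totals)


# Print projects from heaviest to lightest consumer.
for project, core_hours in sorted(
    core_hours_by_project(RECORDS).items(), key=lambda kv: kv[1], reverse=True
):
    print(f"{project}: {core_hours} core-hours")
```

A report like this also makes forgotten machines easier to spot, such as the long-running `scratch-vm` in the sample data.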
Storage
● Commodity storage – if you can get away with it.
● Local storage tends to be the best mix of cheap and performant
● No failover - do you need it? If so, use something enterprise-y.
Commodity Networking
● Layer 3 isolation (aka Security Groups)
● VLANs (less of a commodity; still relatively cheap at small scale, but not at large scale)
● Virtual routers (provide DHCP, DNS, LB, Firewall, PF, NAT, etc)
Commodity Hypervisor
● If your scale is below 100 hypervisors – use what you know – and if you don't know, use KVM
● If you have more than 100 hypervisors you should be seriously evaluating XenServer – there's a reason Amazon, Rackspace, and Google use Xen-based hypervisors.
● Use VMware if you already know it or have requirements that demand it.
● Easy to mix and match if necessary.
Limiting Resources
● Limit the number of VMs, snapshots, IP addresses, etc.
● Use 'projects' to share resources
● This means most folks will never have problems, but heaviest users will not be able to interrupt service for others.
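The idea behind per-account limits can be sketched in a few lines of Python. This is a toy model, not CloudStack's implementation: the `Account` class, the resource names and the limit values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    name: str
    limits: dict  # resource name -> maximum allowed count
    usage: dict = field(default_factory=dict)

    def allocate(self, resource, count=1):
        """Record an allocation, refusing it if the limit would be exceeded."""
        used = self.usage.get(resource, 0)
        limit = self.limits.get(resource)  # None means unlimited in this model
        if limit is not None and used + count > limit:
            raise RuntimeError(
                f"{self.name}: {resource} limit of {limit} reached ({used} in use)"
            )
        self.usage[resource] = used + count


dev = Account("dev-team", limits={"vm": 2, "public_ip": 1})
dev.allocate("vm")
dev.allocate("vm")
try:
    dev.allocate("vm")  # third VM exceeds the limit of 2
except RuntimeError as err:
    print(err)
```

Because the check happens before any resource is created, a heavy user hits their own ceiling rather than exhausting capacity that others depend on.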
Questions
Resources
● http://cloudstack.apache.org
● #cloudstack on irc.freenode.net
● http://cloudstack.apache.org/docs