OpenStack: The OpenSource Cloud’s Application in High Energy Physics


That Title’s Overstated

OpenStack: The OpenSource Cloud’s Potential Application in Data Intensive Research

Not as Catchy...

Caveats

» I am not a storage or network engineer

» I am not a scientist

I am:

» a Technical Product Manager.

» Dashboard Developer

» working for piston{cloud}computing

» Pragmatic.

» despite illusions of grandeur.

What is OpenStack?

» Founded by NASA and Rackspace

» The open source cloud computing platform

» Feature-rich and massively scalable

» Powers cloud storage, compute, and networking

» A world-wide open source collaboration

OpenStack as a Cloud OS

» Creates pools of resources

» Automates the network

» Connects to apps via APIs

» Self-service portals for users

[Diagram: APPS, USERS, and ADMINS around the CLOUD OPERATING SYSTEM]

Benefits of OpenStack as a Common Platform

» Easy to migrate data and applications across clouds, based on:
  » security policies
  » economics
  » research needs

» No vendor lock-in

» Common Layer of Data Exchange

» Less exposed to security issues than public cloud, but still interoperable.

3 Major OpenStack Components

» OpenStack Compute/Nova: provision and manage large networks of virtual machines

» OpenStack Object Store/Swift: Create petabytes of reliable storage using standard servers

» OpenStack Image Service/Glance: Catalog and manage large libraries of server images

+

» Other components: Dashboard, Load Balancing, Authentication...

Compute/Nova Key Features

1. REST-based API

2. Horizontally and massively scalable

3. Hardware agnostic: supports a variety of standard commodity hardware.

4. Hypervisor Agnostic: support for Xen, Citrix XenServer, Microsoft Hyper-V, KVM, UML, LXC and ESX

[Diagram: HOST 1, HOST 2, HOST 3, HOST 4, etc., each running VMs on a hypervisor]

» Hypervisor: turns 1 server into many “virtual machines” (instances or VMs) (VMware ESX, Citrix XenServer, KVM, etc.)

» Hypervisors provide abstraction layer between apps and hardware (SERVERS)

» OpenStack pools servers; you run operating systems and applications on VMs instead of physical computers

Nova close up

» nova-api daemon
  » endpoint for all OpenStack or EC2 API queries

» nova-scheduler process
  » takes a virtual machine instance request from the queue and determines which compute server host it should run on
  » a pluggable architecture allowing custom scheduling algorithms

» nova-compute process
  » worker daemon that creates and terminates virtual machine instances
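To make the "pluggable scheduler" idea concrete, here is a minimal sketch in Python. It is not Nova's code or API: `Host`, `InstanceRequest`, `schedule`, and both picking functions are hypothetical names invented for illustration. It only shows the shape of the idea, where requests come off a queue and a swappable function decides which host runs each one.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Host:
    # Hypothetical view of a compute host; not a Nova data structure.
    name: str
    free_ram_mb: int
    running_vms: int


@dataclass
class InstanceRequest:
    ram_mb: int


# A "scheduling algorithm" is just a function that picks a host.
Picker = Callable[[InstanceRequest, List[Host]], Host]


def _fits(request: InstanceRequest, hosts: List[Host]) -> List[Host]:
    candidates = [h for h in hosts if h.free_ram_mb >= request.ram_mb]
    if not candidates:
        raise RuntimeError("no host has enough free RAM")
    return candidates


def most_free_ram(request: InstanceRequest, hosts: List[Host]) -> Host:
    """Pick the host with the most free RAM that can fit the request."""
    return max(_fits(request, hosts), key=lambda h: h.free_ram_mb)


def fewest_vms(request: InstanceRequest, hosts: List[Host]) -> Host:
    """Pick the least crowded host that can fit the request."""
    return min(_fits(request, hosts), key=lambda h: h.running_vms)


def schedule(queue: List[InstanceRequest], hosts: List[Host],
             pick: Picker) -> List[Tuple[InstanceRequest, str]]:
    """Drain the queue, placing each request with the plugged-in picker."""
    placements = []
    for request in queue:
        host = pick(request, hosts)
        host.free_ram_mb -= request.ram_mb
        host.running_vms += 1
        placements.append((request, host.name))
    return placements
```

Swapping `most_free_ram` for `fewest_vms` changes the placement policy without touching `schedule`, which is the point of a pluggable design.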

We mentioned Commodity. How Commodity?

Commodity Hardware

» Piston Silicon Mechanics
  » 2 Intel Xeon processors 5600 Series
  » 96GB of DDR3 RAM
  » 24TB of SATA storage
  » Redundant 1200W power supplies
  » 2U rackmount chassis

» That’s what our clients get. We’re on:
  » 32GB, 16TB, 2 Intel Xeon E5645 processors
  » DevOps borrowed the rest for other machines

Performance: 500 VM Spin Up

» Assuming:
  » 500 copies of one 8GB image
  » Image warm on the nodes
  » 50 VMs/Server

» Based on NASA’s experience in regular use, less than 30 seconds

» Worst case:
  » Image is still in Glance
  » VM has to be copied via HTTP
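The numbers on this slide imply some simple arithmetic, sketched below in Python. The VM count, VMs per server, and 8 GB image size come from the slide. The 1 Gbit/s link speed and the idea that each server pulls the image from Glance exactly once are assumptions made purely for illustration, and the estimate ignores HTTP and disk overhead.

```python
import math


def servers_needed(vm_count: int, vms_per_server: int) -> int:
    """Number of servers required to host vm_count VMs."""
    return math.ceil(vm_count / vms_per_server)


def image_copy_seconds(image_gb: float, link_gbit_per_s: float) -> float:
    """Idealised time to move one image over one link (no protocol overhead)."""
    return image_gb * 8 / link_gbit_per_s


IMAGE_GB = 8                 # from the slide
ASSUMED_LINK_GBIT = 1.0      # ASSUMPTION: not stated in the talk

servers = servers_needed(500, 50)
# ASSUMPTION: worst case, every server fetches the image once over HTTP.
total_transfer_gb = servers * IMAGE_GB
per_server_seconds = image_copy_seconds(IMAGE_GB, ASSUMED_LINK_GBIT)

print(servers, total_transfer_gb, per_server_seconds)
```

Under these assumptions the cold-image case is dominated by the image copy rather than by VM start-up, which is why keeping the image warm on the nodes matters.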

Image Service/Glance

1. Store & retrieve VM images

2. REST-based API

3. Compatible with all common image formats

4. Storage agnostic: Store images locally, or use OpenStack Object Storage, HTTP, or S3

Storage/Swift Key Features

1. REST-based API

2. Data distributed evenly throughout system.

3. Runs on commodity hardware

4. Scalable to multiple petabytes, billions of objects

5. No central database required

6. Account/Container/Object structure (not file system, no nesting) plus Replication (N copies of accounts, containers, objects)
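The Account/Container/Object structure shows up directly in the REST paths, which take the form `/v1/{account}/{container}/{object}`. The Python sketch below builds and splits such a path. The account, container, and object names are made up for illustration, and the helper functions are not part of any Swift client library.

```python
from urllib.parse import quote


def object_path(account: str, container: str, obj: str) -> str:
    """Build the REST path for an object; names here are illustrative."""
    return "/v1/{}/{}/{}".format(
        quote(account, safe=""),
        quote(container, safe=""),
        quote(obj),  # "/" is kept: it is part of the object NAME, not a folder
    )


def split_path(path: str):
    """Split a path back into (account, container, object name)."""
    _, version, account, container, obj = path.split("/", 4)
    return account, container, obj


path = object_path("AUTH_demo", "run2011", "muon/event-001.root")
print(path)
print(split_path(path))
```

Note that `muon/event-001.root` is a single flat object name. The slash looks like a directory, but there is no nesting underneath the container.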

The Storage Story: Nova

» Nova/Compute has its own storage

» Block Storage or Nova-volume
  » an iSCSI solution
  » uses the Logical Volume Manager (LVM) for Linux
  » intended for read/write purposes (databases, logs, etc.)
  » basically an LVM/iSCSI implementation to mount block devices in VMs

The Storage Story: Swift

» Swift: Object Storage
  » Fully Distributed
  » Commodity Hardware (Linux/x86)
  » Data Protection in Software
  » Not a File System
  » Not SAN/NAS/DAS... or any attached storage
  » Optimized for Scale - Petabytes

Swift in Production

» Swift has been running in production at Rackspace for over a year with near 100% uptime.

» Rackspace’s Swift clusters store billions of objects and petabytes of data.

» Internap, KT, SDSC, and HP are also running Swift in production

Sharing the Research

[Diagram: Location A and Location B each run a Private Cloud with Swift, reached through the OS or EC2 API]

A common software platform makes federation possible through a shared API.

To federate Swift across locations, you write a scheduler within OpenStack and drive it through the API.

Swift Components

Proxy Servers

Clients

Account Servers

Container Servers

Object Servers

Rings

Swift Components

» Proxy Server
  » Ties together the Swift architecture
  » Request routing
  » Exposes the public API

Swift Components

» The Ring: Maps names to entities (accounts, containers, objects) on disk.
  » Stores data based on zones, devices, partitions, and replicas
  » Weights can be used to balance the distribution of partitions
  » Used by the Proxy Server and many background processes
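The ring idea can be sketched in a few lines of Python: hash the entity's name, keep the top bits as a partition number, then look up which devices hold that partition. This is a simplified illustration, not Swift's ring builder. The device list, the tiny partition count, and the weight-as-repeated-slots trick are all assumptions chosen to keep the example short.

```python
import hashlib
import struct

PART_POWER = 4   # 2**4 = 16 partitions (toy value, ASSUMED)
REPLICAS = 3

# Hypothetical devices: (name, zone, integer weight)
DEVICES = [("d0", 1, 1), ("d1", 2, 1), ("d2", 3, 1), ("d3", 4, 2)]


def partition_for(account, container=None, obj=None):
    """Map an account/container/object name to a partition number."""
    parts = [p for p in (account, container, obj) if p is not None]
    path = "/" + "/".join(parts)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top32 = struct.unpack(">I", digest[:4])[0]
    return top32 >> (32 - PART_POWER)   # keep only the top PART_POWER bits


def devices_for(partition, devices=DEVICES, replicas=REPLICAS):
    """Pick `replicas` devices in distinct zones for a partition.

    A heavier device occupies more slots, so it is chosen more often.
    """
    slots = []
    for name, zone, weight in devices:
        slots.extend([(name, zone)] * weight)
    chosen, used_zones = [], set()
    for i in range(len(slots)):
        name, zone = slots[(partition + i) % len(slots)]
        if zone not in used_zones:
            chosen.append(name)
            used_zones.add(zone)
        if len(chosen) == replicas:
            break
    return chosen


part = partition_for("AUTH_demo", "run2011", "event-001.root")
print(part, devices_for(part))
```

Because the mapping is computed from the name alone, any proxy or background process holding the same ring data reaches the same devices without consulting a central database.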

Swift Components...

» Object Server:
  » Blob storage server
  » metadata kept in xattrs
  » data in binary format
  » Object location based on name & timestamp hash

Swift & Large Object Storage

» default 5GB limit on the size of an uploaded object

» segmentation makes the download size of a single object virtually unlimited

» segments of the large object are uploaded and a special manifest file is created

» when downloaded, all segments are concatenated as a single object.

» greater upload speed
  » possible parallel uploads of segments.
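The segment-plus-manifest scheme can be illustrated with an in-memory toy in Python. A plain dict stands in for the object store, and `upload_large`, `download_large`, and the `_segments` container naming are hypothetical choices for this sketch, not Swift client calls. In Swift itself the manifest is, to my understanding, an object carrying an `X-Object-Manifest` header that points at the segment prefix.

```python
def upload_large(store, container, name, data, segment_size):
    """Split data into segments, store them, then write a manifest entry."""
    prefix = "{}_segments/{}/".format(container, name)
    count = 0
    for offset in range(0, len(data), segment_size):
        # Zero-padded names keep segments in order when sorted.
        store[prefix + "{:08d}".format(count)] = data[offset:offset + segment_size]
        count += 1
    # The manifest holds no data, only a pointer to the segment prefix.
    store["{}/{}".format(container, name)] = {"manifest_prefix": prefix}
    return count


def download_large(store, container, name):
    """Follow the manifest and concatenate all segments in name order."""
    manifest = store["{}/{}".format(container, name)]
    prefix = manifest["manifest_prefix"]
    keys = sorted(k for k in store if k.startswith(prefix))
    return b"".join(store[k] for k in keys)


store = {}
data = bytes(range(256)) * 10          # 2560 bytes of sample data
segments = upload_large(store, "run2011", "big.dat", data, 1000)
assert download_large(store, "run2011", "big.dat") == data
```

Each segment is an independent object, so segments could be uploaded in parallel, which is where the upload speed gain comes from.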

But Wait, Swift...

» Doesn’t load balance for frequently requested objects.
  » throw Varnish Cache or Squid Proxy in front of Swift

» Has a “simple” ReSTful API

» Wasn't intended for storing unknown data

» Isn’t searchable

» Is like Amazon’s S3

Potential Solutions for Those Needing to Search Data

» Or wait...
  » Swift’s blueprints include searchable metadata
  » https://blueprints.launchpad.net/swift/+spec/future-searchable-metadata
  » Contribute to the greater community

What’s Piston Doing Differently?

» Piston Enterprise OS:

» A hardened cloud operating system built on OpenStack™

» Optimized for secure and easy operation of enterprise private clouds

» Fully supports interoperability with other OpenStack™ powered public and private cloud solutions.

{pentOS}™ features

{CloudKey}™

» Two-factor capable physical authentication

» Minimizes security risk of administrative logins

» Hands-free install in under 5 minutes

Null-Tier [Architecture]™

» Storage, compute and networking on every node

» Massively scalable

» Automated scaling

{pentOS}™ Null-Tier [Architecture]™

[Diagram: Server<1> through Server<N>, each providing Networking, Storage, Compute, and Management, connected by Top of Rack Switches. Labels: highly available {pentOS} controllers; highly available virtual machines; highly available virtual storage; hands-free OS install and configuration with {CloudKey}™]

Contact

» Neil Johnston
  » email: neil@pistoncloud.com
  » twitter: @neiljohnston

Or my co-authors:

» Joshua McKenty
  » email: josh@pistoncloud.com

» Christopher MacGown
  » email: chris@pistoncloud.com