To Infinity and Beyond - 2012 Internet Scale Workloads and Internet-scale Data Center Design: This October 2012 presentation, given at the IBM Europe Symposium in Budapest, takes an advanced look at today's massive Internet-scale workloads and data centers, and dissects which lessons we can, should, and must apply to our own IT shops. We'll examine how Internet scale is very different from a collection of co-located servers, and how these data centers respond to real-time, dynamic, fluid, competitive-advantage-leapfrog Internet business environments. The servers, storage, and software in these Internet-scale data centers use new approaches to work as an end-to-end efficient, flexible, adaptable workflow. Using Google's definitive work on "The Data Center as A Computer - Intro to Warehouse-scale Machine" as a foundation (superb open-license material by Google, 2009), come discover the design, deployment, and lessons that we all must learn from these giants of the Internet. Why and how do they do what they do? Where are they being built? How are they powered and cooled? What are the deployment form factors, design philosophies, and power/cooling/packaging principles and trends, including modular portable container data center architecture? You'll come away knowing what you should apply to your own IT data center infrastructure in 2012 and beyond. My only request when using or referencing this material is that you give full credit to me and IBM as the original authors of this research. That having been said, please spread the good word with good business judgement; we all benefit in today's modern global world.
© 2012 IBM Corporation
sGE06
To Infinity and Beyond: 2012 Internet Scale Workloads and Data Center Design
John Sing, Executive Consultant, IBM Systems and Technology Group
IBM Systems Technical Universities – Budapest, Hungary – October 15-19
IBMTECHU.COM
IBM STG Technical Universities & Conferences web portal
Direct link: ibmtechu.com/hu
KEY FEATURES...
– Create a personal agenda using the agenda planner
– View the agenda and agenda changes
– Use the agenda search to find the sessions and/or
– Download presentations
– Submit Session and Conference Evaluations
Win prizes by submitting evaluations online. The more evaluations submitted, the greater the chance of winning.
John Sing
31 years of experience with IBM in high-end servers, storage, and software
– 2009 - Present: IBM Executive Strategy Consultant: IT Strategy and Planning, Enterprise Large Scale Storage, Internet Scale Workloads and Data Center Design, Big Data Analytics, HA/DR/BC
– 2002-2008: IBM IT Data Center Strategy, Large Scale Systems, Business Continuity, HA/DR/BC, IBM Storage
– 1998-2001: IBM Storage Subsystems Group - Enterprise Storage Server Marketing Manager, Planner for ESS Copy Services (FlashCopy, PPRC, XRC, Metro Mirror, Global Mirror)
– 1994-1998: IBM Hong Kong, IBM China Marketing Specialist for High-End Storage
– 1989-1994: IBM USA Systems Center Specialist for High-End S/390 processors
– 1982-1989: IBM USA Marketing Specialist for S/370, S/390 customers (including VSE and VSE/ESA)
IBM colleagues may access my intranet webpage:
– http://snjgsa.ibm.com/~singj/
You may follow my daily IT research blog:
– http://www.delicious.com/atsf_arizona
You may follow me on Slideshare.net:
– http://www.slideshare.net/johnsing1
My LinkedIn:
– http://www.linkedin.com/in/johnsing
Agenda
Today’s Internet Scale Data Center Landscape
– Where are they? How big? How fast growing?
– What are they being used for? Cloud impact?
– Why understand them?
What is internet data center / warehouse-scale computing?
– How is it different? Workloads?
– Hardware and software?
– How the same?
How best to meld with it / use it / exploit?
– Lessons we can apply from Internet scale computing to traditional IT
– Resources to help you on this journey
This session is the author’s research compilation. Great thanks to Google for the seminal work on which this lecture is based, published in 2009. Download a copy at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Today’s Internet Scale Data Center Landscape
Paraphrased:
“The world has changed. And some things that should not be forgotten, may be lost”.
The Lord of the Rings, Galadriel
Today: two different types of IT
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
[Figure: Transactional IT compared with Internet scale workloads]
Today’s two major IT workload types
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
[Figure: Transactional IT compared with Internet scale workloads]
How to build these two different clouds
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
[Figure: Transactional IT compared with Internet scale workloads]
What You (Consumer) Get with These Clouds:
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Policy-based Clouds and Design-for-fail Clouds are purpose optimized Infrastructure Management solutions
Policy-based Clouds
• Purpose optimized for longer-lived virtual machines managed by Server Administrator
• Centralizes enterprise server virtualization administration tasks
• High degree of flexibility designed to accommodate virtualization of all workloads
• Significant focus on managing availability and QoS for long-lived workloads with a level of isolation
• Characteristics derived from exploiting enterprise class hardware
• Legacy applications
Design-for-fail Clouds
• Purpose optimized for shorter-term virtual machines managed via end-user or automated process
• Decentralized control, embraces eventual consistency, focus on making “good enough” decisions
• High degree of standardization
• Significant focus on ensuring availability of control plane
• Characteristics driven by software
• New applications
What’s happening?
Continually rising worldwide internet bandwidth
• Cisco global IP traffic study and forecast
• http://www.akamai.com/stateoftheinternet/
Has given rise to a pervasive and hyper-growing web services delivery model
– (i.e. “The Cloud”)
The Cloud is provided by data centers with massive amounts of well-connected processors, storage, network
– Amortized across an internet scale user population
– Across multiple workloads
Source: http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/VNI_Hyperconnectivity_WP.html
Bandwidth and the Cloud…..
This new class of large-scale internet and cloud data centers has:
Data volume:
– 10s / 100s of petabytes
Servers:
– 100,000s
Workload can’t fit:
– In a single server / rack of servers
Workload:
– Requires server clusters of 100s, 1000s, 10,000s, or more
Growth of The Cloud by 2014
Means very big shift in resources
And in the way that IT is managed for the enterprise
Source: http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns1175/Cloud_Index_White_Paper.html
How Big is the World? - 1
http://wikibon.org/blog/how-big-is-the-world-of-cloud-computing-infographic/
This is significant. Cheaper:
– Network: 7.1x
– Storage: 5.7x
– Admins: 7.3x
How Big is the World? - 2
http://wikibon.org/blog/how-big-is-the-world-of-cloud-computing-infographic/
We’re going to talk about this
Warehouse Scale Computers
The name for this emerging class of data centers is: – Warehouse-Scale Computer (WSC)
Large portions of hardware and software resource must work together
Only achieved by holistic approach to their design and development
Treat the datacenter itself as one massive computer
Enclosure for this computer looks like a building or warehouse
This session is the author’s research compilation. Great thanks to Google for the seminal work on which this lecture is based. Download a copy at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Here, in plain English, are the fundamentals of the next-generation IT age
Tell me more. How does this compare to my existing data centers?
What are the different workloads that best fit into the two different types?
How best to meet / meld / jointly profit?
OK. Hmmmmmmm……….
Big Positioning picture
[Figure: positioning chart. Axes: storage required (GB, TB, PB) and $ / server. Current IT architectures: Traditional IT, Data Warehouse. Growth areas (mobile, cloud): Big Data, Internet scale.]
Build new, different skill sets
[Figure: positioning chart. Axes: storage required (GB, TB, PB) and $ / server, storage. Current IT architectures (traditional IT workloads): Traditional IT, Data Warehouse. Highly parallelized internet scale architecture (integrated E2E, software centric): Big Data, Internet scale.]
Key strategies
[Figure: positioning chart. Axis: $ / server, storage. Traditional IT architectures (current IT architectures): Traditional IT, Data Warehouse. Internet scale architectures: Big Data, Internet scale.]
Continue to modernize current traditional IT …
Architect new-gen connectors, skills
Architect future expandability
Connect with:
– New generation mobile-enabled workloads
View new gen as a powerful partner
Enable them to view traditional IT as a powerful enabler
Why Warehouse Scale Computers (WSC) might matter to you
While WSCs / Cloud Data Centers might today be considered a niche area
– Their sheer size / cost / architecture is no longer uncommon
– Among large internet companies and cloud co-lo’s
Problems solved by today’s huge Internet-scale IT design-for-fail architectures
– Have already become meaningful to a much larger constituency
Many organizations are already:
– Exploiting similarly architected computers, at a much lower cost (Hadoop is an example)
– Soon, we may have 2000+ cores in a single server
The experience learned building today’s Internet Scale Data centers
– Is useful in preparing your team / company to meld, interact, plan, grow, expand, and exploit the future in your own best interest
– As these potentially ubiquitous next-generation machines and data centers take hold
What is:
Internet Scale Data Center?
Warehouse Scale Computer?
Internet scale computing - what is different?
Traditional Data Center
Co-located machines share security, environmentals
Applications = a few binaries, running on a relatively small number of machines
– 100s of inter-process relationships requiring 100 nanosecond response
Heterogeneous hardware, software
Partitioned resources, managed and scheduled appropriately
Facility and computing equipment designed separately
Warehouse-scale computer (WSC)
Computing system designed to run massive internet services
Highly parallelized applications = 10s of binaries running on 1000s of machines
– 100,000s of independent tasks only requiring 100 microsecond response time (1000x slower)
Homogeneous hardware, system software
Common pool of resources managed centrally
Integrated design of facility and computing machinery
Main difference: this is a different thing, for a different workload type
Credit for all these ideas: Google talk by Luiz André Barroso, June 2011, at the 2011 Federated Computing Research Conference, San Jose, Calif.
Another way to tell them apart
Traditional Data Center
If your storage system has a few petabytes of data
Warehouse-scale computer
If your storage subsystem pages you in the middle of the night
Because you only have a few petabytes of free space left
Credit for all these ideas: Google talk by Luiz André Barroso, June 2011, at the 2011 Federated Computing Research Conference, San Jose, Calif.
Let’s see some of the largest Internet Scale Data Centers
Many are co-location Cloud data centers
Many are true Warehouse Scale Computers
All of them have a very specific Internet web services application profile
http://wikibon.org/blog/wp-content/uploads/2011/10/5-top-data-centers.html
Large Data Centers in past 2 years
10. SUPERNAP, LAS VEGAS, 407,000 SF
9A and 9B. MICROSOFT QUINCY AND SAN ANTONIO DATA CENTERS, 470,000 SF
Container Data Center Architecture
7. PHOENIX ONE, PHOENIX, ARIZ. 538,000 SF
5. MICROSOFT CHICAGO DATA CENTER, CHICAGO, 700,000 SF
2. QTS METRO DATA CENTER, ATLANTA, 990,000 SF
Microsoft’s Chicago Container Data Center
More data centers….
4. NEXT GENERATION DATA EUROPE, WALES 750,000 SF
3. NAP OF THE AMERICAS, MIAMI, 750,000 SF
1. 350 EAST CERMAK, CHICAGO, 1.1 MILLION SQUARE FEET
Consumes 100 megawatts of power, 2nd-largest power customer for Commonwealth Edison, trailing only Chicago’s O’Hare Airport.
2012: Other large world data centers
For Tulip Telecom, India, in Bangalore
– Currently largest in AP and 3rd largest in world (for now)
– Nearly 1 M sq feet
– Co-built with IBM
China to build 6.2 M sq feet data center by 2016
Amadeus, Erding, Germany
– 1+ billion transactions / day
– 0.3 second response time
– Access to 95% of the world’s airline seats
– 5000+ servers
– Powers over 260 websites in 110 countries for over 100 airlines
– 10 PB of storage
Utah Data Center, US Govt, 1M sq feet
Now….. what about the web giants?
i.e. Apple, Facebook, Google, Amazon, etc?
That’s Big!
Great Technology Wars of 2012 – Future of the Innovation Economy - Fast Company.com
Apple
Here’s what powers iCloud; see Jobs at the WWDC 2011 iCloud announcement (YouTube)
Rendering of Apple's new North Carolina Data Center. Credit: Apple
Apple Data Center FAQ
Maiden, North Carolina: 500K sq ft, USD $1B
This is phase 1 only
Other Apple data centers:
– Cork, Ireland
– Munich, Germany
– Newark, California
– Cupertino, Calif.
Apple Data Center, Newark, California
Purposes for all these data centers:
• iCloud
• Support Apple’s WW install base of devices
• Futures: Move Content Delivery Network in-house?
• Futures: Streaming video?
Under construction: Prineville, Oregon
Facebook’s North Carolina Data Center Goes Live
Lulea, Sweden - 290K sq ft (27K sq meters) by late 2012
Facebook – Prineville, Oregon
Has spent $1B on its data centers
Open Compute Project
http://www.wired.com/wiredenterprise/2011/12/facebook-data-center/all/1
Amazon
http://www.searchenginejournal.com/fathoming-amazon-a-visualization-of-their-success-infographic/36768/
Amazon Web Services
As of 1Q2012, AWS (Amazon S3) stores 905 billion objects and serves 650K requests/sec
Amazon Web Services 1Q12: 450,000 servers
Amazon Perdix Modular Datacenter
Amazon EC2 Cloud, with a 17K core, 240 teraflop cluster, is the 42nd fastest supercomputer in the world
Sources:
http://aws.typepad.com/aws/2012/04/amazon-s3-905-billion-objects-and-650000-requestssecond.html
http://gigaom.com/cloud/how-big-is-amazon-web-services-bigger-than-a-billion/
To understand the modern Internet scale data center……
Let’s study the creators of this new paradigm
Google originated and continues to drive much of this style of technology
What is Google? Google is not a search engine
Google is a real-time “Data Factory” ecosystem
– De facto organizer of all human internet data
– Provides worldwide Patterns of Life data
• Search, analytics, etc as processing
• Interactive maps as visualization
– Android as ingest / output devices
• Motorola Wireless acquisition $12B
– Supporting businesses and ecosystem roles:
• Google+, Play, Shop, Books, Gmail, Docs
• Voice recognition software
The history of search engines: http://www.wordstream.com/articles/internet-search-engines-history
Apple
– Apple bought 12 PB for iTunes, iCloud
– iPod = successful because of iTunes ecosystem
– iPhone = successful because of App Store ecosystem
Facebook ecosystem
– Patterns of life data on over 900 million users worldwide
– Storage size of their Hadoop cluster: 30 PB
Amazon Web Services ecosystem
– Building 4 new modular datacenters: Oregon + Ireland
– http://www.datacenterknowledge.com/archives/2011/03/28/amazons-cloud-goes-modular-in-oregon/
– http://www.slideshare.net/AmazonWebServices/best-practices-in-architecting-for-the-cloud-webinar-jinesh-varia
eBay ecosystem
– 2009: Analyzes 50PB of data a day, over 8 billion URL requests per day
Bottom line: ecosystem is no longer optional, hasn’t been for some time
Internet scale data centers… are “Data Factories with ecosystem”
Google has already gone through three major end-to-end transformations
Google has 3 ages in terms of managing data:
Batch: Indexes calculated every month (2003)
– Crawled the web 1x month. Built a search index, answered queries. Largely read-only, pretty easy to scale. This is still how most people picture Google working
Warehouse: the datacenter as one huge computer (2005)
– Things move faster. The Internet has happened: pervasive, high speed, interactive.
– Building their own datacenters, more sophisticated at every level
– Iconic systems like BigTable in production
– At this time Google realized they were building something qualitatively different than anything before, something we now think of as cloud computing. Amazon's EC2 launched in 2006
Instant: make it all real-time (2010)
– Google's Colossus makes search real-time by dumping MapReduce
– 3 billion writes and 20 billion read transactions daily
– Reflects the shift to mobile devices and computing
Sources:
http://www.google.com/insidesearch/features/instant/about.html
http://highscalability.com/blog/2011/8/29/the-three-ages-of-google-batch-warehouse-instant.html
http://highscalability.com/blog/2010/9/11/googles-colossus-makes-search-real-time-by-dumping-mapreduce.html
The history of search engines: http://www.wordstream.com/articles/internet-search-engines-history
Why Google Instant? It was part of the smartphone explosion of value across Internet….
In 2011, every 5 minutes = 250 hours of YouTube video uploads
You’ve noticed Google Instant:
Architectural Guiding Principles
For Internet Scale Servers in Big Data companies
Let’s examine the infrastructure
Looking for lessons
Hint: what is an Internet workload?
Internet Scale Workload Characteristics - 1
Embarrassingly parallel Internet workload
– Immense data sets, but relatively independent records being processed
• Example: billions of web pages, billions of log / cookie / click entries
– Web requests from different users essentially independent of each other
• Creating natural units of data partitioning and concurrency
• Lends itself well to cluster-level scheduling / load-balancing
– Independence = peak server performance not important
– What’s important is aggregate throughput of 100,000s of servers
– i.e. very low inter-process communication
Workload Churn
– Well-defined, stable high level API’s (i.e. simple URLs)
– Software release cycles on the order of every couple of weeks
• Means Google’s entire core of search services rewritten in 2 years
– Great for rapid innovation
• Expect significant software re-writes to fix problems on an ongoing basis
– New products hyper-frequently emerge
• Often with workload-altering characteristics, example = YouTube
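The "embarrassingly parallel" property on this slide can be illustrated with a minimal Python sketch. It assumes a hypothetical click log held in memory and uses a thread pool to stand in for a cluster; the function names and record layout are illustrative only, not taken from any of the systems discussed.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_workers):
    """Split records into num_workers independent shards (round-robin)."""
    return [records[i::num_workers] for i in range(num_workers)]


def count_clicks(shard):
    """Process one shard without looking at any other shard."""
    return sum(1 for entry in shard if entry["event"] == "click")


def aggregate_clicks(records, num_workers=4):
    """Fan the shards out to workers, then add up the partial results."""
    shards = partition(records, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_clicks, shards))
    return sum(partial_counts)


# Hypothetical log: every third user clicked, the rest only viewed.
log = [{"user": u, "event": "click" if u % 3 == 0 else "view"} for u in range(12)]
print(aggregate_clicks(log))
```

Because no shard needs data from another shard, adding workers raises aggregate throughput without adding inter-process communication, which is the point the slide makes.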
Internet Scale Workload Characteristics - 2
Platform Homogeneity
– Single company owns, has technical capability, runs entire platform end-to-end including an ecosystem
– Most Web applications more homogeneous than traditional IT
– With immense number of independent worldwide users
1% - 2% of all Internet requests fail*
– Users can’t tell the difference between the Internet being down and your system being down
– Hence 99% good enough
*The Data Center as a Computer: Introduction to Warehouse Scale Computing, p.81 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Fault-free operation via application middleware
– Some type of failure every few hours, including software bugs
– All hidden from users by fault-tolerant middleware
– Means hardware, software doesn’t have to be perfect
Immense scale:
– Workload can’t be held within 1 server, or within max size tightly-clustered memory-shared SMP
– Requires clusters of 1000s, 10000s of servers with corresponding PBs storage, network, power, cooling, software
– Scale of compute power also makes possible apps such as Google Maps, Google Translate, Amazon Web Services EC2, Facebook, etc.
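As a rough illustration of how middleware can hide individual failures from users, here is a minimal Python sketch. The replica functions, the `ReplicaDown` exception, and the server names are all hypothetical; real fault-tolerant middleware is far more involved (timeouts, health checks, replication).

```python
class ReplicaDown(Exception):
    """Raised by a hypothetical replica that cannot serve the request."""


def make_replica(name, healthy):
    """Build a stand-in for one server; unhealthy replicas always fail."""
    def serve(request):
        if not healthy:
            raise ReplicaDown(name)
        return f"{name} served {request}"
    return serve


def fault_tolerant_call(replicas, request):
    """Try each replica in turn; the caller sees an error only if all fail."""
    failed = []
    for serve in replicas:
        try:
            return serve(request)
        except ReplicaDown as exc:
            failed.append(str(exc))
    raise RuntimeError(f"all replicas failed: {failed}")


replicas = [
    make_replica("server-a", healthy=False),  # simulated broken machine
    make_replica("server-b", healthy=True),
]
print(fault_tolerant_call(replicas, "GET /search?q=wsc"))
```

The design choice mirrors the slide: individual machines are allowed to be imperfect, and reliability is provided one layer up.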
Server, storage architecture at internet scale
Internet scale server, storage architecture fundamental assumptions:
– Distributed aggregation of data
– Storage functionality is in software on the server
– Time to Market is everything
• Breakage = “OK” if I can insulate that from user
– Affordability is everything
– Use open source software wherever possible
– Expect that something somewhere in infrastructure will always be broken
– Infrastructure is designed top-to-bottom to address this
All other criteria are driven off of these
Storage criteria:
Cost
Extreme:
- Scale
- Parallelism
- Performance
- Real time
- Time to Market
To meet this workload, typical internet-scale software stack 2003 - 2008
[Figure: Google architecture stack. Labels: data center (DC), rack, server hardware, RHEL 2.6.X PAE, interior network (IPv6), exterior network, GFS / GFS II, BigTable, MapReduce, Chubby Lock, GWQ, Google App Engine (Python, Java, C++, Sawzall, other), Google apps (Search, Index, Crawl, Gmail, ...)]
1. Google File System Architecture – GFS II
2. Google Database - Bigtable
3. Google Computation - MapReduce
4. Google Scheduling - GWQ
The OS or HW doesn’t do any of the above
Reliability, redundancy all in the “application stack”
Distributed Execution Overview in typical internet-scale workflow
[Figure: MapReduce execution overview. The user program forks a master and workers. The master assigns map and reduce tasks. Map workers read the input data splits (Split 0, Split 1, Split 2) and write intermediate results locally. Reduce workers do remote reads, sort, and write Output File 0 and Output File 1.]
10s of thousands of servers
Technologies such as Hadoop and MapReduce
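The map, sort, and reduce steps in this execution overview can be sketched in a few lines of single-process Python. This is only a toy word count under the assumption that each string is one input split; it does not use Hadoop or any real MapReduce library, and all function names are illustrative.

```python
from collections import defaultdict


def map_phase(split):
    """Map: emit (word, 1) for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(mapped_outputs):
    """Group intermediate values by key (stands in for remote read + sort)."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(key, values):
    """Reduce: aggregate all values seen for one key."""
    return key, sum(values)


def run_job(splits):
    # In a real cluster each split would be handled by a different map worker.
    mapped = [map_phase(s) for s in splits]
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


splits = ["the data center", "the computer", "data data"]
print(run_job(splits))
```

Each call to `map_phase` depends only on its own split, and each call to `reduce_phase` depends only on its own key, which is what lets the real frameworks spread the work across many servers.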
Internet-scale IT infrastructure
[Figure: internet-scale IT infrastructure. Input from the Internet (your customers) flows through a grid of servers to the end result.]
Each red block is an inexpensive server = plenty of power for its portion of workflow
Warehouse Scale Computer programmer productivity framework example
Hadoop
– Overall name of software stack
HDFS
– Hadoop Distributed File System
MapReduce
– Software compute framework
• Map = queries
• Reduce = aggregates answers
Hive
– Hadoop-based data warehouse
Pig
– Hadoop-based language
HBase
– Non-relational database, fast lookups
Flume
– Populate Hadoop with data
Oozie
– Workflow processing system
Whirr
– Libraries to spin up Hadoop on Amazon EC2, Rackspace, etc.
Avro
– Data serialization
Mahout
– Data mining
Sqoop
– Connectivity to non-Hadoop data stores
BigTop
– Packaging / interop of all Hadoop components
Source: http://wikibon.org/wiki/v/Big_Data:_Hadoop%2C_Business_Analytics_and_Beyond
So the real question is:
If we run these immense scale Internet-style workloads
And:
– The Internet-sized workload is too large for even maximum size tightly-clustered memory-shared SMP
– Therefore, workload runs on clusters of 1000s, 10000s of servers
• With their corresponding PBs storage, network, power, cooling, software
Given this workload, what is the most cost-effective hardware?
We compare many high-end servers vs. thousands of commodity servers
This is the REAL question for Internet Scale Data Centers
TPC-C Benchmark: High-end SMP vs. low-end PC-class server
Low-end server TPC-C is 4x less expensive (from late 2007)
If we exclude storage costs, low-end server advantage jumps to 12x cheaper. This is meaningful.
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, table 3-1, p. 32 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Then, compare Execution Time Parallel Tasks at 3 levels of communication intensity
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 3-1, p.34 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Traditional IT high inter-communication workload = high end SMP has high inter-process overhead
So what would happen if we increased number of nodes 130x?
Internet light intercommunication workload = small performance degradation
Past 8 nodes, little additional penalty for increasing cluster size
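The shape of this comparison can be reproduced with a simple back-of-the-envelope model, sketched here in Python. The model is an assumption made for illustration: data is spread evenly over the nodes, so 1/N of accesses are local; the 100 ns local and 100 microsecond remote latencies are the figures quoted earlier in this deck; the 1 ms of compute per work unit and the access counts for "light", "medium", and "high" are hypothetical.

```python
LOCAL_ACCESS_S = 100e-9      # local (in-server) access latency, from the deck
REMOTE_ACCESS_S = 100e-6     # remote (over the network) access latency, from the deck
COMPUTE_PER_UNIT_S = 1e-3    # assumed pure compute time per work unit


def execution_time(num_nodes, global_accesses):
    """Time for one work unit that makes `global_accesses` data accesses."""
    local_fraction = 1.0 / num_nodes
    avg_access = (LOCAL_ACCESS_S * local_fraction
                  + REMOTE_ACCESS_S * (1.0 - local_fraction))
    return COMPUTE_PER_UNIT_S + global_accesses * avg_access


# Assumed communication intensities: accesses per work unit.
for accesses, label in [(1, "light"), (10, "medium"), (100, "high")]:
    times = [execution_time(n, accesses) for n in (1, 8, 32)]
    print(label, [f"{t * 1e3:.3f} ms" for t in times])
```

Under these assumptions the light-communication case slows down by less than 10% when leaving a single node, and going from 8 to 32 nodes adds under 1%, while the high-communication case slows down several-fold as soon as most accesses become remote.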
Performance difference: Internet workload high-end vs. low-end server
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 3-2, p.35 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
[Figure: performance of high-end vs. low-end server clusters for an Internet workload]
Note how quickly the performance advantage of high-end SMP diminishes as cluster size increases
At > 2000 cores, 512 low-end servers is within 5% of 16 high-end servers
12x cost savings at 5% difference
Bottom line: whenever an Internet workload is involved (which is too large for any single high-end server cluster), we do need to think differently about it
That’s why commodity class servers are used for light communication Internet-scale workloads
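Taking the two figures quoted above at face value (95% of the performance at one twelfth of the cost), the price-performance advantage works out as in this small Python sketch. The function name is illustrative, and the inputs are simply the slide's numbers, not new measurements.

```python
def price_performance_ratio(relative_performance, relative_cost):
    """Performance per unit cost of one option relative to a baseline."""
    return relative_performance / relative_cost


low_end_performance = 0.95   # within 5% of the high-end cluster (from the slide)
low_end_cost = 1 / 12        # 12x cheaper (from the slide)
advantage = price_performance_ratio(low_end_performance, low_end_cost)
print(f"{advantage:.1f}x performance per dollar")
```

So under these numbers the low-end cluster delivers roughly eleven times more work per dollar for this kind of workload.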
Bottom line for Internet Scale Workloads:
It makes sense to use consumer grade servers, storage
For Internet-style workloads at Internet scale
It makes sense to use high performance tightly coupled high-end servers
If your workload has high inter-process communication (typical of traditional IT applications)
Therefore, internet workload purpose-built server…. with onboard UPS
Huh? Why an onboard UPS? We’ll examine that next
Energy, in the form of a UPS on each server, is deployed as part of the strategy to address the biggest data center construction costs
Much more than power outage protection. The goal is to support temporary > 100% power provisioning in the data center
To ride through renewable energy lulls (lack of wind, lack of solar)
Credit for these ideas: Google talk by Luiz André Barroso, June 2011, at the 2011 Federated Computing Research Conference, San Jose, Calif.
Let’s now examine the warehouse-level data center design itself
Ask yourself: What’s the biggest cost-savings element?
Internet Scale data center power components…
Image courtesy of DLB Associates: D. Dyer, “Current trends/challenges in datacenter thermal management - a facilities perspective,” presentation at ITHERM, San Diego, CA, June 1, 2006.
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 4-1, p.40 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Breakdown of data center energy overheads
Image courtesy of ASHRAE
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 5-2, p.49 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Chiller alone is 33% of the cost
UPS alone is 18% of construction cost
Physical cooling and UPS dominate the electrical power cost
Most of the construction cost of an Internet Scale Data Center is Power / Cooling
Facebook’s North Carolina Data Center Goes Live
Facebook: Lulea, Sweden - 290K sq ft (27K sq meters) by late 2012
Facebook – Prineville, Oregon
Has spent $1B on its data centers
Open Compute Project
Reducing power profile reduces construction cost
Wow. Given that fact…..
Whose data centers are most power efficient?
Reducing power profile lowers initial CAPEX SIGNIFICANTLY
Therefore, fundamental Internet Scale Data Center goal is:
Decrease Power Usage Effectiveness (PUE)
PUE = (Total building power consumed) / (IT power consumed)
http://gigaom.com/cloud/whose-data-centers-are-more-efficient-facebooks-or-googles/
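The PUE ratio (total building power over IT power) is simple enough to sketch in a few lines. This is only an illustration: the function name `pue` and the sample kilowatt figures are assumptions made for the example, not measurements from any real facility.

```python
def pue(total_building_kw: float, it_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    if total_building_kw < it_kw:
        raise ValueError("total building power cannot be less than IT power")
    return total_building_kw / it_kw


# Hypothetical facility: 1,000 kW of IT load in a building drawing 1,800 kW.
print(pue(1800.0, 1000.0))
```

A value of exactly 1.0 would mean every watt entering the building reaches the IT equipment, so lower is better.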
Google claims its data centers use 50% less energy than competitors
Power Usage Effectiveness
– PUE = 1.14 means power overhead is only 14%
– Industry average is around 1.8
http://venturebeat.com/2012/03/26/google-data-centers-use-less-energy/
Industry average PUE is about 1.8
http://www.datacenterknowledge.com/archives/2011/05/10/uptime-institute-the-average-pue-is-1-8/
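To see what the two PUE figures on this slide imply, here is a rough sketch that converts a PUE into its overhead percentage and compares total facility power for the same IT load. The function names are illustrative assumptions, and the comparison covers PUE only, not server-level efficiency.

```python
def overhead_fraction(pue_value: float) -> float:
    """Non-IT power (cooling, UPS losses, etc.) as a fraction of IT power."""
    return pue_value - 1.0


def facility_power_ratio(pue_a: float, pue_b: float) -> float:
    """Total facility power of site A relative to site B for the same IT load."""
    return pue_a / pue_b


google_pue, industry_pue = 1.14, 1.8  # figures quoted on the slide
print(f"Overhead at PUE {google_pue}: {overhead_fraction(google_pue):.0%}")
print(f"Overhead at PUE {industry_pue}: {overhead_fraction(industry_pue):.0%}")
print(f"Relative facility power: {facility_power_ratio(google_pue, industry_pue):.0%}")
```

On PUE alone, the lower figure corresponds to roughly a 37% reduction in total facility power for the same IT load.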
Container modular data center: solving the Power Density issue
Image courtesy of DLB Associates: D. Dyer, “Current trends/challenges in datacenter thermal management—a facilities perspective,”presentation at ITHERM, San Diego, CA, June 1, 2006.
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 4-2, p.42, Barroso, Hölzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Without specialized plenums or containerized enclosures, maximum power density is 150-200 W per square foot, due to limits on how much air regular fans can push
Data center can only operate a few minutes without cooling air
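The power density ceiling translates directly into floor space. As a back-of-the-envelope sketch (the 10 MW IT load is a hypothetical figure chosen for illustration, and the function name is an assumption), the raised-floor area needed at the 150-200 W per square foot limit could be estimated like this:

```python
def floor_area_sq_ft(it_load_watts: float, density_w_per_sq_ft: float) -> float:
    """Floor area needed to host an IT load at a given power density."""
    if density_w_per_sq_ft <= 0:
        raise ValueError("power density must be positive")
    return it_load_watts / density_w_per_sq_ft


it_load = 10_000_000  # hypothetical 10 MW IT load
for density in (150, 200):  # W per square foot, the range quoted on the slide
    area = floor_area_sq_ft(it_load, density)
    print(f"{density} W/sq ft -> {area:,.0f} sq ft")
```

Any packaging that raises the achievable density shrinks the building needed for the same load, which is the argument the container slides that follow are making.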
Modular Data Center
Value isn’t just time to delivery / flexibility
It’s also Higher Power density = lower construction cost
http://www.youtube.com/watch?v=zRwPSFpLX8I
That’s why you see such a big modern push on Container Data Centers:
2. QTS METRO DATA CENTER, ATLANTA, 990,000 SF
5. MICROSOFT CHICAGO DATA CENTER, CHICAGO, 700,000 SF
7. PHOENIX ONE, PHOENIX, ARIZ., 538,000 SF
Microsoft’s Chicago Container Data Center
State of the Modular Data Center
CyrusOne 1 million sq ft “Massively Modular” data center under construction in Phoenix, Arizona
I/O Modular Data Center Assembly line
http://www.datacenterknowledge.com/archives/2012/05/17/cyrusone-going-massively-modular-in-phoenix/
http://www.datacenterknowledge.com/archives/2012/02/06/the-state-of-the-modular-data-center/
http://www.datacenterknowledge.com/archives/2012/01/30/inside-ios-modular-data-center-assembly-line/
Mismatch between rapid workload churn and 10+ year data center lifespan means modular data center characteristics open strategic possibilities for new-build data centers
So given all of this, how do I put it all together in a Warehouse Scale Computer?
Google’s Machinery as result of all these factors:
Architectural view of Google server and storage hierarchy
Clusters through the years
“Google” Circa 1997 (google.stanford.edu) Google (circa 1999)
Google Data Center (Circa 2000)
Clusters through the years
Google (new data center 2001)
3 days later
Recent Google Design
• In-house rack design
• PC-class motherboards
• Low-end storage and networking hardware
• Linux
• + in-house software
Container Datacenter
Run container hotter than normal human comfort temperature = big cost savings
Google Container Datacenter
Google: The Dalles, Oregon internet scale data center
Google Data Center – The Dalles, Oregon
Google Data Centers in 2008:
Google Data Center CAPEX worldwide
Capital expenditures on datacenters:
– 1Q12: USD$ 607M
– 2011: USD$ 3.4B
– 2010: USD$ 4.0B
– 2009: USD$ 809M
Each data center between $200M and $600M
The Dalles, Oregon
And that…. is what today’s Internet Scale Data Center looks like
What will a European version of Internet Scale Cloud look like?
Data protection situation still evolving
Europe is Europe
– Languages, culture, currencies
Cloud adoption will be very different country to country
Regardless, interest is Hot, Hot, Hot
– 2012: 73% of companies moving to some sort of cloud
– 2012: 55% moving to a private cloud
I believe Europe will adopt best of what’s already been done elsewhere
– In a uniquely European flavor
http://gigaom.com/cloud/will-there-be-an-amazon-of-europe/
http://gigaom.com/cloud/ec-cloud-plan-addresses-data-protection-problem-sort-of/
http://gigaom.com/cloud/5-things-you-need-to-know-about-cloud-in-europe/
http://gigaom.com/europe/dont-worry-europe-youre-about-to-get-a-new-beginning/
Apply lessons from today to Traditional IT as best possible
Source: Egan Ford, IBM Distinguished Engineer, OpenStack presentation: http://xmission.com/~egan/cloud/
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Read all about it: Google published this information into the public domain in 2009
By Google:
– Luiz André Barroso
– Urs Hölzle
Available to all, free of charge
Download a copy at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Video of Luiz giving one of these lectures: http://inst-tech.engin.umich.edu/leccap/view/cse-dls-08/4903
http://www.barroso.org/
Let’s review our plans
To meld / meet / build readiness
To successfully co-exist / thrive with new generation workloads
Understand the Internet Scale Data Center / workload environment / lessons
– End to end discovery, monitoring, operational automation
– Differentiate between traditional IT and internet-scale workloads
• For these two categories, architect IT systems accordingly
– Essential role of power efficiency on CAPEX for new data center costs
[Figure: $ / server axis across Traditional IT, Data Warehouse, Big Data, Internet scale. Labels: “Traditional IT workloads”, “Internet scale warehouse computer”, “Views new generation as a powerful partner”, “Views the traditional IT as a powerful enabler”]
Understand and innovate using these principles within your environment:
– Be viewed as a powerful partner and enabler of these future directions
– Architect now how you wish your platform, people, and infrastructure to grow along these lines
– Begin evolving and building skills now
Review attached learning points
Today, our users all have many non-traditional IT alternatives
Traditional IT:
Traditional IT platforms
Traditional IT vendors
Non-traditional alternatives:
– The Cloud, the Developing World
What will the effect be on your IT infrastructure, skill sets, and business models?
Other observations
Think larger than technology
Watch the business models, learn and apply
My additional presentation: “Disruptive Innovation in Modern IT World”
– http://www.slideshare.net/johnsing1/a-india-csii2012disruptiveinnovationinthemodernitworldv3plenarypresentation
Keeping up with it all:
– Necessary today: first thing every day, 1 hour of industry study, to keep up
– Then share via your own digital footprint
• A job skill necessity for today’s world
– Social network, personal exploitation of modern smart devices and tools
– See appendix for resources
Endless possibilities!!
I believe you would know better than I where to apply yourself
[Figure: $ / server axis across Traditional IT, Data Warehouse, Big Data, Internet scale. Labels: “Current IT: processor req’d linear with workload”, “Internet scale, warehouse scale computer”, “New gen workloads”, “Exascale datacenter”, “Massive parallelism”, “Flexible system optimization”, “Workload Optimized Systems”]
Summary
Computation is inevitably moving into Warehouse Scale Computing supporting The Cloud
– IT Architects, now and in near future, must be aware of and capable of exploiting Internet Data Center Design and Workload experiences to best design the systems of the future
– When the workload is true internet scale, it will require the physical and economic mechanisms at play in a Warehouse Scale Computer
This session is the author’s research compilation. Great thanks to Google for the seminal work on which this lecture is based.
Download a copy at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Final comments
– While WSCs at one level are simple: a few tens of thousands of servers and a LAN……..
– In reality, building a cost-effective massive warehouse scale computing platform with the necessary reliability, programming productivity, and energy cost effectiveness is as difficult and as exciting / stimulating an opportunity as IT has ever seen.
– The authors of “Intro to Warehouse Scale Computing” hope that this information will stimulate IT staff and scientists to understand this new area
– In the years to come, our collective understanding / efforts will solve and expand the many fascinating problems and benefits to humanity arising from warehouse scale computer systems.
Evaluations and chart downloads are online
http://www.ibmtechu.com/hu
sGE06
Together, let’s build a Smarter Planet
Learning Points
Learning Points - 1
Rising bandwidth worldwide enables web services delivery model (“The Cloud”)
The Cloud runs in massive data centers with well-connected commodity processors, storage
– With homogeneous applications amortized across internet scale user population
These data centers are a different class of large-scale computing machines called:
– Warehouse scale computers (WSC)
– With huge PB data volumes
– Running the easily parallelized, high workload churn, homogeneous platform, fault-tolerant clustered software stack
Understanding this class of machines:
– Important as multi-core processor chip advancement within just a few years
– Will make even modest-sized computing systems
– Approach the behavior of today’s Warehouse Scale Computer
Learning points - 2
Building block of choice for WSC is:
– Commodity server-class processors, consumer/enterprise grade disk drives, Ethernet-based networking
– Because the internet workload characteristics include easy parallelization
Fault-tolerant software stack mitigates continuous failure rate
– Of 10,000s / 100,000s of hardware and software components in WSC
– Programmer software stack provides the tools to cost-effectively, time-effectively program this highly clustered environment
– Redundancy in application-level software eliminates need for redundancy in OS or storage
Software development differs from traditional IT:
– Ample parallelism:
• Internet users have a high degree of independence from each other
– High workload churn:
• Release cycle measured in days and weeks
– Platform homogeneity:
• Single organization owns / has technical capability / runs WSC end to end
– Fault-tolerant software:
• Makes feasible continuous recovery mode operation of servers / components w/o user application impact
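To make the idea of application-level redundancy concrete, here is a deliberately tiny sketch of a read that fails over across replicas, so a single failed component never reaches the user. Everything in it (`ReplicaUnavailable`, `read_with_failover`, the stand-in replica functions) is hypothetical and written for illustration; it is not how any particular WSC software stack is implemented.

```python
class ReplicaUnavailable(Exception):
    """Raised when a replica cannot serve a request (hypothetical error type)."""


def read_with_failover(replicas, key):
    """Try each replica in turn; fail only if every replica fails."""
    for replica in replicas:
        try:
            return replica(key)
        except ReplicaUnavailable:
            continue  # tolerate the fault and move on to the next copy
    raise ReplicaUnavailable(f"all {len(replicas)} replicas failed for {key!r}")


# Stand-in replicas: two with failed disks, one healthy.
def broken_replica(key):
    raise ReplicaUnavailable("disk failed")


def healthy_replica(key):
    return f"value-of-{key}"


print(read_with_failover([broken_replica, broken_replica, healthy_replica], "user:42"))
```

The point of the sketch is that the retry logic lives in the application layer, which is why the hardware underneath can be cheap commodity parts.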
Learning Points – 3 – Economics / Cost
80% of construction cost of data center is due to the amount of power and cooling required
Lowering Power Usage Effectiveness (PUE) is therefore paramount
– To reduce capital expenditure as well as operating expenditure
– Target PUE = 1.2
Modular Container Data Center architecture is popular:
– Mainly because it increases the Power Profile / Power Density
– Which in turn significantly reduces the data center construction cost
– In addition provides flexibility, much faster time to delivery
– Finally, is an important tool to help address mismatch between Internet-scale hyper-workload churn and 10+ year data center lifespan
– Modular Container Data Center architecture has considerable merit for any organization with scale requirements
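The economics in these learning points can be sketched with one line of arithmetic. The model below is an assumption for illustration only: it treats the 80% power/cooling share of construction cost as scaling linearly with provisioned power and the remaining 20% as fixed. The $400M baseline and the 25% power reduction are hypothetical inputs.

```python
def construction_cost(baseline_cost: float, power_reduction: float,
                      power_share: float = 0.80) -> float:
    """Estimate construction cost after cutting provisioned power.

    Assumes the power/cooling share of cost scales linearly with
    provisioned power and the rest of the cost stays fixed.
    """
    if not 0.0 <= power_reduction <= 1.0:
        raise ValueError("power_reduction must be between 0 and 1")
    return baseline_cost * (1.0 - power_share * power_reduction)


# Hypothetical $400M build with provisioned power cut by 25%.
print(f"${construction_cost(400.0, 0.25):.0f}M")
```

Real construction costs will not be this linear, but the sketch shows why the power profile has so much more influence on CAPEX than anything else in the building.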
Learning Points – 4 – Key Challenges / Opportunities Ahead
Rapidly changing workload
– New applications / innovations gain popularity at very fast pace
– Often exhibit disruptive workload characteristics (YouTube example)
– WSC architecture and container data centers are best practices to cost-effectively position / adapt the organization to disruptive business innovations over the 10+ year lifespan of physical WSC structure
Building balanced systems from imbalanced components
– Multi-core processors continue to get faster, become more energy efficient
– Memory, disk storage, networking gear not evolving at same pace in either performance or energy efficiency
– Research / innovation must shift to these subsystems, else further increases in processor power will not be able to provide further WSC improvement
Learning Points – 5
Curbing energy cost
– We must continue to find ways to ensure performance improvements are accompanied by corresponding energy efficiency improvements
– Otherwise scarce future energy budget will increasingly curb growth in computing capabilities
Internet-style workloads
– Future performance gains will be delivered by more cores, not clock speed
– Future large scale systems will continue to increasingly exhibit characteristics of today’s Internet Scale Data Centers and Workloads
Supplemental Resources
Energy Proportionality
Activity profile of 5,000 Google servers over a period of 6 months
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 5-4, p.55, Barroso, Hölzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Majority of time, server utilization is in 10-50% range
Obviously some opportunity to increase processor utilization %
The real question: how much power / cooling did I have to pay for in this data center to run these idling servers?
SPECpower_ssj2008: traditional IT servers consume nearly 70% power even when idling!
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 5-3, p.53, Barroso, Hölzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Server consumes 68% of peak power requirement when idling!
i.e. at 30% utilization, we’re using 75% of max power
A lot!!!! Even though most of server time is < 50% utilization, I’m paying 70% energy cost
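The idle-power numbers above can be approximated with a simple model. This sketch assumes power rises in a straight line from the 68% idle level to 100% at full load, which is a simplification of the measured SPECpower curve; the function names are illustrative.

```python
def power_fraction(utilization: float, idle_fraction: float = 0.68) -> float:
    """Fraction of peak power drawn at a given utilization (0.0 to 1.0).

    Assumes a linear rise from idle power to peak power.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_fraction + (1.0 - idle_fraction) * utilization


def relative_energy_per_unit_work(utilization: float,
                                  idle_fraction: float = 0.68) -> float:
    """Energy per unit of work, relative to running at 100% utilization."""
    if utilization <= 0.0:
        raise ValueError("utilization must be greater than 0")
    return power_fraction(utilization, idle_fraction) / utilization


for u in (0.1, 0.3, 0.5, 1.0):
    print(f"utilization {u:.0%}: power {power_fraction(u):.1%} of peak, "
          f"energy per unit work x{relative_energy_per_unit_work(u):.2f}")
```

Under this linear assumption a server at 30% utilization draws about 78% of peak power, close to the roughly 75% read off the figure.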
Energy Proportionality
2011 servers have gotten a lot better
This gap = excessive data center construction cost
By closing this gap, construction costs of internet scale data center become feasible
Enables warehouse scale computer to be affordable
Credit for all these ideas: Luiz André Barroso of Google, June 2011 talk at the 2011 Federated Computing Research Conference, San Jose, Calif.
But to fully exploit Energy Proportionality…..
Rest of IT infrastructure needs to catch up
Dynamic Range (bigger is better)
– A dynamic range near 1x means the equipment uses nearly the same power whether it’s idle or fully utilized
Servers today: 3x
– Have improved greatly since 2008
But:
– Currently little/no focus on energy proportionality in:
Networking equip: 1.2x
Storage: 1.3x
– Hard to do because we’re spinning the disk drive constantly
– Spinning drives -> flash?
Affects data center construction costs
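A quick way to read the dynamic range figures is to convert them into idle power as a share of peak. The sketch below assumes dynamic range means peak power divided by idle power; that definition and the helper name are assumptions made for illustration.

```python
# Dynamic range figures quoted on the slide (assumed to mean peak / idle power).
DYNAMIC_RANGE = {"servers": 3.0, "storage": 1.3, "networking": 1.2}


def idle_fraction(dynamic_range: float) -> float:
    """Idle power as a fraction of peak power for a given dynamic range."""
    if dynamic_range < 1.0:
        raise ValueError("dynamic range cannot be below 1")
    return 1.0 / dynamic_range


for name, dr in DYNAMIC_RANGE.items():
    print(f"{name}: idles at {idle_fraction(dr):.0%} of peak power")
```

With these figures a server idles at about a third of its peak power, while storage and networking gear idle at roughly 77% and 83% of peak.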
Is Internet Scale = High Performance Computing (HPC)? No
HPC clusters:
1. Recovery model: OK to pause entire job, restart computation from checkpoint
2. Requires significant CPU supporting
– large numbers of synchronized tasks
– which intensely communicate
3. Typically single binary, single job on 1000s of nodes
4. CPUs may run at 100% for days/weeks
5. Building block of choice: high-performance, high-availability, high-end SMPs with high shared memory interconnect bandwidth for intense inter-process communication
Warehouse scale computers:
1. Recovery model: gracefully tolerate large numbers of component faults
– operating in near-continuous recovery state
2. Requires significant CPU but individual tasks less synchronized
– Little or no inter-communications
– Internet workload = ample parallelism
3. Diverse set of applications
– Hyper-pace workload churn / release cycles
4. CPU utilization varies, rarely 90% due to need to reserve capacity for Internet spikes or to cover failed cluster components
5. Building blocks of choice: commodity server-class machines with direct attached disk drives, Ethernet-based interconnect
Resources for your Internet Scale Workload and Data Center journey
How to get ahead and thrive in this new world?
2012: devote 1st hour of day to keeping current
–No longer optional
Establish power-knowledge digital footprint, intelligently sharing what you find
–Don’t email what you find (too much email already)
–Use social networking, social bookmarking, blogs, etc
Become a power user of your smartphone’s ecosystem
Keeping Current Using John Sing’s bookmarks
You may use me as one source of ‘who/what to follow’
– Connect with me: http://www.linkedin.com/in/johnsing
External: my daily list of social bookmarks is:
– http://delicious.com/atsf_arizona
IBM colleagues may see my IBM Intranet webpage:
– http://snjgsa.ibm.com/~singj/
IBM colleagues can see my IBM SONAS intranet web page
– http://snjgsa.ibm.com/~singj/public/sonas_index.html
See video reviewing the Space-Time-Travel example by IBM Distinguished Engineer (SWG) on Big Data – superb insight into Big Data
http://gigaom.com/2011/03/23/jeff-jonas-ibm/
Jeff Jonas/Las Vegas/IBM
Keeping current – More places to connect
Find out what your colleagues are doing
https://www.facebook.com/pages/IBM-NAS/156301741086498
https://www.facebook.com/IBMRedbooks
https://www.facebook.com/peopleforasmarterplanet
http://storagecommunity.org/
https://www.ibm.com/developerworks/mydeveloperworks/blogs/InsideSystemStorage/?lang=en
Learn what’s out there: McKinsey Global Report on Big Data
A seminal work on this fast-evolving, critically important technology.
While 153 pages long - if you understand the content of this presentation and realize that Big Data is insanely important to future IT viability skills....
This paper gives superb, concrete, well-substantiated ideas on what Big Data is being used for today, as we speak, to create the business models of the future
You may download a copy here:
http://www.mckinsey.com/mgi/publications/big_data/index.asp
Forbes Sept 2011: Impact of Social Media on Corporate
http://www.forbes.com/sites/techonomy/2011/09/07/social-power-and-the-coming-corporate-revolution/
http://www.theregister.co.uk/hardware/storage/
http://gigaom.com/
http://www.datacenterknowledge.com/special-report-the-worlds-largest-data-centers/
Develop your list of daily reading and updating…
Here’s one view of world’s largest data center:
Questions:
Do you know where the largest data centers are?
Are we tracking what they do, and why?
We could, we should!
http://www.datacenterknowledge.com/special-report-the-worlds-largest-data-centers/worlds-largest-data-center-350-e-cermak/
http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-web-servers/
All about the Hadoop Distributed File System (open source)
http://hadoop.apache.org/hdfs/
http://www.highscalability.com
Of particular interest is the “Real Life Architectures” tab
Hugely important, keep inspiring ourselves – one of my favorites:
http://www.ted.com/ - superb world class non-profit dedicated to Ideas Worth Spreading in technology, entertainment, design
http://www.slideshare.net
Search on the topic that you’re researching
– Competitors in particular
Find fantastic number of downloadable presentations
– Some better than others, but you’ll quickly learn to sift and find great quality for yourself
Recommend you download and read this very informative IBM book
“Understanding Big Data”
– Published April 2012
– Free download
– Well worth reading to understand components of Big Data, and how to exploit
Part I: The Big Deal about Big Data
– Chapter 1 – What is Big Data? Hint: You’re a Part of it Every Day
– Chapter 2 – Why Big Data is Important
– Chapter 3 – Why IBM for Big Data
Part II: Big Data: From the Technology Perspective
– Chapter 4 – All About Hadoop: The Big Data Lingo Chapter
– Chapter 5 – IBM InfoSphere BigInsights – Analytics for “At Rest” Big Data
– Chapter 6 – IBM InfoSphere Streams – Analytics for “In Motion” Big Data
Download your free copy here: http://public.dhe.ibm.com/common/ssi/ecm/en/iml14297usen/IML14297USEN.PDF
Interested in reading more about Competitive Advantage Analytics-based applications? Easy-to-read pages in this IBM book:
Download it (3.8 MB Acrobat Reader file) at:
ftp://ftp.software.ibm.com/common/ssi/pm/bk/n/imm14055usen/IMM14055USEN.PDF
[Figure: location of customer competitive advantage applications
– User Interface Layer: Dashboards, Mashups, Search, Ad hoc reporting, Spreadsheets
– Analytic Process Layer: Real-time computing and analysis, stream computing, entity analytics, data mining, content management, text analytics, etc.
– Infrastructure layer: Virtualization, central end to end management, control, data proximity, deployment on global virtual file server with geographically dispersed storage
– Security, authorization]
This book defines everything you need to know about Competitive Advantage modern analytics applications. Interesting reading.
If you need a quick overview of modern Analytics IT capability, start on page 14, read through page 48.
IBM Smarter Planet Big Data website
http://www-03.ibm.com/systems/data/flash/smartercomputing/bigdata.html
IBM Software Group Big Data web site
http://www-01.ibm.com/software/data/bigdata/
Learning Points: Ten Big Data Realities
Here are the first ten points that I want you to think about when you’re grokking big data:
Oracle is not big data
Big data is not traditional Relational Database Management System (RDBMS)
Big data is not an Exadata
Big data is not an EMC VMAX
Big data is not highly structured
Big data is not centralized
IT people are not driving big data initiatives
Big data is not a pipe dream – big data initiatives are adding consumer and business value today. Right now. Every second of every minute of every hour of every day.
Big data has meaning to the enterprise
Data is the next source of competitive advantage in the technology business.
Source: David Vellante, 1Q2011
Source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
Learning points: What does Big Data mean to IT infrastructure professionals?
Big data means the amount of data you’re working with today will look trivial within five years.
Huge amounts of data will be kept longer and have way more value than today’s archived data.
Business people will covet a new breed of alpha geeks. You will need new skills around data science, new types of programming, more math and statistics skills and data hackers…lots of data hackers.
You are going to have to develop new techniques to access, secure, move, analyze, process, visualize and enhance data; in near real time.
You will be minimizing data movement wherever possible by moving function to the data instead of data to function. You will be leveraging or inventing specialized capabilities to do certain types of processing, e.g. early recognition of images or content types, so you can do some processing close to the head.
The cloud will become the compute and storage platform for big data which will be populated by mobile devices and social networks.
Metadata management will become increasingly important.
You will have opportunities to separate data from applications and create new data products.
You will need orders of magnitude cheaper infrastructure that emphasizes bandwidth, not IOPs - and data movement with efficient metadata management.
You will realize sooner or later that data and your ability to exploit it is going to change your business, social and personal life; permanently.
Source: David Vellante, 1Q2011
Source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
More information for my IBM colleagues - read transcript of Big Data Overview
http://snjgsa.ibm.com/~singj/public/2011_Big_Data_Modern_Analytics_Tutorial/
More information for my IBM colleagues
Below is the 2012 IBM Research Global Technology Outlook:
– https://w3-connections.ibm.com/wikis/home?lang=en_US#/wiki/Wd99c91e6c090_42d6_bbef_095a93a1bc63
Below is the IBM Research Global Technology Outlook 2011 Overview which includes our first discussions of Big Data:
– http://snjgsa.ibm.com/~singj/public/2011_Prague_IBM_Systems_Conference/STG%20Tech%20Conference%20GTO%202011%20from%20Dr%20Matthias%20Kaiserwerth.ppt
See all the IBM Research Global Technology Outlook 2011 charts at:
– http://w3.ibm.com/articles/workingknowledge/2010/12/res_gto_2011.html
Internet Scale Data Centers
A different scale and set of server, storage and facility economics
Implies where our own strategies, skill sets, and architectures can expand:
– With additional styles of thinking, architecture
– If you think 2012 is growing fast
– Going to take off even more in 2013
– Mucho resources in appendix to these charts
We are all at an inflection point
Growth areas
Traditional areas
Thank You
Appendix
Following are charts from Budapest session xCL01, from Egan Ford, IBM Distinguished Engineer, System X / STG Cloud Strategy
It is his research on this similar topic of Internet Scale Workloads and Data Center Design
Today: two different types of IT
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Internet scale workloads
Transactional IT
Today’s two major IT workload types
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Transactional IT
Internet scale workloads
How to build these two different clouds
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Transactional IT
Internet scale workloads
What You (Consumer) Get with These Clouds:
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Policy-based Clouds and Design-for-fail Clouds are purpose optimized Infrastructure Management solutions
Policy-based Clouds
• Purpose optimized for longer-lived virtual machines managed by Server Administrator
• Centralizes enterprise server virtualization administration tasks
• High degree of flexibility designed to accommodate virtualization of all workloads
• Significant focus on managing availability and QoS for long-lived workloads with a level of isolation
• Characteristics derived from exploiting enterprise class hardware
• Legacy applications
Design-for-fail Clouds
• Purpose optimized for shorter-term virtual machines managed via end-user or automated process
• Decentralized control, embraces eventual consistency, focus on making “good enough” decisions
• High degree of standardization
• Significant focus on ensuring availability of control plane
• Characteristics driven by software
• New applications
Some OpenStack Public Use Cases
• Internap – http://www.internap.com/press-release/internap-announces-world%E2%80%99s-first-commercially-available-openstack-cloud-compute-service/
• Rackspace Cloud Servers, Powered by OpenStack – http://www.rackspace.com/blog/rackspace-cloud-servers-powered-by-openstack-beta/
• Deutsche Telekom – http://www.telekom.com/media/media-kits/104982
• AT&T – http://arstechnica.com/business/news/2012/01/att-joins-openstack-as-it-launches-cloud-for-developers.ars
• MercadoLibre – http://openstack.org/user-stories/mercadolibre-inc/mercadolibre-s-bid-for-cloud-automation/
• NeCTAR – http://nectar.org.au/
• San Diego Supercomputer Center – http://openstack.org/user-stories/sdsc/
OpenStack design tenets focus on delivering essential infrastructure on an available, scalable, elastic control plane
Sources:
– http://www.openstack.org/downloads/openstack-compute-datasheet.pdf
– http://wiki.openstack.org/BasicDesignTenets
Basic Design Tenets
1) Scalability and elasticity are our main goals
2) Any feature that limits our main goals must be optional
3) Everything should be asynchronous. If you can't do something asynchronously, see #2
4) All required components must be horizontally scalable
5) Always use shared nothing architecture (SN) or sharding. If you can't Share nothing/shard, see #2
6) Distribute everything. Especially logic. Move logic to where state naturally exists.
7) Accept eventual consistency and use it where it is appropriate.
8) Test everything. We require tests with submitted code. (We will help you if you need it)
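Tenets 4 and 5 can be illustrated with a minimal sketch. This is not OpenStack code: the shard names, the `shard_for` helper, and the in-memory stores are hypothetical, chosen only to show how a stable hash routes each key to one independent, shared-nothing shard so that capacity grows by adding shards.

```python
import hashlib

# Hypothetical shard names; each shard owns its own state (shared nothing).
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Map a key to one shard with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# One private dictionary per shard; no state is shared between shards.
stores = {name: {} for name in SHARDS}


def put(key: str, value: str) -> None:
    """Write a value to the shard that owns the key."""
    stores[shard_for(key)][key] = value


def get(key: str):
    """Read a value from the shard that owns the key."""
    return stores[shard_for(key)].get(key)


put("instance-42", "ACTIVE")
print(get("instance-42"))
```

A plain modulo over the shard count reshuffles most keys when a shard is added; a real deployment would likely need a scheme such as consistent hashing, which this sketch does not attempt.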
OpenStack Leadership's vision statement
“essential Infrastructure, support platform”
OpenStack
Source: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/
OpenStack is comprised of seven core projects that form a complete IaaS solution
• Compute (Nova), Storage (Cinder), Network (Quantum): provision and manage virtual resources
• Dashboard (Horizon): self-service portal
• Image (Glance): catalog and manage server images
• Identity (Keystone): unified authentication, integrates with existing systems
• Object Storage (Swift): petabytes of secure, reliable object storage
Source: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/
Compute delivers a fully featured, redundant, and scalable cloud computing platform
Architecture
Sources:
– http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/
– http://openstack.org/projects/compute/
Key Capabilities:
• Manage virtualized server resources
– CPU/Memory/Disk/Network Interfaces
• API with rate limiting and authentication
• Distributed and asynchronous architecture
– Massively scalable and highly available system
• Live guest migration
– Move running guests between physical hosts
• Live VM management (Instance)
– Run, reboot, suspend, resize, terminate instances
• Security Groups
• Role Based Access Control (RBAC)
– Ensure security by user, role and project
• Projects & Quotas
• VNC Proxy through web browser
Compute management stack control plane is built on queue and database
Key Capabilities:
• Responsible for providing communications hub and managing data persistence
• RabbitMQ is the default queue, MySQL the default DB
– Documented HA methods
– ZeroMQ implementation available to decentralize the queue
• Single “cell” (1 Queue, 1 Database) typically scales to 500–1000 physical machines
– Cells can be rolled up to support larger deployments
• Communications route through the queue
– API requests are validated and placed on the queue
– Workers listen to queues based on role or role + hostname
– Responses are dispatched back through the queue
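The request flow above can be sketched in a few lines. This is an illustration under stated assumptions, not Nova code: Python's standard `queue` module stands in for RabbitMQ, and the topic names, message fields, and the `api_submit` and `worker_step` helpers are hypothetical.

```python
import queue

# Hypothetical topics: one queue per role, one per role + hostname,
# and one queue that carries responses back to the caller.
topics = {
    "compute": queue.Queue(),
    "compute.host1": queue.Queue(),
    "reply": queue.Queue(),
}


def api_submit(request: dict) -> None:
    """Validate a request, then place it on the matching topic queue."""
    if "action" not in request or "topic" not in request:
        raise ValueError("request needs 'action' and 'topic'")
    if request["topic"] not in topics:
        raise ValueError("unknown topic: %s" % request["topic"])
    topics[request["topic"]].put(request)


def worker_step(topic: str, hostname: str) -> None:
    """Take one message from a topic and send the response via the reply queue."""
    message = topics[topic].get_nowait()
    topics["reply"].put(
        {"handled_by": hostname, "action": message["action"], "status": "ok"}
    )


# Any compute worker may take a message addressed to the role as a whole;
# only host1 listens on the role + hostname topic.
api_submit({"action": "run_instance", "topic": "compute"})
api_submit({"action": "reboot_instance", "topic": "compute.host1"})
worker_step("compute", "host2")
worker_step("compute.host1", "host1")

replies = [topics["reply"].get_nowait() for _ in range(2)]
for reply in replies:
    print(reply)
```

Because the API process and the workers only share the queue, either side can be scaled out or restarted without the other holding a direct connection to it.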
nova-compute manages individual hypervisors and compute nodes
Key Capabilities:
• Responsible for managing all interactions with individual endpoints providing compute resource, e.g.
– Attach iSCSI volume to physical host, map to guest as additional HDD
• Implementations direct to native hypervisor APIs
– Avoids abstraction layers that bring least-common-denominator support
– Enables easier exploitation of hypervisor differentiators
• Service instance runs on every physical compute node, helps to minimize failure domain
• Support for security groups that define firewall rules
• Support for
– KVM
– LXC
– VMware ESX/ESXi (4.1 update 1)
– Xen (XenServer 5.5, Xen Cloud Platform)
– Hyper-V
nova-scheduler allocates virtual resources to physical hardware
Key Capabilities:
• Determines which physical hardware to allocate to a virtual resource
• Default scheduler uses a series of filters to reduce the set of applicable hosts, then uses costing functions to weight the remaining hosts
• Not a focus point for OpenStack
– Default implementation finds first fit
– The shorter the workload lifespan, the less critical the placement decision
• If the default does not work, deployers often have specific requirements and develop custom schedulers
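The filter-then-weight approach described above can be sketched as follows. This is a simplified illustration, not the Nova scheduler: the host inventory, the field names, and the `ram_filter`, `vcpu_filter`, `ram_cost`, and `schedule` functions are all hypothetical.

```python
# Hypothetical host inventory; field names are illustrative only.
hosts = [
    {"name": "host1", "free_ram_mb": 2048, "free_vcpus": 2},
    {"name": "host2", "free_ram_mb": 8192, "free_vcpus": 4},
    {"name": "host3", "free_ram_mb": 16384, "free_vcpus": 0},
]


def ram_filter(host, request):
    """Keep hosts with enough free memory for the request."""
    return host["free_ram_mb"] >= request["ram_mb"]


def vcpu_filter(host, request):
    """Keep hosts with enough free virtual CPUs for the request."""
    return host["free_vcpus"] >= request["vcpus"]


def ram_cost(host, request):
    """Lower cost is better: prefer the host with the most RAM left over."""
    return -(host["free_ram_mb"] - request["ram_mb"])


def schedule(hosts, request, filters, cost_fns):
    """Filter out unsuitable hosts, then pick the lowest total cost."""
    candidates = [h for h in hosts if all(f(h, request) for f in filters)]
    if not candidates:
        raise RuntimeError("no valid host found")
    return min(candidates, key=lambda h: sum(c(h, request) for c in cost_fns))


request = {"ram_mb": 4096, "vcpus": 2}
chosen = schedule(hosts, request, [ram_filter, vcpu_filter], [ram_cost])
print(chosen["name"])
```

Keeping filters and cost functions as plain callables is one way a deployer with specific placement requirements could swap in custom logic without touching the selection loop.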
nova-api supports multiple API implementations and is the entry point into the cloud
Key Capabilities:
• APIs supported
– OpenStack Compute API (REST-based); similar to Rackspace APIs
– EC2 API (subset); can be excluded
– Admin API (nova-manage)
• Robust extensions mechanism to add new capabilities
Network automates management of networks and attachments (network connectivity as a service)
Key Capabilities:
• Responsible for managing networks, ports, and attachments on infrastructure for virtual resources
• Create/delete tenant-specific L2 networks
• L3 support (Floating IPs, DHCP, routing)
• Moving to L4 and above in Grizzly
• Attach / Detach host to network
• Similar to dynamic VLAN support
• Support for
– Open vSwitch
– OpenFlow (NEC & Floodlight controllers)
– Cisco Nexus
– Nicira
Architecture
Cinder manages block-based storage, enables persistent storage
Key Capabilities:
• Responsible for managing lifecycle of volumes and exposing for attachment
• Structure is a copy of Compute (Nova), sharing same characteristics and structure in API server, scheduler, etc.
• Enables additional attached persistent block storage to virtual machines
• Support for booting virtual machines from nova-volume backed storage
• Allows multiple volumes to be attached per virtual machine
• Supports the following
– iSCSI
– RADOS block devices (e.g. Ceph distributed file system)
– Sheepdog
– Zadara
Architecture
Identity service offers unified, project-wide identity, token, service catalog, and policy service designed to integrate with existing systems
Key Capabilities:
• Identity service provides auth credential validation and data about Users, Tenants and Roles
• Token service validates and manages tokens used to authenticate requests after initial credential verification
• Catalog service provides an endpoint registry used for endpoint discovery.
• Policy service provides a rule-based authorization engine and the associated rule management interface.
• Each service configured to serve data from pluggable backend
– Key-Value, SQL, PAM, LDAP, Templates
• REST-based APIs
Image service provides basic discovery, registration, and delivery services for virtual disk images
Key Capabilities:
• Think Image Registry, not Image Repository
• REST-based APIs
• Query for information on public and private disk images
• Register new disk images
• Disk images can be stored in and delivered from a variety of stores (e.g. SoNFS, Swift)
• Supported formats
– Raw
– Machine (a.k.a. AMI)
– VHD (Hyper-V)
– VDI (VirtualBox)
– qcow2 (QEMU/KVM)
– VMDK (VMware)
– OVF (VMware, others)
References: http://openstack.org/projects/image-service/
Dashboard enables administrators and users to access and provision cloud-based resources through a self-service portal
Key Capabilities:
• Thin wrapper over APIs, no local state
• Registration pattern for applications to hook into
• Ships with three central dashboards: a “User Dashboard”, a “System Dashboard”, and a “Settings” dashboard
• Out-of-the-box support for all core OpenStack projects
– Nova, Glance, Swift, Quantum
• Anyone can add a new component as a “first-class citizen”
– Follow design and style guide
• Visual and interaction paradigms are maintained throughout.
• Console Access
References: http://horizon.openstack.org/intro.html
OpenStack Resources
• Forums – http://forums.openstack.org/
• Wiki – http://wiki.openstack.org/
• Documentation – http://docs.openstack.org/
• Mailing Lists – http://wiki.openstack.org/MailingLists
• OpenStack Project Management – https://launchpad.net/openstack
• Blogs – http://planet.openstack.org
• Real-time chat room – #openstack and #openstack-dev on irc://freenode.net (443 users currently logged in)
• Rackspace Reference Architectures – http://www.referencearchitecture.org/
• Easy Install – http://www.hastexo.com/resources/docs/installing-openstack-essex-20121-ubuntu-1204-precise-pangolin
IBM Resources/Solutions for OpenStack Available Today
• developerWorks
– https://www.ibm.com/developerworks/mydeveloperworks/wikis/home?lang=en#/wiki/OpenStack
– Google: openstack IBM developerworks
• xCAT (FOSS) for 0-day deployment
– xCAT OpenStack Paper (CATStack)
– Automated qcow2 image creation for Glance
– HW control
– Bare-metal discovery and bring up (firmware, base OS, etc.)
• IBM Intelligent Cluster Solutions (see Matt Ziegler's PPT)
– Preconfigured switches
– Rack and stacked and ready to go
– Lab Services for 0-day
IBM Resources/Solutions for OpenStack Available Today
• All IBM System Software and Tools can coexist with OpenStack
– Director, ASU, lflash, etc.
• SoNAS for shared file (NFS, SMB)
• XIV for block storage (Nova Volume)
• iDPX for scale-out Nova Compute and Swift
• BNT switches for OpenFlow and Quantum
• GPFS for iSCSI/block (Nova Volume) or file.
OpenStack Demo Setup
Node        Public (eth1)   Private (eth0)
os-essex0   10.0.9.10       172.20.249.10
os-essex1   10.0.9.11       172.20.249.11
os-essex2   10.0.9.12       172.20.249.12
os-essex3   10.0.9.13       172.20.249.13
os-essexX   10.0.9.X        172.20.249.X
Private Networks: eth0: 172.20.249/24 vm: 172.20.250/24
Public Networks: eth1: 10.0.9.0/25 vm: 10.0.9.128/25
Control Nodes (HA Active/Passive): compute, network, scheduler, volume, console, glance, api
Compute Nodes (Scale Out): compute, network
[Figure labels: VM; VM Firewall]
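The address plan on this slide can be checked with a short sketch using Python's standard `ipaddress` module. It assumes that "172.20.249/24" and "172.20.250/24" are shorthand for 172.20.249.0/24 and 172.20.250.0/24; the variable names are illustrative and are not part of the demo setup.

```python
import ipaddress

# Assumption: the slide's "172.20.249/24" means 172.20.249.0/24, and so on.
private_eth0 = ipaddress.ip_network("172.20.249.0/24")
private_vm = ipaddress.ip_network("172.20.250.0/24")
public_eth1 = ipaddress.ip_network("10.0.9.0/25")
public_vm = ipaddress.ip_network("10.0.9.128/25")

# The two public /25 networks are the two halves of 10.0.9.0/24.
halves = list(ipaddress.ip_network("10.0.9.0/24").subnets(prefixlen_diff=1))
print(halves == [public_eth1, public_vm])

# Node addresses from the slide for os-essex0 through os-essex3.
for i in range(4):
    public_ip = ipaddress.ip_address("10.0.9.%d" % (10 + i))
    private_ip = ipaddress.ip_address("172.20.249.%d" % (10 + i))
    print("os-essex%d" % i, public_ip in public_eth1, private_ip in private_eth0)

# Size of the public range set aside for virtual machines.
print(public_vm.num_addresses)
```

Splitting the public /24 into two /25 ranges keeps host interfaces and VM addresses in separate, non-overlapping blocks.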
PPT’s and Videos: http://xmission.com/~egan/cloud/