Real-world Cloud HPC at Scale, for Production Workloads (BDT212) | AWS re:Invent 2013

"Running high-performance scientific and engineering applications is challenging no matter where you do it. Join IT executives from Hitachi Global Storage Technology, The Aerospace Corporation, Novartis, and Cycle Computing and learn how they have used the AWS cloud to deploy mission-critical HPC workloads. Cycle Computing leads the session on how organizations of any scale can run HPC workloads on AWS. Hitachi Global Storage Technology discusses experiences using the cloud to create next-generation hard drives. The Aerospace Corporation provides perspectives on running MPI and other simulations, and offers insights into considerations like security while running rocket science on the cloud. Novartis Institutes for Biomedical Research talks about a scientific computing environment used to run performance benchmark workloads and large HPC clusters, including a 30,000-core environment for research in the fight against cancer, using the Cancer Genome Atlas (TCGA)."

Text of Real-world Cloud HPC at Scale, for Production Workloads (BDT212) | AWS re:Invent 2013

  • 1. Real-world Cloud HPC at Scale, for Production Workloads Jason A Stowe, Cycle Computing November 15, 2013 2013, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of, Inc.
  • 2. We believe that utility access to HPC accelerates invention
  • 3. Goals for today: See real-world use cases from 3 leading engineering and scientific computing users (Steve Phillpott, CIO, HGST, a Western Digital Company; Bill E. Williams, Director, The Aerospace Corporation; Michael Steeves, Sr. Systems Engineer, Novartis). Understand the motivations, strategies, and lessons learned in running HPC / Big Data workloads in the cloud. See the varying scales and application types that run well, including a 1.21 PetaFLOPS environment.
  • 4. Agenda: Introduction; Steve Phillpott: Journey into Cloud; Bill Williams: Cloud Computing @ Aerospace; Michael Steeves: Accelerating Science; Spot, On-demand, & Other Production uses; Questions and answers
  • 5. Journey to the Cloud. Steve Phillpott, CIO, HGST, a Western Digital Company
  • 6. Cloud & Datacenter; Performance Enterprise. Founded in 2003 through the combination of the hard drive businesses of IBM, the inventor of the hard drive, and HGST, Ltd. PCIe Enterprise SSD (+3 acquisitions); SAS 10K & 15K HDDs. Acquired by Western Digital in 2012. More than 4,200 active worldwide patents. Headquartered in San Jose, California. Approximately 41,000 employees worldwide. Develops innovative, advanced hard disk drives, enterprise-class solid state drives, external storage solutions and services. Ultrastar; Capacity Enterprise; 7200 RPM & CoolSpin HDDs; Ultrastar & MegaScale DC. Delivers intelligent storage devices that tightly integrate hardware and software to maximize solution performance.
  • 7. Zero to Cloud in 6+ Months. By 31 Oct 2013: Cloud eMail (Microsoft Office365, April 2013); Cloud eMail archiving/eDiscovery; External SingleSignOn (off VPN); Cloud File/Collaboration (BOX); Cloud CRM, integrated to save files in BOX; Cloud High Performance Computing (HPC) on Amazon AWS; Cloud Big Data Platform on Amazon AWS
  • 8. Responding to the Changing Business Model. Where is our business model headed? "New Age of Innovation" as a guide: N=1 (focus on individual customer experience); R=G (resources are global). Implications: increase in strategic partnering; need for high level of flexibility; leveraging external expertise. Use of the Cloud/SaaS aligns with Virtual Business Model: variable cost model critically important; lightweight, scalable services; reduced up-front capital spend; accelerated provisioning; pay as you go.
  • 9. Paradigm Shift: Consumerization of IT. "I have better technology at home." Consumer Web: a new paradigm in ease of use and reduced cost. Consumer web has been driven by a series of platforms, and these platforms are household brand names today. When we use these platforms, it continually amazes us how easy, how consistent these platforms work. A new set of services: DRM to iTunes. Yet, our workplace applications are cumbersome, costly, difficult to navigate and require extensive support. (Workday, 2009)
  • 10. The Big Switch: The Box has Disappeared. The Transformation of Computing as we Know it. Physical to Virtual/Digital move: do you really care which computer processed your last Google search? Efficiency: do not waste a CPU cycle or a byte of memory (building a 4-story building and only using the 1st floor). Utility: IT as a Service; plug it in and get it. Where the electricity industry has gone, computing is following. Computing shift is almost invisible to the end-user. DATA is the value to the organization, not the where.
  • 11. Enabling the Virtual Organization. Reframing IT Away From Thinking of The App: Business Intelligence and Analytics; End-to-End Business Processes; Enterprise Data Management; New Computing Platforms; Strategic Outsourcing; Software as a Service (SaaS). New IT Organizational Structures: Support and Align to New Business Model
  • 12. Creating an Innovation Playground: Where to Start and How to Evolve IT Supports Business Strategy Executive Buy-In CEO, CIO, InfoSec, etc Reduce Cap-ex, Optimize DC usage Build Expertise Implement Outcome Defined Knowledge Play Learn Educate Team Involvement Conferences Vendor Briefing Expert Services Best Practices Experiment Team Approach Hands-on approach Understand the value proposition Understand constraints Migrate Migrate dev/test environments Migrate or launch new apps on the cloud Embrace success Showcase cost savings Build an enterprise cloud strategy Learn from each experience Expand accordingly Identify app fit for cloud computing Define new processes Collaborate with other companies Awareness Understanding Transition Commitment
  • 13. Multiple Opportunities to Leverage Amazon Web Services (AWS). AWS: >5x the compute capacity of the next 14 providers combined (Gartner, Aug 2013). Access to massive compute and storage. Billed by the hour: only pay for what is used. HGST Japan Research Lab: using AWS for a higher-performance, lower-cost, faster-deployed solution vs. buying a huge on-site cluster. Develop AWS Competency. Many Opportunities: in-house and commercial HPCs are cloud ready. Provide Computing When Needed: reduce capital investment & risk and increase flexibility. Faster Response to Business Needs: rapid prototyping to pilot new IT capabilities with PO Process; set up users, allocate compute and storage in minutes, load apps and go. AWS provides a great option for disaster recovery for our on-premise clusters and storage.
  • 14. HGST's Amazon HPC Platform. [Figure: Large Scale Molecular Simulation for HDI; number of atoms (1.E+03 to 1.E+07) versus number of cores (0 to 600), showing Cases 1 to 3 with relaxation times from < 1 ns to 5 ns. Labels: Dealing with Basic Molecular Simulation; Case 3: Lube depletion in TAR (2D heat profile); 300,000 atoms; top view (lube molecules spreading onto COC); heat spot in TAR, 36 nm.] Simulation applications: Molecular Dynamics Simulation; Read / Write Magnetics; Electro Magnetic Fields; Mechanical; MAGLAND; CST; Ansys; Commercial LLG; Ansys HFSS. Base HPC Platform: scalable to thousands of instances to support numerous simultaneous simulations. Pre- and Post-Processing Server Farms. New G2 Instances Add Visualization Capabilities.
  • 15. Big Data's 3 Vs. Three Vs of Big Data: Volume, Velocity, Variety. Data sources; data types; applications; data collected; analysis & metadata creation; data acquisition; analysis & action. Trends: from Structured to Unstructured, Semi-Structured & Structured; from Terabytes to Petabytes & Exabytes; from Batch to Real-Time & Streaming. Implications & Opportunities: hardware and software optimization; architectural shifts (scale-out systems, distributed filesystems, tiered storage, Hadoop). Key difference: data structure does not need to be defined before loading. Best pragmatic definition, from Snijders et al.: data sets so large and complex that they become awkward to work with using standard tools and techniques.
  • 16. Data Sources Big Data Platform All raw parametric, logistic, vintage, data Parallelized batch analytics raw extracts Batch Analytics Enriched data Slider Wafer Media Substrate Optimize/Reduce Testing End-to-End Integrated Data . . . SAP/DWs App-Specific Views Failure S