Open Cloud Testbed: Developing a Testbed for the Research Community Exploring Next-Generation Cloud Platforms
Michael Zink, David Irwin, UMass Amherst; Orran Krieger & Martin Herbordt, Boston University; Miriam Leeser & Peter Desnoyers, Northeastern University




What is MGHPCC?


What is Mass Open Cloud?

• Vision Statement: “To create a self-sustaining at-scale public cloud based on the Open Cloud eXchange model… a marketplace for industry partners as well as a place for researchers and industry to innovate and expose innovation to users.”
• Project Overview and Goals
– At-scale efficient production cloud for broad set of applications
– Create and deploy the OCX model
– Testbed for research, open source developers, companies


Motivation I

● Cloud computing plays an important role in supporting most software we use in our daily lives

● Critical for enabling research into new cloud technologies (see demand for CloudLab and Chameleon)

● Demand for cloud testbeds often higher than available resources


Motivation II

● CISE researchers want to study users who are not in CISE
● MOC provides
○ real users
○ access to real data sets
○ traces of real usage
○ the ability to expose services to end-users (e.g., TTP)
○ access to production services at scale (e.g., NESE)
○ infrastructure and services provided by industry partners


Research "in" the MOC


Motivation III

● CloudLab supports
○ Large community (nationwide) of systems researchers
○ Tools to configure experimental slices (a combination of bare metal nodes and networking resources)
○ Hard isolation from other users/experiments
○ Profiles to describe hardware and software to build a cloud
○ Designed specifically for reproducible research
○ Software stack is open source
○ Federation with other testbeds (e.g., GENI, FABRIC)


● Scientific infrastructure for cloud research

● Four clusters (Utah, Wisconsin, Clemson, and MGHPCC), which together offer 15,000+ cores
○ Clusters differ in focus: storage and networking (using hardware from Cisco, Seagate, and HP), high-memory computing (Dell), and energy-efficient computing (HP).


Open Cloud Testbed

● A testbed for research and experimentation into new cloud platforms

● Combine proven software technologies and reproducibility features with a real production cloud

● Enhanced with programmable hardware (FPGA) capabilities; bump-in-the-wire (BITW); ~30 nodes


OCT Approach


Research "in" the MOC


FPGAs in OCT


Research "in" the MOC


Open Cloud Testbed Concept

What’s new?
● FPGAs as reconfigurable compute and Bump-in-the-Wire (BITW), fully accessible by users
● Make CloudLab dynamically scalable by adding and removing third-party resources
● Transfer new cloud mechanisms to production cloud (MOC)
● Access to storage, data sets, cloud telemetry
● Collaboration with industry (e.g., 180 nodes from Two Sigma)
● Usage of certain parts of CloudLab by users outside the research community


New Hardware

● Original plan: add 10 new nodes to the existing Mass CloudLab cluster
● Two Sigma's donation of ~200 two-year-old servers to MOC made us change the plan:
○ Mass CloudLab (19 additional R630 => 380 additional cores)
○ 3 additional racks of servers (R630, R620, and R720/R730):
■ ~1600 cores
■ Part of CloudLab
■ Part of MOC
■ Can be used by industry and foundations


Challenges

● FPGAs
● Out-of-band management
● Network isolation
● Community buy-in


FPGAs

● 15 in 2020 & 15 in 2022
● Queried potential users of FPGAs (~20 beta users)

○ Cloud and Operating System

○ Middleware

○ FPGA systems

○ FPGA tools

○ Provider applications

○ Tenant applications


FPGAs

● Will most likely start with two models:
○ Xilinx Alveo U280
○ Intel D5005
● Toolchain
● Implications on networking:
○ Rate limit switch


Out-of-band Management

● Whoever controls OBM controls the server
● OBM proxy will manage control
● OBM interfaces with ESI


ESI Approach: long term
● ESI controls access to servers and switches
● CloudLab will have drivers for OBM, switch control, and console for the servers allocated to it
● Scripts for admins to transfer nodes to/from CloudLab
● Use Keylime for attesting nodes provided back to production:
○ MOC
○ NERC
○ HPC Clusters
● Eventually enable stateless CloudLab nodes (save and restore experiments)
● Original plan was to do this; CloudLab not elastic until complete...


ESI Approach: new short term
● Give CloudLab & ESI control software direct access to servers and switches
● Each will have all servers in its inventory; build a simple mechanism that allocates servers used by the other side to a NULL project
● Cons: insecure; credentials for server out-of-band management and switch management have to be shared
● Pros: straightforward; we will be able to have an elastic CloudLab by mid-year; can incrementally add drivers for OBM, console, network...
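
The short-term mechanism can be illustrated with a small sketch. The slides give no implementation details, so everything here is an assumption made for illustration: the `Inventory` class, the `transfer` function, and the server and tenant names are hypothetical, and only the idea of a NULL project comes from the slide. Each side lists every server; a server the other side is using sits in the NULL project, so the local scheduler never hands it out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

NULL_PROJECT = "NULL"  # name taken from the slide; the rest is hypothetical


@dataclass
class Inventory:
    """One side's view of the shared pool (CloudLab or ESI)."""
    name: str
    # server -> project holding it; None means free for this side to allocate
    projects: Dict[str, Optional[str]] = field(default_factory=dict)

    def free_servers(self) -> List[str]:
        return sorted(s for s, p in self.projects.items() if p is None)


def transfer(server: str, src: Inventory, dst: Inventory) -> None:
    """Hand a free server from src to dst.

    src parks the server in its NULL project; dst releases it from its
    NULL project so that its own scheduler can allocate it.
    """
    if src.projects.get(server, NULL_PROJECT) is not None:
        raise ValueError(f"{server} is not free in {src.name}")
    if dst.projects.get(server) != NULL_PROJECT:
        raise ValueError(f"{server} is not parked in {dst.name}")
    src.projects[server] = NULL_PROJECT
    dst.projects[server] = None


# Usage: ESI starts with both servers; CloudLab sees them as parked.
esi = Inventory("ESI", {"node1": None, "node2": "tenant-a"})
cloudlab = Inventory("CloudLab", {"node1": NULL_PROJECT, "node2": NULL_PROJECT})
transfer("node1", esi, cloudlab)
print(cloudlab.free_servers(), esi.projects["node1"])
```

The sketch also shows why the slide lists security as a drawback: nothing stops either side from touching a parked server, because both hold the same management credentials.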


Network Isolation

AL2S
● Guaranteeing QoS in the network is hard
● Options:
○ Traffic shape at the server (tricky to enforce)
○ Rate limit ports at switches (what is the impact on TCP?)
● Overprovisioning helps but does not guarantee isolation
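
Both options above amount to capping the rate at which a sender may inject traffic. The slides do not name a mechanism; the sketch below uses a token bucket, a common technique for shaping and rate limiting, purely as an assumed illustration. The `TokenBucket` class and its parameters are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Token-bucket rate limiter (illustrative, not OCT's mechanism)."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0    # refill rate in bytes per second
        self.capacity = burst_bytes   # largest burst allowed at once
        self.tokens = burst_bytes     # bucket starts full
        self.last = 0.0               # timestamp of the previous check

    def allow(self, size_bytes: float, now: float) -> bool:
        """Return True if a packet of size_bytes may be sent at time now."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


# Usage: 8 kbit/s (1000 bytes/s) with room for one 1500-byte packet.
bucket = TokenBucket(rate_bps=8000, burst_bytes=1500)
for t in (0.0, 0.5, 1.5):
    print(t, bucket.allow(1500, now=t))
```

What the caller does when `allow` returns False is the design choice behind the TCP question on the slide: a limiter that drops the packet looks like congestion loss to TCP, whereas a shaper that queues it adds delay instead.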


Community buy-in

● This is where you come into the picture!
● How can OCT support your systems research?
● We need your feedback!!!
● Good news:
○ CloudLab community can use it from day one
○ MOC community can use it from day one


Core Team

CNS-1925464