Joe Kaiser Not a Doctor, just A System Engineer

Stacki at the Seattle Scalability Meetup

Open Source Stack Installer

Stacki is a very fast and ultra reliable Linux server provisioning tool … at scale.

With zero prerequisites for taking systems from bare metal to a ping and prompt.

History

Roots in Open Source

Started life as the Rocks Project at UCSD

Started in May ’00.

A 6-month project, going on ~16 years

Roots in the HPC world

What Problem are we trying to Solve?

Problem

OS Provisioning

Disk Configuration

Disk Controller Configuration

Disk Partitioning

Network Configuration

Services configuration

Application Deployment

Life-cycle management of the cluster

Server Provisioning

Problem – Contd. …

Datacenter Provisioning

Server Provisioning

Heterogeneous Hardware

Complex Network Configuration

Bonding

Bridging

VLANs

Combinations of the above

Network Architecture

Datacenter Architecture

[Figure: datacenter architecture. Labels: Frontend; Network 1, Network 2, Network 3, Network 4.]

Challenge

Fast

Correct

Consistent

Repeatable

How We Solve the Problem

From Bare Metal Up

Take complete control of the Stack

Modified CentOS Installer

Parallel package sharing installer

Database to keep persistent data about the System

Command Line to interact with Stacki

Dynamic Kickstart File Generation
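
Dynamic kickstart generation can be pictured as filling a template from per-host data held in the database. The sketch below is only an illustration of that idea, not Stacki's generator: the host record, its field names, and the install URL path are all assumptions made for the example.

```python
from string import Template

# Hypothetical host record, standing in for a row in the frontend database.
HOST = {
    "hostname": "backend-0-0",
    "ip": "10.1.1.10",
    "netmask": "255.255.255.0",
    "gateway": "10.1.1.1",
    "frontend": "10.1.1.1",
    "timezone": "America/Los_Angeles",
}

# A deliberately small kickstart skeleton; the URL path is an assumption.
KICKSTART = Template("""\
install
url --url=http://$frontend/install/
lang en_US.UTF-8
keyboard us
timezone $timezone
network --bootproto=static --ip=$ip --netmask=$netmask --gateway=$gateway --hostname=$hostname
reboot

%packages
@core
%end
""")


def render_kickstart(host):
    """Fill the skeleton with one host's attributes."""
    return KICKSTART.substitute(host)


if __name__ == "__main__":
    print(render_kickstart(HOST))
```

Because the file is rendered per request, changing a host's record changes the next kickstart it receives without editing any file by hand.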

Frontend Services

Services to build backend nodes

DHCP – MAC to IP address Mapping

TFTP – Serve out PXE files, Installation Kernel, and RAM Disk

Apache – Serve Kickstart files

DNS (optional)

Services to access backend nodes

SSH key management

Parallel execution shell
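
The DHCP and TFTP roles above can be sketched as a small generator for an ISC dhcpd-style configuration: one subnet block that points PXE clients at the frontend, plus one host entry per MAC-to-IP mapping. This is a minimal sketch; the function names and sample addresses are made up for the example, and it does not show how Stacki itself writes its DHCP configuration.

```python
def host_entry(name, mac, ip):
    """One MAC-to-IP mapping in dhcpd.conf syntax."""
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {mac.lower()};\n"
        f"    fixed-address {ip};\n"
        "}\n"
    )


def dhcpd_conf(subnet, netmask, frontend_ip, hosts):
    """Subnet block that sends PXE clients to the frontend's TFTP server,
    followed by one host entry per (name, mac, ip) tuple."""
    subnet_block = (
        f"subnet {subnet} netmask {netmask} {{\n"
        f"    next-server {frontend_ip};\n"      # TFTP server to contact
        '    filename "pxelinux.0";\n'           # PXE boot loader to fetch
        "}\n"
    )
    return subnet_block + "".join(host_entry(*h) for h in sorted(hosts))


if __name__ == "__main__":
    sample = [
        ("backend-0-0", "00:11:22:33:44:55", "10.1.1.10"),
        ("backend-0-1", "00:11:22:33:44:56", "10.1.1.11"),
    ]
    print(dhcpd_conf("10.1.1.0", "255.255.255.0", "10.1.1.1", sample))
```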

Stacki Positioning

[Figure: Stacki positioning. Labels: DevOps / Configuration Tool; DHCP / DNS / TFTP; Network; Disk (disk array controller configuration, disk partitioning configuration); OS; in-house developed deployment tools.]


Download and Boot the ISO

Download the ISO from www.stacki.com

It’s 1.5 GB

stacki pallet

Subset of CentOS 6.7

Boot the ISO on the host that will be your frontend

Timezone

Network Configuration

Root Password

Partitioning

Pallet Selection

Summary

Installation

Adding Hosts

Method 1: Discovery

Advantages

Prior knowledge of MAC addresses Not Required

Automatic Sensible Hostname, IP address assignment

Disadvantages

Automatic hostname and IP address assignment (names and addresses are chosen for you)

Complex network configuration has to be done post-installation

Run

# insert-ethers

Discovery
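
Discovery trades control for convenience: each new MAC address seen on the network gets the next hostname and IP address. A minimal sketch of that bookkeeping follows. The naming scheme (prefix-rack-rank), the starting offset, and the class itself are assumptions for illustration, not the behaviour of insert-ethers.

```python
import ipaddress


class Discovery:
    """Hand out hostnames and addresses in the order MACs are first seen."""

    def __init__(self, network, first_offset=10, rack=0, prefix="backend"):
        self.network = ipaddress.ip_network(network)
        self.first_offset = first_offset
        self.rack = rack
        self.prefix = prefix
        self.hosts = {}  # MAC -> (hostname, ip)

    def discover(self, mac):
        mac = mac.lower()
        if mac not in self.hosts:
            rank = len(self.hosts)
            ip = self.network.network_address + self.first_offset + rank
            if ip not in self.network or ip == self.network.broadcast_address:
                raise ValueError("address pool exhausted")
            self.hosts[mac] = (f"{self.prefix}-{self.rack}-{rank}", str(ip))
        return self.hosts[mac]


if __name__ == "__main__":
    d = Discovery("10.1.1.0/24")
    for mac in ["00:11:22:33:44:55", "00:11:22:33:44:56"]:
        print(mac, *d.discover(mac))
```

A host that is seen again keeps its original name and address, which is why the order in which machines are powered on matters with this method.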

Adding Hosts

Method 2: Host Configuration Spreadsheet

Advantages

Complete control of Hostname, IP address, and network assignments

Easy to make changes

Fits very well with existing datacenter management processes.

Lots and lots of Error Checking

Disadvantages

A little tedious the first time around

Requires prior knowledge of

MAC addresses,

IP address assignments

Physical location of machines (Rack & Position)

Host Configuration Spreadsheet
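
The kind of error checking a spreadsheet import makes possible can be sketched before the file ever reaches the frontend. The example below assumes a CSV with NAME, RACK, RANK, IP, and MAC columns; those column names and the checks themselves are illustrative assumptions, not the validation Stacki performs.

```python
import csv
import io
import ipaddress
import re

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def check_hostfile(text):
    """Return a list of human-readable problems found in the CSV text."""
    errors = []
    seen = {"NAME": {}, "IP": {}, "MAC": {}}
    # Data rows start on line 2; line 1 is the header.
    for lineno, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        mac = (row.get("MAC") or "").strip().lower()
        ip = (row.get("IP") or "").strip()
        if not MAC_RE.match(mac):
            errors.append(f"line {lineno}: malformed MAC {mac!r}")
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            errors.append(f"line {lineno}: malformed IP {ip!r}")
        for column, values in seen.items():
            value = (row.get(column) or "").strip().lower()
            if value in values:
                errors.append(
                    f"line {lineno}: duplicate {column} {value} "
                    f"(first seen on line {values[value]})"
                )
            else:
                values[value] = lineno
    return errors


if __name__ == "__main__":
    sample = (
        "NAME,RACK,RANK,IP,MAC\n"
        "backend-0-0,0,0,10.1.1.10,00:11:22:33:44:55\n"
        "backend-0-1,0,1,10.1.1.10,00:11:22:33:44:5g\n"
    )
    for problem in check_hostfile(sample):
        print(problem)
```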

Backend Installation

Save your Host Configuration spreadsheet as a CSV

Import CSV on frontend

# stack load hostfile file=hosts.csv

Tell backend nodes to install on their next PXE boot

# stack set host boot backend action=install

PXE boot all backend nodes

Go!
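
The two frontend commands above are easy to wrap in a script. The sketch below builds exactly those two command lines and, by default, only prints them; the wrapper functions and the dry-run switch are additions for illustration, and nothing beyond the two commands shown on the slide is assumed about the stack CLI.

```python
import subprocess


def backend_install_commands(hostfile="hosts.csv", target="backend"):
    """The two commands from the steps above, as argument lists."""
    return [
        ["stack", "load", "hostfile", f"file={hostfile}"],
        ["stack", "set", "host", "boot", target, "action=install"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("# " + " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(backend_install_commands())
```

After these run on the frontend, PXE booting the backend nodes is still a manual (or IPMI-driven) step.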

BitTorrent-Inspired Package Installation

Stacki
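
Why peer sharing helps at scale can be shown with a toy model. Assume every machine that already holds a package can serve one other machine per round. With the frontend as the only source, the number of rounds grows linearly with the number of nodes; with peers serving each other, the number of holders roughly doubles each round. This is a back-of-the-envelope model under those assumptions, not a description of Stacki's actual transfer protocol.

```python
def rounds_frontend_only(nodes):
    """The frontend serves one node per round."""
    return nodes


def rounds_peer_sharing(nodes):
    """Every holder (frontend included) serves one waiting node per round."""
    holders = 1          # the frontend starts with the package
    waiting = nodes
    rounds = 0
    while waiting > 0:
        served = min(holders, waiting)
        holders += served
        waiting -= served
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, rounds_frontend_only(n), rounds_peer_sharing(n))
```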

Advanced Networking

Host Configuration Spreadsheet

Advanced Network Configuration

Bonded interfaces

VLANs

Bridging

Any combo of the above

Multiple Subnets

Build a single cluster from hosts in multiple subnets

Manage hosts in multiple datacenters
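
To make "any combo of the above" concrete, here is a sketch that renders the interface files a RHEL/CentOS-style network-scripts setup typically uses for a VLAN on top of a bonded pair. The function names, the default bonding mode, and the sample addresses are assumptions for the example; the files Stacki actually generates from the spreadsheet may differ.

```python
def ifcfg(**settings):
    """Render KEY=value lines in the order given."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


def bonded_vlan_files(bond, slaves, vlan_id, ip, netmask, mode="802.3ad"):
    """Return {filename: contents} for a bond, its slaves, and one VLAN."""
    files = {
        f"ifcfg-{bond}": ifcfg(
            DEVICE=bond, TYPE="Bond", BONDING_MASTER="yes",
            BONDING_OPTS=f'"mode={mode} miimon=100"',
            ONBOOT="yes", BOOTPROTO="none",
        )
    }
    for slave in slaves:
        files[f"ifcfg-{slave}"] = ifcfg(
            DEVICE=slave, MASTER=bond, SLAVE="yes",
            ONBOOT="yes", BOOTPROTO="none",
        )
    vlan_dev = f"{bond}.{vlan_id}"
    files[f"ifcfg-{vlan_dev}"] = ifcfg(
        DEVICE=vlan_dev, VLAN="yes", IPADDR=ip, NETMASK=netmask,
        ONBOOT="yes", BOOTPROTO="none",
    )
    return files


if __name__ == "__main__":
    rendered = bonded_vlan_files(
        "bond0", ["eth0", "eth1"], 100, "10.2.0.10", "255.255.255.0"
    )
    for name, body in rendered.items():
        print(f"--- {name}\n{body}")
```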

Disk Configuration

Disk Controller Configuration

Disk Controller Support

LSI MegaRAID controller & derivatives

Intel MegaRAID

Dell MegaRAID

Cisco SAS MegaRAID

Any controller that supports the “storcli” or “megacli” command

HP Smart Storage Controller support

Supports RAID 0,1,5,6,10,50,etc.

Configure Controllers using Spreadsheets

# stack load storage controller

Disk Controller Configuration Spreadsheet
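
A controller spreadsheet lends itself to simple sanity checks before it is loaded. The sketch below assumes one row per physical disk with NAME, ARRAY ID, RAID LEVEL, and SLOT columns, and checks each array against commonly cited minimum disk counts. Both the column layout and the minimums are assumptions for the example; individual controllers may allow different values.

```python
import csv
import io

# Commonly cited minimum disk counts per RAID level (controllers may differ).
MIN_DISKS = {"0": 1, "1": 2, "5": 3, "6": 4, "10": 4, "50": 6}


def check_arrays(text):
    """Group rows into arrays and report arrays with too few disks."""
    arrays = {}  # (host, array id) -> (raid level, [slots])
    errors = []
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["NAME"], row["ARRAY ID"])
        level, slots = arrays.setdefault(key, (row["RAID LEVEL"], []))
        if row["RAID LEVEL"] != level:
            errors.append(f"{key[0]} array {key[1]}: mixed RAID levels")
        slots.append(row["SLOT"])
    for (host, array_id), (level, slots) in arrays.items():
        needed = MIN_DISKS.get(level)
        if needed is None:
            errors.append(f"{host} array {array_id}: unknown RAID {level}")
        elif len(slots) < needed:
            errors.append(
                f"{host} array {array_id}: RAID {level} needs at least "
                f"{needed} disks, got {len(slots)}"
            )
    return errors


if __name__ == "__main__":
    sample = (
        "NAME,ARRAY ID,RAID LEVEL,SLOT\n"
        "backend-0-0,1,1,0\n"
        "backend-0-0,1,1,1\n"
        "backend-0-0,2,5,2\n"
        "backend-0-0,2,5,3\n"
    )
    print(check_arrays(sample))
```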

Disk Partitioning

Sensible Default Disk partitioning

Support for multiple disks

Support for file system options and mount options

Support for Software RAID configuration

Disk Partitioning through spreadsheets

# stack load storage partition

Disk Partition Configuration Spreadsheet
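
Since the installer is kickstart-driven, partition rows map naturally onto kickstart `part` directives. The sketch below shows one possible translation. The column names (DEVICE, MOUNTPOINT, SIZE, TYPE, OPTIONS) and the convention that a SIZE of 0 means "use the remaining space" are assumptions of this example, not a statement of the spreadsheet format Stacki expects.

```python
import csv
import io


def part_line(row):
    """Translate one spreadsheet row into a kickstart 'part' directive."""
    args = [
        "part", row["MOUNTPOINT"],
        f"--fstype={row['TYPE']}",
        f"--ondisk={row['DEVICE']}",
    ]
    size = int(row["SIZE"])
    if size > 0:
        args.append(f"--size={size}")          # size in MiB
    else:
        args += ["--size=1", "--grow"]         # fill the remaining space
    if row["OPTIONS"]:
        args.append(f"--fsoptions={row['OPTIONS']}")  # mount options
    return " ".join(args)


def part_lines(text):
    return [part_line(row) for row in csv.DictReader(io.StringIO(text))]


if __name__ == "__main__":
    sample = (
        "DEVICE,MOUNTPOINT,SIZE,TYPE,OPTIONS\n"
        "sda,/,50000,ext4,\n"
        "sda,swap,8000,swap,\n"
        "sda,/var,0,xfs,noatime\n"
    )
    print("\n".join(part_lines(sample)))
```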

Software Footprint

Controlling Packages and Configuration

Pallets

Carts

Boxes

Distributions

Appliances

Pallets

Software Entity

Contains RPMS

Contains Configuration in the form of XML

Used for installation and configuration of an Application

Can be applied during Frontend installation or after the fact.

Each pallet is functionally equivalent to a YUM repo with extra configuration
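
The YUM-repo half of that comparison can be sketched as a repo stanza pointing at a pallet served from the frontend. The URL layout, the repo naming, and the function below are hypothetical and chosen only for illustration; the XML configuration that makes a pallet more than a repo is not modelled here.

```python
import configparser
import io


def repo_stanza(pallet, version, frontend):
    """Render a .repo file stanza for one pallet (hypothetical URL layout)."""
    repo_id = f"{pallet}-{version}"
    config = configparser.ConfigParser()
    config[repo_id] = {
        "name": f"{pallet} {version}",
        "baseurl": f"http://{frontend}/install/pallets/{pallet}/{version}/",
        "enabled": "1",
        "gpgcheck": "0",
    }
    out = io.StringIO()
    config.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    print(repo_stanza("cloudera", "5", "10.1.1.1"))
```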

Example: Cloudera Pallet

Contains RPMS required to install the Cloudera Distribution of Hadoop

Contains scripts to configure and start CDH

Example: Stacki with Cloudera Pallet

[Figure: tasks needed to stand up a Hadoop cluster, compared for "Without Stacki" and "Stacki w/ Hadoop Pallet". Layer labels: App Config, Site Config, HW Install.]

Check namenodes are empty

Format/start HDFS

Create all directories

Create all metastores

Start services (HBase, Hive, Oozie, Sqoop, Impala, etc.)

Deploy client configuration

Configure database

Setup/assign monitors (activity, services, and host)

Test database connections

Validate/resolve hostnames

Consistent host timezones

No bad kernel versions running

(CDH) version consistency

Java version consistency

Daemons versions consistency

Mgmt Agents versions consistency

Host specification/SSH ports

MUCH MORE …

DHCP Server/Client setup

TFTP/PXE configuration

Server OS installation

Node OS Install

RAID configuration

Boot configuration

System/data disk partitioning

Monitoring system setup and config

Lights Out/IPMI setup

User accounts added and synced

SSH keys on all hosts

Network node configuration

Config Mgmt install and configuration

Route configuration

OS upgrades/updates

Site specific software and configuration

Host specification/SSH ports

Security

Firewall setup

Cluster Mgmt utility

Database install and config

Multiple network config

Package installation

MUCH MORE …

Carts

Site Specific Pallets

Contains site-specific RPMs

Contains site-specific configurations

Structurally and Functionally equivalent to a Pallet

Example: Client Cart

Contains RPMS to install DevOps tools

Contains custom post-install scripts to configure DevOps tools

Contains custom post-install scripts to run DevOps tools to bring system up to requisite configuration.

Boxes

Logical Entity

Loose collection of Pallets and Carts

One-to-Many mapping to Backend Hosts

[Figure: pallets and carts grouped into boxes. Pallets shown: OS Pallet, RedHat Pallet, Cloudera Pallet, Stacki Pallet. Carts shown: PayPal Cart, Ansible Cart. Boxes shown: Default (OS Pallet, Stacki Pallet, PayPal Cart) and Application (RedHat Pallet, Stacki Pallet, PayPal Cart, Cloudera Pallet, Ansible Cart).]
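
The relationship between boxes, pallets, carts, and hosts can be written down as a small data model: a box is a named collection of pallets and carts, and each backend host points at exactly one box, so one box serves many hosts. The box contents below follow one reading of the slide's diagram labels, and the host names and functions are invented for the example.

```python
# Box name -> pallets and carts it contains (one reading of the slide labels).
BOXES = {
    "default": ["OS Pallet", "Stacki Pallet", "PayPal Cart"],
    "application": [
        "RedHat Pallet", "Stacki Pallet", "Cloudera Pallet",
        "PayPal Cart", "Ansible Cart",
    ],
}

# Hypothetical hosts; each host maps to exactly one box.
HOST_BOX = {
    "backend-0-0": "default",
    "backend-0-1": "application",
    "backend-0-2": "application",
}


def software_for(host):
    """Pallets and carts a host installs from, via its box."""
    return BOXES[HOST_BOX[host]]


def hosts_in(box):
    """All hosts mapped to a box (the one-to-many direction)."""
    return sorted(h for h, b in HOST_BOX.items() if b == box)


if __name__ == "__main__":
    print(software_for("backend-0-1"))
    print(hosts_in("application"))
```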

Multiple Distributions

Default Distribution

Based on stripped down CentOS 6.7 or 7.2

Used to build backend nodes

Multiple Distributions on Frontend

◦ E.g., RHEL 6.x based distribution, CentOS 6.7, etc.

Backend Nodes Distribution Mapping

Any Node can be mapped to any distribution

In Conclusion

Production Ready

Deploy large scale Big Data & OpenStack clusters very fast.

Deploy test systems to evaluate multiple applications with very short turn-around times

Deploy several small datacenters-in-a-rack that are shipped out to customer sites.

Try it Out!

Website

www.stacki.com

Source Code

github.com/stackiq/stacki

Google Groups

groups.google.com/forum/#!forum/stacki