Building a Massively Scalable Cloud Service from the Grounds Up

DESCRIPTION

Serving developer binaries isn’t trivial. Such binaries are consumed by tools, which create a massive request load. Add to that support for metadata, a REST API, storage quotas, stats, on-demand repo indexes and global HA distribution, and you’ve got yourself a pretty complicated system to run and manage. This talk will show you how Bintray, JFrog’s social binary distribution service, works. We will talk about how system segmentation supports massive loads across data centers with stateless vertical scaling; how Grails applications scale and how we tie together different NoSQL technologies such as CouchDB, MongoDB, ElasticSearch and Redis; how we chose between physical and virtual servers; and how we manage deployments without service interruption.

Building a Massively Scalable Cloud Service from the Grounds Up

Yoav Landman

@yoavlandman

github.com/yoav

cee tee oh @ JFrog

What Frog?

So…

Some Numbers (liftoff + 5 months)

Users      7K
Packages   70K
Requests   1.2 B/Month

Requirements

– Download binaries
– Web Front
– REST API
– Backend services

We know developers

%new_sexy_lang% community

Not our fault! AWS failed again!

Downloads must…

Web application must…

Backend Services must…

Choose your battles...

Non-Func. Requirements

Requirement   RPS   Availability
Download      10K   Always
Interaction   200   Almost always
Services      10    Most of the time
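To put the 10K RPS download target in perspective, the following back-of-the-envelope sketch compares it with the average load implied by the 1.2 B requests per month quoted earlier. It assumes a 30-day month and treats all requests as if they were evenly spread, which real traffic is not, so read the result as a rough order of magnitude only.

```python
# Rough arithmetic only: a 30-day month and perfectly even traffic are assumptions.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

requests_per_month = 1.2e9     # "Requests 1.2 B/Month" from the slides
download_target_rps = 10_000   # "Download 10K" from the requirements table

average_rps = requests_per_month / SECONDS_PER_MONTH
headroom = download_target_rps / average_rps

print(f"average: {average_rps:.0f} RPS, target is {headroom:.1f}x the average")
```

Under these assumptions the average works out to roughly 463 RPS, so the 10K figure leaves about a 21x margin for peaks and growth.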

Download Server

No Servlets here

Deduplication by Checksum

File A: 46b34

File B: a64ff7

/user-a/repo-z/package-y/file-x

/org-c/repo-m/package-n/file-k

/user-m/repo-w/package-t/file-f

Flat blob storage

File A: 46b34

File B: a64ff7

Mapping

/user-m/repo-w/package-t/file-f
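The deduplication idea on these slides can be sketched as a content-addressed store: file bytes are kept once in a flat blob storage keyed by checksum, and a separate mapping points each logical path at a checksum. This is a minimal in-memory illustration, not Bintray's implementation. The `BlobStore` class, the choice of SHA-1, and which path holds which content are all assumptions made for the example.

```python
import hashlib


class BlobStore:
    """Toy in-memory content-addressed store (hypothetical, illustrative only)."""

    def __init__(self):
        self.blobs = {}    # checksum -> file bytes (the flat blob storage)
        self.mapping = {}  # logical path -> checksum (the mapping)

    def put(self, path, data):
        # SHA-1 is an assumption; the slides do not name the checksum algorithm.
        checksum = hashlib.sha1(data).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self.blobs.setdefault(checksum, data)
        self.mapping[path] = checksum
        return checksum

    def get(self, path):
        # Resolve the path to its checksum, then fetch the shared blob.
        return self.blobs[self.mapping[path]]


store = BlobStore()
store.put("/user-a/repo-z/package-y/file-x", b"binary A")
store.put("/org-c/repo-m/package-n/file-k", b"binary A")  # same content, new path
store.put("/user-m/repo-w/package-t/file-f", b"binary B")
```

Three paths are registered, but only two blobs are stored, because the first two uploads share a checksum.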

Web Front

Web Framework

Requirements

– Rapid Application Development
– Flexible schema
– Java background
– Stateless

Why don’t you just use...?

Framework                       Why not?
Angular.js, Ember.js, æж.js     Maturity
Wicket                          State
JSF                             Model
Non-Java                        No Java background

Updated Grails to newer minor

Web Front

Data Model

Remember?

Grails means GORM!

GORM MongoDB plugin

Web Front

Search

2 types of search

– Full-text search
– Structured search

Executive summary

Framework        Why not?
Lucene/Compass   Only embedded, resource guzzler
Solr             Bad Grails integration
Sphinx           No incremental index

vs.

You ask

ElasticSearch answers
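As a rough illustration of the two search types, the sketch below builds Elasticsearch request bodies as plain Python dictionaries: a `match` query for full-text search and a `bool` query with `term` filters for structured search. The query shapes follow the current Elasticsearch Query DSL, which may differ from the version used at the time of the talk. The helper names and the `description`, `owner` and `repo` fields are hypothetical.

```python
def full_text_query(text):
    """Full-text search: analyzed matching against a free-text field."""
    # "description" is a hypothetical field name.
    return {"query": {"match": {"description": text}}}


def structured_query(**fields):
    """Structured search: exact-value filters on known attributes."""
    filters = [{"term": {name: value}} for name, value in fields.items()]
    return {"query": {"bool": {"filter": filters}}}


# Hypothetical usage: the bodies would be sent to an index's _search endpoint.
by_text = full_text_query("json parser")
by_attrs = structured_query(owner="user-a", repo="repo-z")
```

Keeping the two builders separate mirrors the split on the slide: one query type ranks by relevance, the other only includes or excludes documents.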

Additional Services

Indexes, Statistics, Logs

Also, Redis to the resque

Did they just add a 4th NoSQL?!

Additional Services

Documentation

DevOps

IaaS vs. SaaS

Leave it to the Pros

SaaS for Download Service

Component      SaaS
Blob storage   SL objectstore
Mapping        Cloudant

SaaS for Web and Services

Component       SaaS
Model           MongoHQ
Grails          N/A
ElasticSearch   N/A
Redis           N/A

Physical vs. Virtual

Remember this?

Virtualization

Pros
– Cheap
– Elastic
– Volatile

Cons
– Overhead
– Tenant, not owner

Development Environment

Remember?

We are liberal

The Solution

Chef What?

Opscode Chef

The Solution

Vagrant Who?

Vagrant

Development

Ops are part of the DevOps

1. Vagrant boots CentOS on VirtualBox
2. Chef installs all DB and service RPMs from a private YUM repo
3. Profit!

High Availability

(And Locality)

Cluster everything

Remember?

CDN for Download Server

GTD for Web Application

Backup

(and Vendor Lock-Out)

Snapshots and replicas

Monitoring

(Servers, State and Logs)

Prevent this:

Going to Production…

Remember?

The Solution

All together now

Conclusions time

– Define criticality
– Embrace the change
– Plan for scale, but be realistic
– Back up everything!

No, thank you!
