Super-sizing YouTube with Python Mike Solomon [email protected]


by Mike Solomon. See more scalability tales at: http://rapd.wordpress.com


this is about scaling a web application

there are a lot of things left out - mostly mistakes and implementation details

this may generate more questions than it answers

my goal is to give you ideas for solving your own problems

Welcome

this is the core of scalability

systems change over time, so will your architecture

impossible to predict the optimal approach

start simple

aim for local maxima

python enables flexibility

Architecture

web boxes do everything

servlets, images, thumbnails, search

shoehorn everything into Apache, MySQL

very simple

this survives longer than you'd think

YouTube's Early Days

Early Web Stack, circa January '06

[Diagram labels: hw load balancer; httpd; mod_python; biz logic; servlets; templates; db objects; thumbnails; search; db master; db replicas]

really small team

we ♥ python

logical separation in code

discipline and honor - not linguistically enforced (don’t waste time writing code to restrict people)*

grown by systematically removing bottlenecks

easy to know when something is a `win`

Early Key Factors in Engineering

user demand can grow 50% in a day

removing one bottleneck can immediately reveal another (usually more heinous)

replace and migrate components as they become problems

good (python) components make this easy

obviously, pick your battles

Running Without Tripping

minimize dependencies*

accept some latency

localize failures - don’t let them spread

you are only down if it looks like you are

applies to both systems and software

Good Components (Hypothetical)

more efficient resource utilization via specialized deployment

balance based on CPU, RAM, network and disk usage patterns

overlay orthogonal loads

disjoint tasks running on the same physical hardware

Balance Machine Resources

move from mod_python to mod_fastcgi

move thumbnails to their own machines

make search a remote service running on separate machines

run transcoder processes on video servers

do more with the same hardware

Migratory Patterns of the Norwegian Blue

Serenity Now

Can you spot where we turned on transcoding processes?

if you have a relational database, it will be abused

difficult to track the true source

series of object proxies for DB-API enable logging

encode a portion of call stack as a query comment* (more about this later)

SQL Shenanigans
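A minimal sketch of the proxy idea, not the code the talk describes. It wraps a DB-API cursor, logs every query, and appends the calling function and line number as a SQL comment. `TaggingCursor`, `load_video_title`, and the `videos` table are hypothetical names; `sqlite3` stands in for MySQL so the example runs on its own.

```python
import sqlite3
import traceback


class TaggingCursor:
    """Proxy for a DB-API cursor: logs each query and tags it with its caller."""

    def __init__(self, cursor, log):
        self._cursor = cursor
        self._log = log

    def execute(self, sql, params=()):
        # extract_stack(limit=2) returns [caller of execute, execute itself].
        caller = traceback.extract_stack(limit=2)[0]
        tagged = "%s /* %s:%s */" % (sql, caller.name, caller.lineno)
        self._log.append(tagged)
        return self._cursor.execute(tagged, params)

    def __getattr__(self, name):
        # fetchone, fetchall, close, ... pass straight through to the real cursor.
        return getattr(self._cursor, name)


def load_video_title(cursor, video_id):
    # Hypothetical application code; it never sees the tagging.
    cursor.execute("SELECT title FROM videos WHERE id = ?", (video_id,))
    row = cursor.fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO videos VALUES (1, 'first upload')")

query_log = []
cursor = TaggingCursor(conn.cursor(), query_log)
title = load_video_title(cursor, 1)
```

The comment travels with the query, so a slow-query log or process list on the database side shows which piece of application code issued it.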

take pressure off of relational db

can save additional resources if your objects require significant computation to set up

memcached makes a good home for this

need good client to make this into a truly useful service ‡

pools and better failure handling

Object Caching
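A sketch of read-through object caching with failure handling, under stated assumptions: `DictCache` and `BrokenCache` are in-process stand-ins for a memcached client (a hypothetical `get`/`set` interface, not a real library), and `cached_fetch` is an illustrative helper. A cache failure is treated as a miss, so the failure stays local.

```python
class DictCache:
    """In-process stand-in for a memcached client (hypothetical interface)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class BrokenCache:
    """Simulates a cache node that is down."""

    def get(self, key):
        raise ConnectionError("cache node down")

    def set(self, key, value):
        raise ConnectionError("cache node down")


def cached_fetch(cache, key, build):
    """Return the cached object, or build it and store it on a miss."""
    try:
        value = cache.get(key)
    except ConnectionError:
        return build()  # cache is down: fall back to the source, stay up
    if value is None:
        value = build()  # expensive setup happens only on a miss
        try:
            cache.set(key, value)
        except ConnectionError:
            pass  # failing to store is not fatal
    return value


build_calls = []


def build_profile():
    # Stands in for a database read plus costly object construction.
    build_calls.append("built")
    return {"username": "mike"}


cache = DictCache()
first = cached_fetch(cache, "user:1", build_profile)
second = cached_fetch(cache, "user:1", build_profile)
fallback = cached_fetch(BrokenCache(), "user:1", build_profile)
```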

fast vs fast enough

strive for machine efficiency - don't obsess

be scientific - collect data and understand it

can yield some surprising results

don't assume code optimization techniques from another language are relevant

just like carpentry, measure twice cut once

Software Optimization
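One way to collect the data, sketched with the standard library profiler. The function names are hypothetical, and the two string-building variants are here only to have something to compare; which one wins is exactly the kind of thing to measure rather than assume.

```python
import cProfile
import io
import pstats


def concat_loop(n):
    out = ""
    for i in range(n):
        out += str(i)
    return out


def join_generator(n):
    return "".join(str(i) for i in range(n))


def handle_request():
    # Hypothetical request handler that exercises both variants.
    concat_loop(20000)
    join_generator(20000)


profiler = cProfile.Profile()
profiler.runcall(handle_request)

# Render the five most expensive entries by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
profile_text = report.getvalue()
```

Printing `profile_text` gives a per-function breakdown to read before deciding what, if anything, to optimize.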

pure python HMAC was 40% of web cpu

write a few lines of C

threaded comments fiasco

overly complex algorithm to compute the display object tree

simplify query, simplify algorithm

Python Optimization
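To illustrate "simplify query, simplify algorithm" for threaded comments: a sketch that assumes one flat query returning `(comment_id, parent_id, text)` rows and builds the display tree in two linear passes. This is not the algorithm from the talk; the row shape and `build_comment_tree` are assumptions.

```python
def build_comment_tree(rows):
    """Build a nested reply tree from a flat list of (id, parent_id, text) rows.

    parent_id is None for top-level comments. Every non-None parent_id is
    assumed to be present in rows.
    """
    nodes = {}
    for comment_id, parent_id, text in rows:
        nodes[comment_id] = {"id": comment_id, "text": text, "replies": []}

    roots = []
    for comment_id, parent_id, text in rows:
        node = nodes[comment_id]
        if parent_id is None:
            roots.append(node)
        else:
            nodes[parent_id]["replies"].append(node)
    return roots


rows = [
    (1, None, "first!"),
    (2, 1, "reply to first"),
    (3, None, "nice video"),
    (4, 2, "reply to reply"),
]
tree = build_comment_tree(rows)
```

Indexing the nodes first means replies can appear before their parents in the result set, so the query needs no special ordering for the tree to come out right.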

psyco - specializing compiler for Python

'hot' functions are psyco-ized

there is a 'context switch' penalty so you need to experiment to see if it helps

previous threaded comments algorithm

-closure +psyco = 400% boost

Python Optimization

pruned all the obvious leaf services

dynamic web requests are one `service`

web service is easy to scale, so it stresses out other resources - probably a DB

DBs are hard(er) to scale

tricks of escalating cleverness‡

eventually, no cards left to play

Reasonable Efficiency

pretty much have to go horizontal

choose your partition plan carefully

understand your data access patterns

what queries do you run most often?

do you have joins?

do you need transactional consistency? why?

does an 'entity' emerge?

Scaling MySQL

entities are 'transactional'

allow joins across properties of an entity

entities are migratory

cross entity is more complicated

weaken guarantees to make it easier

minimize activity by design

Partition By Entity

connection and transaction management

lookup service

query factory

minimalist table abstraction

ORM can be (is?) evil

make common behaviors simple, while leaving some transparency to the actual database

EMD, a TLA not an ORM!

apply this fundamental change to a large and growing site

make it relatively painless with python

multiple inheritance

decorators

AST plugins for validation and testing

Seismic Retrofit
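The talk does not show its AST plugins. As an illustration of the kind of check the `ast` module makes cheap, this sketch flags direct `.execute(...)` calls in application source, on the assumption that such code should go through the table abstraction instead. The rule and `find_raw_execute_calls` are hypothetical.

```python
import ast


def find_raw_execute_calls(source):
    """Return the line numbers of calls shaped like <anything>.execute(...)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Sample application code to validate; it is parsed, never executed.
sample = '''
def good(db_context, video_id):
    return Video.select_by_id(db_context, video_id)

def bad(cursor, video_id):
    cursor.execute("SELECT * FROM videos WHERE id = %s", (video_id,))
'''

raw_calls = find_raw_execute_calls(sample)
```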

all the scale-aware code nicely opaque to application developers

base use cases are painless:

User.select_by_username(db_context, username)

Video.select_by_id(db_context, video_id)

Video.select_by_user_id(db_context, user_id)

Resulting API
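A sketch of how a call of this shape might hide the partitioning, assuming the user is the entity. Only the `Video.select_by_user_id(db_context, user_id)` signature comes from the slides; `LookupService`, `DbContext`, `make_shard`, the schema, and the shard names are hypothetical, and in-memory `sqlite3` databases stand in for MySQL shards.

```python
import sqlite3


class LookupService:
    """Maps an entity id (here a user id) to the shard that owns it."""

    def __init__(self, shards, assignments):
        self._shards = shards            # shard name -> connection
        self._assignments = assignments  # entity id -> shard name

    def connection_for(self, entity_id):
        return self._shards[self._assignments[entity_id]]


class DbContext:
    """Carries the lookup service so application code never picks a shard."""

    def __init__(self, lookup):
        self.lookup = lookup


class Video:
    def __init__(self, video_id, user_id, title):
        self.video_id = video_id
        self.user_id = user_id
        self.title = title

    @classmethod
    def select_by_user_id(cls, db_context, user_id):
        # All rows for one entity live on one shard, so this is one plain query.
        conn = db_context.lookup.connection_for(user_id)
        rows = conn.execute(
            "SELECT id, user_id, title FROM videos WHERE user_id = ? ORDER BY id",
            (user_id,),
        ).fetchall()
        return [cls(*row) for row in rows]


def make_shard(videos):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE videos (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
    )
    conn.executemany("INSERT INTO videos VALUES (?, ?, ?)", videos)
    return conn


shards = {
    "shard_a": make_shard([(1, 10, "cats")]),
    "shard_b": make_shard([(2, 20, "dogs"), (3, 20, "birds")]),
}
db_context = DbContext(LookupService(shards, {10: "shard_a", 20: "shard_b"}))
titles = [v.title for v in Video.select_by_user_id(db_context, 20)]
```

Moving an entity then only means copying its rows and changing one entry in the lookup service; callers of `select_by_user_id` do not change.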

hijack mysql replication to partition on the fly while the live site is running

all DML gets tagged with an entity id

read master binlog and selectively replay it into a set of new mini-masters

update lookup service to point to new resources

Bulk Entity Migration
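The selective replay step can be sketched as a filter over tagged statements. Everything concrete here is an assumption: the `/* entity_id=N */` comment format, the `replay` function, and the use of plain lists in place of a real binlog reader and real mini-masters.

```python
import re

# Assumed tag format; the slides only say DML is tagged with an entity id.
ENTITY_TAG = re.compile(r"/\*\s*entity_id=(\d+)\s*\*/")


def replay(binlog, targets):
    """Route each tagged statement to the mini-master that owns its entity.

    binlog:  iterable of SQL strings, in commit order
    targets: entity id -> list standing in for a mini-master's apply queue
    Returns the statements that carried no tag, for manual review.
    """
    untagged = []
    for statement in binlog:
        match = ENTITY_TAG.search(statement)
        if match is None:
            untagged.append(statement)
            continue
        target = targets.get(int(match.group(1)))
        if target is not None:  # entities not being migrated are skipped
            target.append(statement)
    return untagged


mini_master_1 = []
mini_master_2 = []
targets = {10: mini_master_1, 20: mini_master_2}

binlog = [
    "UPDATE videos SET title = 'cats' WHERE id = 1 /* entity_id=10 */",
    "INSERT INTO videos VALUES (2, 20, 'dogs') /* entity_id=20 */",
    "UPDATE videos SET title = 'fish' WHERE id = 9 /* entity_id=30 */",
    "DELETE FROM sessions WHERE expired = 1",
]
untagged = replay(binlog, targets)
```

Once a mini-master has caught up with the master's binlog, the lookup service entry for its entities can be switched over.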

Recurring Themes

the elegance of simplicity

take reliable open software and customize it

`pythonic veneer`

DIY - filing a ticket for a bugfix doesn’t give me a warm feeling - take matters into your own hands*

Questions?