Caching and tuning fun for high scalability
Wim Godden
Cu.be Solutions
Notes about this presentation
This presentation was part of the FrOSCon 2011 program.
It was designed to be presented live and as a result many of the slides may seem odd without spoken explanation.
The live benchmarks at the conference are of course also not part of these slides.
Who am I ?
Wim Godden (@wimgtr)
Owner of Cu.be Solutions (http://cu.be)
PHP developer since 1997
Developer of OpenX
Zend Certified Engineer
Zend Framework Certified Engineer
MySQL Certified Developer
Who are you ?
Developers ?
System/network engineers ?
Managers ?
Caching experience ?
Goals of this tutorial
Everything about caching and tuning
A few techniques :
How-to
How-NOT-to
Increase reliability, performance and scalability
5 visitors/day → 5 million visitors/day
(Don't expect a miracle cure !)
LAMP
Architecture
Our test site
Our base benchmark
Apachebench = useful enough
Result ?
Caching
What is caching ?
x = 5, y = 2
n = 50
Same result
CACHE
select * from article join user on article.user_id = user.id order by created desc limit 10
Doesn't change all the time
Caching goals
Source of information (db, file, webservice, ...) :
Reduce # of requests
Reduce the load
Latency :
Reduce for visitor
Reduce for Webserver load
Network :
Send less data to visitor
Hey, that's frontend !
Theory of caching
[Diagram: a Cache sits in front of the DB. HIT : the data comes from the Cache. MISS : the data comes from the DB and is then stored in the Cache.]
GET /page
$data = get('key')
if ($data == false)
select data from table
$data = returned result
set('key', $data)
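The flow on this slide (ask the cache, fall back to the DB on a miss, then store the result) can be sketched as follows. This is a minimal illustration, not the presenter's code: a plain Python dict stands in for the cache, `query_database` is a hypothetical stand-in for the real query, and a miss is signalled with `None` where the slide uses `false`.

```python
# A plain dict stands in for the cache; a real setup would talk to Memcache.
cache = {}
db_calls = 0  # counts how often the "database" is actually hit


def query_database():
    """Hypothetical stand-in for 'select data from table'."""
    global db_calls
    db_calls += 1
    return ["row 1", "row 2"]


def get_page_data(key):
    data = cache.get(key)        # $data = get('key')
    if data is None:             # MISS: the cache has nothing for this key
        data = query_database()  # go to the source of information
        cache[key] = data        # set('key', $data) for the next request
    return data


first = get_page_data("key")   # MISS: hits the database
second = get_page_data("key")  # HIT: served from the cache
```

After both calls the database has been queried only once; every later request for the same key is a HIT until the entry is removed.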
Caching techniques
#1 : Store entire pages
Company Websites
Blogs
Full pages that don't change
Render → Store in cache → Retrieve from cache
Caching techniques
#2 : Store parts of a page
Most common technique
Usually a small block in a page
Best effect : reused on lots of pages
Caching techniques
#3 : Store SQL queries
SQL query cache :
Limited in size
Resets on every insert/update/delete
Server and connection overhead
Goal :
not to get rid of DB
free up DB resources for more hits !
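One way to store query results outside the database, sketched under assumptions: SQLite from the Python standard library stands in for MySQL, a dict stands in for the cache, and the table contents are made-up sample rows. The cache key is a hash of the SQL text plus its parameters, so the same query with the same arguments maps to the same entry.

```python
import hashlib
import json
import sqlite3

# In-memory SQLite database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("create table article (id integer primary key, title text, created text)")
conn.executemany(
    "insert into article (title, created) values (?, ?)",
    [("First", "2011-08-01"), ("Second", "2011-08-20")],
)

query_cache = {}
executed = 0  # number of queries that really reached the database


def cached_query(sql, params=()):
    global executed
    raw = json.dumps([sql, list(params)]).encode("utf-8")
    key = "sql:" + hashlib.sha256(raw).hexdigest()
    if key in query_cache:
        return query_cache[key]            # HIT: the database is not touched
    executed += 1
    rows = conn.execute(sql, params).fetchall()
    query_cache[key] = rows
    return rows


sql = "select title from article order by created desc limit ?"
latest = cached_query(sql, (10,))
latest_again = cached_query(sql, (10,))
```

Unlike the built-in query cache, entries here are not reset by every write, so the application has to decide when to drop or refresh them.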
Caching techniques
#4 : Store complex processing results
Not just calculations
CPU intensive tasks :
Config file parsing
XML file parsing
Loading CSV in an array
Save resources → more resources available
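As an illustration of caching a parsing result, the sketch below parses an INI-style config file once and reuses the parsed structure afterwards. Keying the entry on the file's modification time is an assumption added here so that an edited file is parsed again; the file contents are sample values.

```python
import configparser
import os
import tempfile

parsed_cache = {}  # (path, mtime) -> parsed settings
parse_count = 0    # how often the file was really parsed


def load_config(path):
    global parse_count
    key = (path, os.path.getmtime(path))   # a changed file gets a new key
    if key not in parsed_cache:
        parse_count += 1
        parser = configparser.ConfigParser()
        parser.read(path)
        parsed_cache[key] = {name: dict(parser[name]) for name in parser.sections()}
    return parsed_cache[key]


# Write a small sample config file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as handle:
    handle.write("[database]\nhost = localhost\nport = 3306\n")
    config_path = handle.name

settings = load_config(config_path)
settings_again = load_config(config_path)  # served from the cache
os.remove(config_path)
```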
Caching techniques
#xx : Your call
Only limited by your imagination !
When you have data, think :
Creation time ?
Modification frequency ?
Retrieval frequency ?
How to find cacheable data
New projects : start from 'cache everything'
Existing projects :
Look at MySQL slow query log
Make a complete query log (don't forget to turn it off !)
Check page loading times
Caching storage - MySQL query cache
Use it
Don't rely on it
Good if you have :
lots of reads
few different queries
Bad if you have :
lots of insert/update/delete
lots of different queries
Caching storage - Disk
Data with few updates : good
Caching SQL queries : preferably not
DON'T use NFS or other network file systems
especially for sessions
high latency
locking issues !
Caching storage - Disk / ramdisk
Overhead : filesystem access
Limited number of files per directory → Subdirectories
Local :
5 Webservers → 5 local caches → Hard to scale
How will you keep them synchronized ? → Don't say NFS or rsync !
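The "subdirectories" remedy for the files-per-directory limit can be sketched like this. The two-level layout derived from a hash of the key is an assumption chosen for illustration; any scheme that spreads files evenly over directories serves the same purpose.

```python
import hashlib
import os
import tempfile


def cache_path(cache_root, key):
    """Spread cache files over two levels of subdirectories."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_root, digest[:2], digest[2:4], digest)


def disk_set(cache_root, key, value):
    path = cache_path(cache_root, key)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(value)


def disk_get(cache_root, key):
    try:
        with open(cache_path(cache_root, key), encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        return None                        # MISS


cache_root = tempfile.mkdtemp()            # local directory, not a network share
disk_set(cache_root, "page:/about", "<html>About us</html>")
hit = disk_get(cache_root, "page:/about")
miss = disk_get(cache_root, "page:/contact")
```

With two hex characters per level there are at most 256 entries in each directory level above the files, which keeps any single directory small. The cache is still local to one webserver, so the synchronization problem named on the slide remains.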
Caching storage - Memcache
Facebook, Twitter, Slashdot, ... need we say more ?
Distributed memory caching system
Multiple machines → 1 big memory-based hash-table
Key-value storage system :
Keys - max. 250 bytes
Values - max. 1 Mbyte
Extremely fast... non-blocking, UDP (!)
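Because keys are limited to 250 bytes, keys built from long strings (full URLs, SQL text) need shortening before they are sent to Memcache. The helper below is a hypothetical sketch, not part of any Memcache client: it keeps short keys readable and replaces long ones, or ones containing whitespace (which Memcache keys may not contain), with a fixed-length hash.

```python
import hashlib

MAX_KEY_BYTES = 250  # key limit from the slide


def safe_key(raw_key):
    """Return a key that fits the limit, hashing it only when necessary."""
    encoded = raw_key.encode("utf-8")
    has_whitespace = any(ch.isspace() for ch in raw_key)
    if len(encoded) <= MAX_KEY_BYTES and not has_whitespace:
        return raw_key                     # short and clean: keep it readable
    return "h:" + hashlib.sha256(encoded).hexdigest()


short = safe_key("user:42")
long_key = safe_key("page:" + "x" * 1000)
spaced = safe_key("select * from article")
```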
Memcache - where to install
Memcache - installation & running it
Installation :
Distribution package
PECL
Windows : binaries
Running :
No config-files
memcached -d -m <memory in MB> -l <listen IP> -p <port>
ex. : memcached -d -m 2048 -l 127.0.0.1 -p 11211
Caching storage - Memcache - some notes
Not fault-tolerant
It's a cache !
Lose session data
Lose shopping cart data
Different libraries :
Original : libmemcache
New : libmemcached (consistent hashing, UDP, binary protocol, ...)
Firewall your Memcache port !
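Consistent hashing, listed above as a libmemcached feature, decides which server holds a key in such a way that adding a server moves only a fraction of the keys. The ring below is a sketch of the general idea, not libmemcached's actual algorithm; the server addresses and the 100 points per server are illustrative assumptions.

```python
import bisect
import hashlib


def _hash(value):
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, points_per_server=100):
        # Each server gets many points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, key):
        # The first point clockwise from the key's hash owns the key.
        index = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]


servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
new_server = "10.0.0.4:11211"
three = HashRing(servers)
four = HashRing(servers + [new_server])

keys = [f"user:{n}" for n in range(1000)]
moved_keys = [key for key in keys if three.server_for(key) != four.server_for(key)]
```

Every key in `moved_keys` now lives on the new server; keys never shuffle between the existing servers, so most of the cache stays warm. With a plain `hash(key) % number_of_servers` scheme, adding a server would remap most keys.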
Memcache in code