LESSONS LEARNED AT REDDIT
Steve Huffman FOWA 2010
Reddit.com
A brief history of reddit
Founded in June 2005
Acquired by Condé Nast October 2007
7.5 million users / month
270 million page views / month
Many mistakes along the way
Lesson 1: Crash!
…and restart.
Daemontools (supervise)
Single greatest improvement to uptime we ever made.
When in doubt, let it die.
Don’t forget to read the logs!
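The crash-and-restart pattern can be sketched in a few lines of Python. This is not how daemontools is implemented; it is a toy loop showing the idea of letting a process die, logging the exit, and starting it again. The `supervise` function and its `max_restarts`, `delay`, and `log` parameters are hypothetical names chosen for this illustration.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.0, log=print):
    """Run cmd, restarting it each time it exits; return the exit codes seen.

    A real supervisor loops forever. max_restarts is a hypothetical cap
    so that this sketch terminates.
    """
    exit_codes = []
    for attempt in range(1, max_restarts + 1):
        proc = subprocess.run(cmd)  # blocks until the child exits
        exit_codes.append(proc.returncode)
        # Restarts hide crashes unless every exit is logged and the logs are read.
        log(f"attempt {attempt}: {cmd!r} exited with code {proc.returncode}")
        time.sleep(delay)  # brief pause before the next restart
    return exit_codes


if __name__ == "__main__":
    # Stand-in "service" that crashes immediately with exit code 1.
    crashing_service = [sys.executable, "-c", "import sys; sys.exit(1)"]
    supervise(crashing_service, max_restarts=3, delay=0.1)
```

The design choice is that the service itself carries no recovery logic: it simply exits on trouble, and the supervisor owns the restart.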
Lesson 2: Separation of services
Often, one->two machines more than doubles performance.
Group similar processes together.
Group similar types of data together.
Better caching.
Less contention for CPU.
Avoid threads. Processes are easier to separate later.
Lesson 3: Open Schema
ID     UPS  DOWNS  TITLE                                   URL
12345  120  34     Boffins Create Zombie Dog!              www.someaussiesite.co.au/dog.html
12346  3    24     Check out my new blog!                  noobspamer.blogspot.com
12347  509  167    Pee in a sink if you’ve ever voted up.  self
Lesson 3: Open Schema
In the early days:
Too much time spent thinking about the database.
Every feature required a schema update.
Schema updates became more painful as we grew.
Maintaining replication was difficult.
Deployment was complex.
Lesson 3: Open Schema
Thing
ID     UPS  DOWNS  TYPE
12345  120  34     Link
12346  3    24     Link

Data
THING_ID  KEY    VALUE
12345     Title  Boffins Create Zombie Dog!
12345     URL    www.someaussiesite.com.au/zombiedog.html
12346     Title  Pee in a sink if you’ve ever voted up.
12346     URL    self
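The thing/data split above can be tried out with a small sketch. SQLite is used here only as a convenient stand-in, since the slides do not name the database; the helper names `create_thing`, `set_attr`, and `get_thing`, and the `thumbnail` attribute, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thing (id INTEGER PRIMARY KEY, ups INTEGER,
                        downs INTEGER, type TEXT);
    CREATE TABLE data (thing_id INTEGER, key TEXT, value TEXT,
                       PRIMARY KEY (thing_id, key));
""")


def set_attr(conn, thing_id, key, value):
    """Add or overwrite one attribute; a new key needs no schema change."""
    conn.execute("INSERT OR REPLACE INTO data VALUES (?, ?, ?)",
                 (thing_id, key, str(value)))


def create_thing(conn, thing_id, type_, **attrs):
    """Insert the fixed thing row, then one data row per attribute."""
    conn.execute("INSERT INTO thing VALUES (?, 0, 0, ?)", (thing_id, type_))
    for key, value in attrs.items():
        set_attr(conn, thing_id, key, value)


def get_thing(conn, thing_id):
    """Load a thing with two single-table queries instead of a join."""
    row = conn.execute("SELECT id, ups, downs, type FROM thing WHERE id = ?",
                       (thing_id,)).fetchone()
    if row is None:
        return None
    thing = dict(zip(("id", "ups", "downs", "type"), row))
    attrs = conn.execute("SELECT key, value FROM data WHERE thing_id = ?",
                         (thing_id,)).fetchall()
    thing.update(attrs)  # merge (key, value) pairs into the result
    return thing


create_thing(conn, 12345, "Link",
             title="Boffins Create Zombie Dog!",
             url="www.someaussiesite.com.au/zombiedog.html")
# A new feature adds a new key, not a new column (hypothetical attribute).
set_attr(conn, 12345, "thumbnail", "zombiedog.png")
```

Nothing in the database stops two code paths from disagreeing about which keys a thing should have, so the application has to enforce that itself.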
Lesson 3: Open Schema
With an open schema:
Faster development
Easier deployment
Maintainable database replication
No joins = easy to distribute
Must be careful to maintain consistency
Lesson 4: Keep it stateless
Goal: any app server can handle any request
App server failure/restart is no big deal
Scaling is straightforward
Caching must be independent from a specific app server.
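A minimal sketch of the stateless goal, assuming session state lives in a cache shared by all app servers. `SharedCache` and `AppServer` are hypothetical classes for illustration; the dict-backed cache only stands in for an external cache service.

```python
class SharedCache:
    """Stand-in for an external cache that every app server can reach."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


class AppServer:
    """Holds no per-user state of its own; all state is in the shared cache."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache

    def handle(self, session_id, action):
        # Load the session from the shared cache, never from local memory.
        session = self.cache.get(session_id) or {"actions": []}
        session["actions"].append(action)
        self.cache.set(session_id, session)
        count = len(session["actions"])
        return f"{self.name} handled {action!r}; session has {count} action(s)"


cache = SharedCache()
app1 = AppServer("app1", cache)
app2 = AppServer("app2", cache)
app1.handle("session-a", "login")
# A different server picks up the same session with no hand-off.
app2.handle("session-a", "vote")
```

Because neither server keeps anything between requests, either one can be restarted or replaced without the user noticing.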
Lesson 5: Memcache everything
Database data
Session data
Rendered pages
Memoizing internal functions
Rate-limiting (user actions, crawlers)
Storing pre-computed listings/pages
Global locking
Memcachedb for persistence
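Two of the uses listed above, memoizing functions and rate-limiting, can be sketched against a cache that offers get, set, add, and incr with expiry. `FakeMemcache` is a hypothetical in-process stand-in written for this example, not a real client library, and `memoize` and `allow` are illustrative names.

```python
import functools
import time


class FakeMemcache:
    """In-process stand-in for a cache with get/set/add/incr and TTLs."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expires_at or None)
        self._clock = clock

    def get(self, key):
        value, expires_at = self._data.get(key, (None, None))
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def add(self, key, value, ttl=None):
        """Store only if the key is absent; return True if stored."""
        if self.get(key) is not None:
            return False
        self.set(key, value, ttl)
        return True

    def incr(self, key):
        """Increment an existing counter, keeping its expiry."""
        value, expires_at = self._data[key]
        self._data[key] = (value + 1, expires_at)
        return value + 1


def memoize(cache, ttl=None):
    """Cache a function's results by name and arguments (None is not cached)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            key = f"memo:{fn.__name__}:{args!r}"
            hit = cache.get(key)
            if hit is not None:
                return hit
            result = fn(*args)
            cache.set(key, result, ttl)
            return result
        return wrapper
    return decorator


def allow(cache, user, limit, window):
    """Permit at most `limit` actions per `window` seconds for each user."""
    key = f"rate:{user}"
    if cache.add(key, 1, ttl=window):  # first action opens a new window
        return True
    return cache.incr(key) <= limit
```

Since the counters and memoized values live in the cache rather than in an app server, every server sees the same limits and the same cached results.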
Lesson 6: Store redundant data
Recipe for slow: keep data normalized until you need it.
If data has multiple presentations, store it multiple times in multiple formats.
Disk and memory are less costly than making your users wait.
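As a rough sketch of storing one piece of data in several presentations, the class below keeps a separate pre-sorted listing for each view and pays the sorting cost at write time. `Listings` and its methods are hypothetical, the scores are ups minus downs from the earlier table, and the creation times are made up for the example.

```python
import bisect


class Listings:
    """Keeps one pre-sorted listing per presentation, updated on write."""

    def __init__(self):
        self.by_score = []  # (-score, link_id): highest score first
        self.by_age = []    # (-created, link_id): newest first

    def add_link(self, link_id, score, created):
        # The same link is stored twice, once per presentation.
        bisect.insort(self.by_score, (-score, link_id))
        bisect.insort(self.by_age, (-created, link_id))

    def top(self, n):
        """Read path: a slice, with no sorting at request time."""
        return [link_id for _, link_id in self.by_score[:n]]

    def new(self, n):
        return [link_id for _, link_id in self.by_age[:n]]


listings = Listings()
listings.add_link(12345, score=120 - 34, created=1)
listings.add_link(12346, score=3 - 24, created=2)
listings.add_link(12347, score=509 - 167, created=3)
print(listings.top(2))  # [12347, 12345]
print(listings.new(2))  # [12347, 12346]
```

The sketch leaves out score changes; a vote would have to update every copy, which is the consistency cost of the redundancy.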
Lesson 7: Work offline
Do the minimum amount of work to end the request.
Everything else can be done offline.
An architecture of queues is simple and easy to scale.
AMQP/RabbitMQ.
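The request/queue/worker split can be sketched with Python's standard `queue` module standing in for an AMQP broker such as RabbitMQ. The function names, the job tuple format, and the placeholder thumbnail string are all assumptions made for this illustration.

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for a message queue between servers
thumbnails = {}       # results written by the offline worker


def handle_submit(link_id, url):
    """Request path: enqueue the slow work and return immediately."""
    jobs.put(("thumbnail", link_id, url))
    return {"status": "accepted", "id": link_id}


def worker():
    """Offline path: drains the queue independently of any request."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: shut the worker down
            jobs.task_done()
            break
        kind, link_id, url = job
        if kind == "thumbnail":
            # Placeholder for real image fetching and resizing.
            thumbnails[link_id] = f"thumb-for-{url}"
        jobs.task_done()


if __name__ == "__main__":
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    print(handle_submit(12345, "www.someaussiesite.co.au/dog.html"))
    jobs.join()     # wait here only so the demo can show the result
    jobs.put(None)  # ask the worker to stop
    t.join()
    print(thumbnails)
```

Adding capacity then means starting more workers on the same queue; the request handler does not change.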
Lesson 7: Work offline
Pre-computing listings
Fetching thumbnails
Detecting cheating
Removing spam
Computing awards
Updating the “search” index
Lesson 7: Work offline
[Architecture diagram; labeled components: Request, App Servers, Cache, Master Databases, Queue, Precomputer, Thumbnailer, Spam, Worker Databases]
THANKS! QUESTIONS?