Zotonic - Make it Faster

ZOTONICMake it fast(er)

Marc Worrell — WhatWebWhat / Maximonster

Cross Functional Amsterdam — 2013, Jan 15th.

Let’s make a website

(Just because I need one)

I have <? PHP ?>

It is on this machine.

Everyone uses it.

So it must be good.

Let’s use it... (and think later)

I use <? PHP ?>

Done!

I use <? PHP ?>

Hurray visitors!

I use <? PHP ?>

Oh no! visitors!

I used <? PHP ?>

•What happened?

• I got mentioned on popular blog

• Too many PHP+Apache processes

•Melt down

I can use <? PHP ?>

•Of course you can

• Use more hardware

• Use caching proxy

• Use xyz and a bit of abc

• Add complexity

• And keep it all running, all the time

I can use ... (fill in)

• Same for RoR, Django, ...

• Problem is not that you can’t scale

• The problem is that you need to scale immediately

What happened?

•Many people followed popular link

• A process per request

• Death by too many processes

• ... doing the same thing!

• Known as the “Slash Dot Effect”

Is this typical?

•Web sites are quite small (less than a million pages)

• Except for a couple of huge ones

• Not visited that much (less than 10 pages per second)

• Unless linked from popular place

• Relative small set of “hot data”

Then we just cache those pages!?

•Modern web, realtime

• Push, Web Sockets, Personalization

• That is open connections

•More xyz and abc needed

• And... cache invalidation

That is why we made Zotonic

With the help of Open Source, a great community, and ... you?

That is why we made Zotonic

Zotonic because ...

• A web server should easily handle the load of 99.9999% of all web sites.

• You shouldn’t be bothering xyz and abc to make your site real time

• The front end developer needs to be happy and in the driver’s seat

• Do more with less hardware (and watts)

Erlang because ...

•Many parallel connections

• Share everything (more later)

•Modern CPUs are multi core, your software should be using them

• Fault tolerance makes a developer’s life easier

• It is stable, very stable

Make it efficient

• Start with a monorail solution

• Fast enough for that 99.9999% of sites

• Rethink the way requests are handled

The stack

'DWDEDVH

+773�VHUYHU

&RQWUROOHUV

7HPSODWHV

6HUYLFHV��$

3,

:HEVRFNHWV

0RGHOV

0LGGOHZDUH�SURFHVVHV

�FDFKLQJ��VHVVLRQV��UHQGHUHU��LPDJH�UHVL]HU��PRGXOH�

PDQDJHU��WUDQVODWLRQ�WDEOHV��

85/�GLVSDWFKHU

Steps for a request

• Accept

• Parse request

• Dispatch (match host, controller)

• Render template (fetch data)

• Serve result

Where is time spent?

• Simple request: 6000/sec

• Lots of content: 10/sec (or less)

• Fetching data and rendering should be optimized

What takes time?

• Fetch data from database

• Simple query round trip is 1 to 10 msec

• Fetch data from a caching server

• Network round trip is 0.5 msec

• Don’t hit the network or the database

What saves time?

• Don’t repeat things that you can do once a long time ago

• Like HTML escaping and content filtering

• Zotonic stores sanitized content

What saves ...?

• Combine similar actions into one

• Especially when they are happening at the same time

• Think of requests, db results, calculations etc. etc.

• Share database connections, protect your database

Life of a request

• Client side

• Server side

• Templates

• In memory caching

Client side

• Let client (and proxies) cache css, javascript, images etc.

• Combine css and javascript requests:http://example.org/lib/bootstrap/css/bootstrap~bootstrap-responsive~bootstrap-base-site~/css/jquery.loadmask~z.growl~z.modal~site~63523081976.css

http://example.org/lib/bootstrap/css/bootstrap~bootstrap-responsive~bootstrap-base-site~/css/jquery.loadmask~z.growl~z.modal~site~63523081976.css












Server side

• File requests are easily cached

• Checks on timestamp, modification dates

• Cache both compressed and un-compressed version

• Still access control checks for content (images, pdfs etc.)

Templates

• Do all the model interactions

• Drive page rendering

• Compiled into Erlang byte code

•Whole and partial caching possible

• Cached results are still dynamic

In memory caching

• Two layers

•Memo cache in process dictionary of request handler

• Central shared cache for whole site: depcache

Memo cache

• In process heap of request handler

•Quick access to often used values

• Resources, ACL checks etc.

• Flushed on writes and when growing too big

• Always enabled when rendering templates

Depcache

• Central cache per site

• Key dependencies for consistency

• Garbage collector thread

•Memo functionality for sharing results between processes

Erlang VM considerations

• Cheap processes

• Expensive data copying on messages

• Binaries have their own heap

• String processing is expensive (as in any language)

Erlang VM and Zotonic

• Big data structure #context{}

• Do most work in a single process

• Prune when messaging

• Don’t pass #context{} when messaging (use interface functions)

•Messaging binaries is ok

Aside: webmachine

• HTTP protocol abstraction

• Great for writing controllers

• Needed some work for Zotonic:

• Dispatch list copying

• Memo of some lookups

• Optimizations (process dictionary removal, combine data structures)

Slam dunk protection

• Happens on startup, change of images, templates, cache flush etc.

• Let individual requests fail but system continue

• Methods:

• Single template compiler

• Single image resize process

• Memo cache (share computation)

Database

• Complaint: “It is SQL, it can’t scale”

• Answer: “Wrong question”

Why PostgreSQL?

• Great stability

• Scales good with multi core

•Mature driver

• Full text indexing

• SQL gives good query support

• Known problems and performance

Why PostgreSQL?

(And we serialize most data into a blob anyway)

NoSQL?

• Tooling (backup, command line)

• Indexing has unproven performance

•Moving target

• Vendor lock in - too big differences

• Unproven disk stores

(No)SQL

• Stop worrying

•Optimize the layers above your data store

• Think how you query your data

• Think how you protect your data

Future

• Use more binaries (exchange MochiWeb?)

• Faster template compilation and startup

• Pluggable databases

• Better resource pooling

• Circuit breakers for stability

• Add options to use stale data

Future: zynamo

• Distributed version of Zotonic

• For better availability

• ... and scale to more data

• Each node is created equal

• Still SQL stores

•Ongoing work

It works

Come and join us!

• See us at http://zotonic.com/ (we have a movie)

• Follow us at @zotonic

• Join the community, we have exciting stuff to do

http://zotonic.com

http://zotonic.com

¿Questions?

Documents

Zotonic - Make it Faster