Upload
marc-worrell
View
358
Download
0
Embed Size (px)
DESCRIPTION
Presentation at the Cross Functional Amsterdam meetup Jan 15th, 2013.Performance optimizations and considerations for the Erlang CMS and Framework Zotonic.
Citation preview
ZOTONICMake it fast(er)
Marc Worrell — WhatWebWhat / Maximonster
Cross Functional Amsterdam — 2013, Jan 15th.
Let’s make a website
(Just because I need one)
I have <? PHP ?>
It is on this machine.
Everyone uses it.
So it must be good.
Let’s use it... (and think later)
I use <? PHP ?>
Done!
I use <? PHP ?>
Hurray visitors!
I use <? PHP ?>
Oh no! visitors!
I used <? PHP ?>
•What happened?
• I got mentioned on popular blog
• Too many PHP+Apache processes
•Melt down
I can use <? PHP ?>
•Of course you can
• Use more hardware
• Use caching proxy
• Use xyz and a bit of abc
• Add complexity
• And keep it all running, all the time
I can use ... (fill in)
• Same for RoR, Django, ...
• Problem is not that you can’t scale
• The problem is that you need to scale immediately
What happened?
•Many people followed popular link
• A process per request
• Death by too many processes
• ... doing the same thing!
• Known as the “Slash Dot Effect”
Is this typical?
•Web sites are quite small (less than a million pages)
• Except for a couple of huge ones
• Not visited that much (less than 10 pages per second)
• Unless linked from popular place
• Relative small set of “hot data”
Then we just cache those pages!?
•Modern web, realtime
• Push, Web Sockets, Personalization
• That is open connections
•More xyz and abc needed
• And... cache invalidation
That is why we made Zotonic
With the help of Open Source, a great community, and ... you?
That is why we made Zotonic
Zotonic because ...
• A web server should easily handle the load of 99.9999% of all web sites.
• You shouldn’t be bothering xyz and abc to make your site real time
• The front end developer needs to be happy and in the driver’s seat
• Do more with less hardware (and watts)
Erlang because ...
•Many parallel connections
• Share everything (more later)
•Modern CPUs are multi core, your software should be using them
• Fault tolerance makes a developer’s life easier
• It is stable, very stable
Make it efficient
• Start with a monorail solution
• Fast enough for that 99.9999% of sites
• Rethink the way requests are handled
The stack
'DWDEDVH
+773�VHUYHU
&RQWUROOHUV
7HPSODWHV
6HUYLFHV���$
3,
:HEVRFNHWV
0RGHOV
0LGGOHZDUH�SURFHVVHV
�FDFKLQJ��VHVVLRQV��UHQGHUHU��LPDJH�UHVL]HU��PRGXOH�
PDQDJHU��WUDQVODWLRQ�WDEOHV������
85/�GLVSDWFKHU
Steps for a request
• Accept
• Parse request
• Dispatch (match host, controller)
• Render template (fetch data)
• Serve result
Where is time spent?
• Simple request: 6000/sec
• Lots of content: 10/sec (or less)
• Fetching data and rendering should be optimized
What takes time?
• Fetch data from database
• Simple query round trip is 1 to 10 msec
• Fetch data from a caching server
• Network round trip is 0.5 msec
• Don’t hit the network or the database
What saves time?
• Don’t repeat things that you can do once a long time ago
• Like HTML escaping and content filtering
• Zotonic stores sanitized content
What saves ...?
• Combine similar actions into one
• Especially when they are happening at the same time
• Think of requests, db results, calculations etc. etc.
• Share database connections, protect your database
Life of a request
• Client side
• Server side
• Templates
• In memory caching
Client side
• Let client (and proxies) cache css, javascript, images etc.
• Combine css and javascript requests:http://example.org/lib/bootstrap/css/bootstrap~bootstrap-responsive~bootstrap-base-site~/css/jquery.loadmask~z.growl~z.modal~site~63523081976.css
Server side
• File requests are easily cached
• Checks on timestamp, modification dates
• Cache both compressed and un-compressed version
• Still access control checks for content (images, pdfs etc.)
Templates
• Do all the model interactions
• Drive page rendering
• Compiled into Erlang byte code
•Whole and partial caching possible
• Cached results are still dynamic
In memory caching
• Two layers
•Memo cache in process dictionary of request handler
• Central shared cache for whole site: depcache
Memo cache
• In process heap of request handler
•Quick access to often used values
• Resources, ACL checks etc.
• Flushed on writes and when growing too big
• Always enabled when rendering templates
Depcache
• Central cache per site
• Key dependencies for consistency
• Garbage collector thread
•Memo functionality for sharing results between processes
Erlang VM considerations
• Cheap processes
• Expensive data copying on messages
• Binaries have their own heap
• String processing is expensive (as in any language)
Erlang VM and Zotonic
• Big data structure #context{}
• Do most work in a single process
• Prune when messaging
• Don’t pass #context{} when messaging (use interface functions)
•Messaging binaries is ok
Aside: webmachine
• HTTP protocol abstraction
• Great for writing controllers
• Needed some work for Zotonic:
• Dispatch list copying
• Memo of some lookups
• Optimizations (process dictionary removal, combine data structures)
Slam dunk protection
• Happens on startup, change of images, templates, cache flush etc.
• Let individual requests fail but system continue
• Methods:
• Single template compiler
• Single image resize process
• Memo cache (share computation)
Database
• Complaint: “It is SQL, it can’t scale”
• Answer: “Wrong question”
Why PostgreSQL?
• Great stability
• Scales good with multi core
•Mature driver
• Full text indexing
• SQL gives good query support
• Known problems and performance
Why PostgreSQL?
(And we serialize most data into a blob anyway)
NoSQL?
• Tooling (backup, command line)
• Indexing has unproven performance
•Moving target
• Vendor lock in - too big differences
• Unproven disk stores
(No)SQL
• Stop worrying
•Optimize the layers above your data store
• Think how you query your data
• Think how you protect your data
Future
• Use more binaries (exchange MochiWeb?)
• Faster template compilation and startup
• Pluggable databases
• Better resource pooling
• Circuit breakers for stability
• Add options to use stale data
Future: zynamo
• Distributed version of Zotonic
• For better availability
• ... and scale to more data
• Each node is created equal
• Still SQL stores
•Ongoing work
It works
Come and join us!
• See us at http://zotonic.com/ (we have a movie)
• Follow us at @zotonic
• Join the community, we have exciting stuff to do
¿Questions?