Advanced Performance Optimization of Rails Applications
Serge Smetana
RuPy 2009
www.acunote.com
What Am I Optimizing?
Acunote (www.acunote.com)
Online project management and scrum software
Ruby on Rails application since inception in 2006
~5300 companies
~13000 users
Hosted on Engine Yard
Hosted on Customer's Servers
nginx + mongrel
PostgreSQL
Performance Degradation Over Time
[Chart: request time (on development box), %, by month from April 2008 to July 2008]
Actually happens: O(n^c)
Best case: O(log n)
Solutions?
Throw Some Hardware at it!
Solutions?
Performance Optimization!
What To Optimize?
Development?
Development AND Production
How To Optimize?
Three Rules Of Performance Optimization
1. Measure!
2. Optimize only what's slow!
3. Optimize for the user!
Things To Optimize
Development
- Ruby code
- Rails code
- Database queries
- Alternative Ruby
Production
- Shared filesystems and databases
- Live debugging
- Load balancing
Frontend
- HTTP
- Javascript
- Internet Explorer
Optimizing Ruby: Date Class
What's wrong with Date?
> puts Benchmark.realtime { 1000.times { Time.mktime(2009, 5, 6, 0, 0, 0) } }
0.005
> puts Benchmark.realtime { 1000.times { Date.civil(2009, 5, 6) } }
0.080
16x slower than Time! Why?
%self  total  self  wait  child  calls  name
 7.23   0.66  0.18  0.00   0.48  18601  #reduce
 6.83   0.27  0.17  0.00   0.10   5782  #jd_to_civil
 6.43   0.21  0.16  0.00   0.05  31528  Rational#initialize
 5.62   0.23  0.14  0.00   0.09  18601  Integer#gcd
Optimizing Ruby: Date Class
Fixing Date: Use C, Luke!
Date::Performance gem with Date partially rewritten in C
by Ryan Tomayko (with patches by Alex Dymo in 0.4.7)
> puts Benchmark.realtime { 1000.times { Time.mktime(2009, 5, 6, 0, 0, 0) } }
0.005
> puts Benchmark.realtime { 1000.times { Date.civil(2009, 5, 6) } }
0.080
> require 'date/performance'
puts Benchmark.realtime { 1000.times { Date.civil(2009, 5, 6) } }
0.006
git clone git://github.com/rtomayko/date-performance.git
rake package:build
cd dist && gem install date-performance-0.4.8.gem
Optimizing Ruby: Date Class
Real-world impact of Date::Performance:
Before: 0.95 sec
After: 0.65 sec
1.5x!
Optimizing Ruby: Misc
Use String::<< instead of String::+=
> long_string = "foo" * 100000
> Benchmark.realtime { long_string += "foo" }
0.0003
> Benchmark.realtime { long_string << "foo" }
...
in theory: 75x
in practice: up to 70x
> n = BigDecimal("4.5")
> Benchmark.realtime { 10000.times { n <=> 4.5 } }
0.063
> Benchmark.realtime { 10000.times { n <=> BigDecimal("4.5") } }
0.014
in theory: 4.5x
in practice: 1.15x
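The gap between the two string operations comes from allocation: `+=` builds a new String and copies the old contents on every call, while `<<` appends to the existing object. A minimal sketch for reproducing the comparison follows; the helper names `append_with_plus` and `append_with_shovel` are made up for illustration, and the timings will differ from the ones on the slide.

```ruby
require 'benchmark'

# += allocates a new String each time and copies the old contents into it.
def append_with_plus(base, times)
  s = base.dup
  times.times { s += "foo" }
  s
end

# << appends to the same String object in place.
def append_with_shovel(base, times)
  s = base.dup
  times.times { s << "foo" }
  s
end

base = "foo" * 100_000
plus_time   = Benchmark.realtime { append_with_plus(base, 1_000) }
shovel_time = Benchmark.realtime { append_with_shovel(base, 1_000) }
puts format("+= took %.4f sec, << took %.4f sec", plus_time, shovel_time)
```

Both helpers return the same string, so switching is safe wherever nothing else relies on the original object staying unchanged.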
Optimizing Rails: String Callbacks
What can be wrong with this code?
class Task < ActiveRecord::Base
  before_save "some_check()"
end
...
100.times { Task.create attributes }
Kernel#binding is called to eval() the string callback
That will duplicate your execution context in memory!
More memory taken => More time for GC
Optimizing Rails: String Callbacks
What to do
class Task < ActiveRecord::Base
  before_save :some_check
end
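The difference between the two callback styles can be shown without Rails. The sketch below is a simplified stand-in, not Rails' callback implementation; `TinyModel`, `run_string_callback` and `run_symbol_callback` are hypothetical names. It only shows that a string callback needs `eval` against a `Binding`, while a symbol callback is a plain method dispatch.

```ruby
# Simplified stand-in for a model with callbacks; not Rails' implementation.
class TinyModel
  attr_reader :checks

  def initialize
    @checks = 0
  end

  def some_check
    @checks += 1
  end

  # A string callback is source code: it has to be evaluated against a Binding,
  # which captures the current execution context.
  def run_string_callback(code)
    eval(code, binding)
  end

  # A symbol callback is an ordinary method call; no Binding is created.
  def run_symbol_callback(name)
    send(name)
  end
end

model = TinyModel.new
100.times { model.run_string_callback("some_check()") }
100.times { model.run_symbol_callback(:some_check) }
puts model.checks
```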
Optimizing Rails: Partial Rendering
Not too uncommon, right?
list.rhtml:
# 1000 times
<%= render :partial => 'object', :locals => { :object => object } %>
We create 1000 View instances for each object here! Why?
Optimizing Rails: Partial Rendering
Template inlining to the rescue:
list.rhtml:
# 1000 times
<%= render :partial => 'object', :locals => { :object => object }, :inline => true %>
[Diagram: list.rhtml with the contents of _object.rhtml inlined for each object]
Optimizing Rails: Partial Rendering
Template Inliner plugin: http://github.com/acunote/template_inliner/
Real world effect from template inlining:
Rendering of 300 objects, 5 partials for each object
without inlining: 0.89 sec
with inlining: 0.75 sec
1.2x
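The idea of inlining can be sketched with plain ERB from the standard library. This is an illustration under assumptions, not the plugin's code: `PARTIAL`, `render_with_partials` and `render_inlined` are made-up names. The first method builds a template object per item; the second pastes the partial's source into the parent loop and builds one template for the whole list.

```ruby
require 'erb'

# Source of a tiny "partial"; stands in for _object.rhtml.
PARTIAL = "<li><%= object %></li>\n"

# One template object is created for every item in the list.
def render_with_partials(objects)
  objects.map { |object| ERB.new(PARTIAL).result(binding) }.join
end

# The partial source is inlined into the parent loop, so only
# one template object is created for the whole list.
def render_inlined(objects)
  source = "<% objects.each do |object| %>#{PARTIAL}<% end %>"
  ERB.new(source).result(binding)
end

items = %w[a b c]
puts render_with_partials(items) == render_inlined(items)
```

Both methods produce identical markup; only the number of template objects differs.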
Optimizing Database
How to optimize PostgreSQL:
explain analyze
explain analyze
explain analyze
...
Optimizing Database: PostgreSQL Tips
EXPLAIN ANALYZE explains everything, but...
... run it also for the "cold" database state!
Example: complex query which works on 230 000 rows and does 9 subselects / joins:
cold state: 28 sec, hot state: 2.42 sec
Database server restart doesn't help
Need to clear disk cache: echo 3 | sudo tee /proc/sys/vm/drop_caches (Linux)
Optimizing Database: PostgreSQL Tips
Use any(array ()) instead of in()
to force subselect and avoid join
explain analyze select * from issues where id in (select issue_id from tags_issues);
QUERY PLAN
------------------------------------------------------------
 Merge IN Join (actual time=0.096..576.704 rows=55363 loops=1)
   Merge Cond: (issues.id = tags_issues.issue_id)
   -> Index Scan using issues_pkey on issues (actual time=0.027..270.557 rows=229991 loops=1)
   -> Index Scan using tags_issues_issue_id_key on tags_issues (actual time=0.051..73.903 rows=70052 loops=1)
 Total runtime: 605.274 ms
explain analyze select * from issues where id = any( array( (select issue_id from tags_issues) ) );
QUERY PLAN
------------------------------------------------------------
 Bitmap Heap Scan on issues (actual time=247.358..297.932 rows=55363 loops=1)
   Recheck Cond: (id = ANY ($0))
   InitPlan
     -> Seq Scan on tags_issues (actual time=0.017..51.291 rows=70052 loops=1)
   -> Bitmap Index Scan on issues_pkey (actual time=246.589..246.589 rows=70052 loops=1)
        Index Cond: (id = ANY ($0))
 Total runtime: 325.205 ms
Almost 2x! (605 ms vs. 325 ms)
Database Optimization: PostgreSQL Tips
Push down conditions into subselects and joins
PostgreSQL often won't do that for you
Before:
select *,
  (select notes.author from notes
    where notes.issue_id = issues.id) as note_authors
from issues
where org_id = 1
After:
select *,
  (select notes.author from notes
    where notes.issue_id = issues.id and notes.org_id = 1) as note_authors
from issues
where org_id = 1
Issues: id serial, name varchar, org_id integer
Notes: id serial, name varchar, issue_id integer, org_id integer
What To Do?
Optimize For Development Box
- Ruby code
- Rails code
- Database queries
- Alternative Ruby
Optimize For Production
- Shared filesystems and databases
- Live debugging
- Load balancing
Optimize For The User
- HTTP
- Javascript
- Internet Explorer
Alternative Ruby
Everybody says "JRuby and Ruby 1.9 are faster"
Is that true in production?
Alternative Ruby
In short, YES!
= Acunote Benchmarks =
                             MRI    JRuby  1.9.1
Date/Time Intensive Ops      1.79   0.67   0.62
Rendering Intensive Ops      0.59   0.44   0.40
Calculations Intensive Ops   2.36   1.79   1.79
Database Intensive Ops       4.87   4.63   3.66
Alternative Ruby
In short, YES!
= Acunote Benchmarks =
                             MRI    JRuby  1.9.1
Date/Time Intensive Ops      1x     2.6x   2.9x
Rendering Intensive Ops      1x     1.3x   1.5x
Calculations Intensive Ops   1x     1.3x   1.3x
Database Intensive Ops       1x     1x     1.3x
JRuby: 1.55x faster
Ruby 1.9: 1.75x faster
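The two summary figures match the arithmetic mean of the four per-category speedups. That reading is inferred from the numbers and is not stated on the slide. A short check, with `SPEEDUPS` and `mean` as illustrative names:

```ruby
# Per-category speedups relative to MRI, copied from the table.
SPEEDUPS = {
  "JRuby"    => [2.6, 1.3, 1.3, 1.0],
  "Ruby 1.9" => [2.9, 1.5, 1.3, 1.3]
}

# Arithmetic mean, rounded to two decimals for display.
def mean(values)
  (values.sum / values.size).round(2)
end

SPEEDUPS.each do |name, values|
  puts "#{name}: #{mean(values)}x faster"
end
```

A plain mean weights every category equally; if an application is dominated by database time, its real gain will sit closer to the database row than to the average.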
Alternative Ruby
What is faster?
Acunote Copy Tasks Benchmark
                          MRI    JRuby  1.9.1
Request Time              5.52   4.45   3.24
Template Rendering Time   0.35   0.21   0.21
Database Time             0.70   1.32   0.69
GC Time                   1.07   N/A    0.62
Faster template rendering!
Less GC!
JDBC database driver performance issue with JRuby?
Alternative Ruby
Why faster?
Alternative Ruby
Things I usually see in the profiler after optimizing:
%self  self  calls  name
 2.73  0.05    351  Range#each-1
 2.73  0.05  33822  Hash#[]=
 2.19  0.04      4  Acts::AdvancedTree::Tree#walk_tree
 2.19  0.04  44076  Hash#[]
 1.64  0.03   1966  Array#each-1
 1.64  0.03    378  Org#pricing_plan
 1.64  0.03   1743  Array#each
 1.09  0.02   1688  ActiveRecord::AttributeMethods#respond_to?
 1.09  0.02   1311  Hash#each
 1.09  0.02   6180  ActiveRecord::AttributeMethods#read_attribute_before_typecast
 1.09  0.02  13725  Fixnum#==
 1.09  0.02  46736  Array#[]
 1.09  0.02  15631  String#to_s
 1.09  0.02  24330  String#concat
 1.09  0.02    916  ActiveRecord::Associations#association_instance_get
 1.09  0.02    242  ActionView::Helpers::NumberHelper#number_with_precision
 1.09  0.02   7417  Fixnum#to_s
Alternative Ruby
# of method calls during one request:
50 000 - Array
35 000 - Hash
25 000 - String
Slow classes written in Ruby:
Date
Rational
Alternative Ruby
Alternative Rubies optimize mostly:
- the cost of function call
- complex computations in pure Ruby
- memory by not keeping source code AST
Alternative Ruby
So, shall I use alternative Ruby?
Definitely Yes!... but
JRuby: if your application works with it (run requests hundreds of times to check)
Ruby 1.9: if all gems you need are ported
Optimizing For Shared Environment
Issues we experienced deploying on Engine Yard:
1) VPS is just too damn slow
2) VPS may have too little memory to run the request!
3) shared database server is a problem
4) network filesystem may cause harm as well
Optimizing For Shared Environment
VPS may have too little memory to run the request
Think 512M should be enough?
Think again.
We saw requests that took 1G of memory!
Solutions:
- buy more memory
- optimize memory
- set memory limits for mongrels (with monit)
Optimizing For Shared Environment
You're competing for cache on a shared server:
1. two databases with equal load share the cache
Optimizing For Shared Environment
You're competing for memory cache on a shared server:
2. one of the databases gets more load and wins the cache
Optimizing For Shared Environment
As a result, your database can always be in a "cold" state
and you read data from disk, not from memory!
Complex query which works on 230 000 rows and does 9 subselects / joins:
from disk: 28 sec, from memory: 2.42 sec
Solutions:
- optimize for the cold state
- push down SQL conditions
- reproduce the cold state with: echo 3 | sudo tee /proc/sys/vm/drop_caches
Optimizing For Shared Environment
fstat() is slow on network filesystem (GFS)
Request to render list of tasks in Acunote:
on development box: 0.50 sec
on production box: 0.50 - 2.50 sec
Optimizing For Shared Environment
fstat() is slow on network filesystem (GFS)
Couldn't figure out why until we ran strace
We used
a) filesystem store for fragment caching
b) expire_fragment(regexp)
The latter looked through all cache directories even though we knew the cache was located in only one specific subdir
Optimizing For Shared Environment
fstat() is slow on network filesystem (GFS)
Solution: memcached instead of filesystem
If the filesystem is ok, here's a trick: http://blog.pluron.com/2008/07/hell-is-paved-w.html
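The cost of a regexp-based expiry grows with the number of filesystem entries it has to visit, because each one needs at least one stat call. The sketch below is not Rails' `expire_fragment` and not the trick from the linked post; `expire_matching` and `build_cache` are hypothetical helpers. It compares a walk over the whole cache tree with a walk limited to the one subdirectory known to hold the matching entries.

```ruby
require 'find'
require 'fileutils'
require 'tmpdir'

# Deletes files under root whose path matches pattern.
# Returns [visited, deleted]; every visited entry costs at least one stat call.
def expire_matching(root, pattern)
  visited = 0
  deleted = 0
  Find.find(root) do |path|
    visited += 1
    next unless File.file?(path) && path =~ pattern
    File.delete(path)
    deleted += 1
  end
  [visited, deleted]
end

# Builds a toy cache: three subdirectories with ten files each.
def build_cache(root)
  %w[tasks users reports].each do |dir|
    FileUtils.mkdir_p(File.join(root, dir))
    10.times { |i| FileUtils.touch(File.join(root, dir, "#{i}.cache")) }
  end
end

# Walk the whole tree, as a regexp-based expiry over the cache root would.
wide = Dir.mktmpdir do |root|
  build_cache(root)
  expire_matching(root, %r{/tasks/\d+\.cache\z})
end

# Walk only the subdirectory that can contain matches.
narrow = Dir.mktmpdir do |root|
  build_cache(root)
  expire_matching(File.join(root, "tasks"), /\.cache\z/)
end

puts "whole tree:  visited #{wide[0]}, deleted #{wide[1]}"
puts "subdir only: visited #{narrow[0]}, deleted #{narrow[1]}"
```

Both walks delete the same files; the narrow one just touches far fewer entries, which matters most when each stat goes over the network.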
Live Debugging
To see what's wrong on "live" application:
- For Linux: strace and oprofile
- For Mac and Solaris: dtrace
- For Windows: uhm... about time to switch ;)
To monitor for known problems:
- monit
- nagios
- own scripts to analyze application logs
Load Balancing
The problem of round-robin and fair load balancing
[Diagram: Rails App 1, Rails App 2 and Rails App 3, each with its own per-process queue of requests]
Load Balancing
Solution: the global queue
[Diagram: Rails App 1, Rails App 2 and Rails App 3 taking requests from a single global queue]
mod_rails / Passenger
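Why per-process queues hurt can be seen in a small simulation. It is a sketch under simplifying assumptions: all requests arrive at time 0, durations are in arbitrary units, and `round_robin` and `global_queue` are illustrative names rather than any balancer's API. Each method returns the finish time of every request.

```ruby
# Round-robin: request i always goes to worker i % workers,
# even if that worker is stuck on a long request.
def round_robin(durations, workers)
  free_at = Array.new(workers, 0)
  durations.each_with_index.map do |duration, i|
    free_at[i % workers] += duration
  end
end

# Global queue: the next request goes to whichever worker becomes free first.
def global_queue(durations, workers)
  free_at = Array.new(workers, 0)
  durations.map do |duration|
    worker = free_at.index(free_at.min)
    free_at[worker] += duration
  end
end

durations = [1, 1, 10, 1, 1, 1]   # one long request among short ones
puts "round-robin finish times:  #{round_robin(durations, 3).inspect}"
puts "global queue finish times: #{global_queue(durations, 3).inspect}"
```

With per-process queues, the short request queued behind the long one waits for it to finish; with a global queue it is picked up by the first idle worker.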
Load Balancing
Dedicated queues for long-running requests
[Diagram: Rails App 1, Rails App 2 and Rails App 3; regular per-process queues plus a separate queue for long-running requests]
nginx dedicated queues
Load Balancing
nginx configuration for dedicated queues
upstream mongrel {
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
}
upstream rss_mongrel {
    server 127.0.0.1:5002;
}
server {
    location / {
        location ~ ^/feeds/(rss|atom) {
            if (!-f $request_filename) {
                proxy_pass http://rss_mongrel;
                break;
            }
        }
        if (!-f $request_filename) {
            proxy_pass http://mongrel;
        }
    }
}
Optimize For The User: HTTP
Things to consider:
- Gzip HTML, CSS and JS
- Minify JS
- Collect JS and CSS (javascript_include_tag :all, :cache => true)
- Far future expires headers for JS, CSS, images
- Sprites
- Cache-Control: public
- everything else YSlow tells you
[Chart: request time split between "Network and Frontend" and "Backend"; values 5% and 95%]
Optimize Frontend: Javascript
Things you don't want to hear from your users:
"...Your server is slow..."
said the user after clicking on the link to show a form with plain javascript (no AJAX)
Optimize Frontend: Javascript
Known hotspots in Javascript:
- eval()
- all DOM operations
  - avoid if possible, for example
  - use element.className instead of element.readAttribute('class')
  - use element.id instead of element.readAttribute('id')
- $$() selectors, especially attribute selectors
  - may be expensive, measure first
  - $$('#some .listing td a.popup[accesslink]')
  - use getElementsByTagName() and iterate results instead
- element.style.* changes
  - change class instead
- $() and getElementById on large (~20000 elements) pages
Optimize Frontend: IE
Slow things that are especially slow in IE:
- $() and $$(), even on small pages
- getElementsByName()
- style switching
Optimize Frontend: IE
Good things about IE:
- profiler in IE8
- fast in IE => fast everywhere else!
Keep It Fast!
So, you've optimized your application.
How to keep it fast?
Keep It Fast!
- Measure, measure and measure...
- Use profiler
- Optimize CPU and Memory
- Performance Regression Tests
Keep It Fast: Measure
Keep a set of benchmarks for the most frequent user requests.
For example:
Benchmark Burndown 120              0.70 0.00
Benchmark Inc. Burndown 120         0.92 0.01
Benchmark Sprint 20 x (1+5) (C)     0.45 0.00
Benchmark Issues 100 (C)            0.34 0.00
Benchmark Prediction 120            0.56 0.00
Benchmark Progress 120              0.23 0.00
Benchmark Sprint 20 x (1+5)         0.93 0.00
Benchmark Timeline 5x100            0.11 0.00
Benchmark Signup                    0.77 0.00
Benchmark Export                    0.20 0.00
Benchmark Move Here 20/120          0.89 0.00
Benchmark Order By User             0.98 0.00
Benchmark Set Field (EP)            0.21 0.00
Benchmark Task Create + Tag         0.23 0.00
... 30 more ...
Keep It Fast: Measure
Benchmarks as a special kind of tests:
class RenderingTest < ActionController::IntegrationTest
  def test_sprint_rendering
    login_with users(:user), "user"
    benchmark :title => "Sprint 20 x (1+5) (C)",
      :route => "projects/1/sprints/3/show",
      :assert_template => "tasks/index"
  end
end
Output:
Benchmark Sprint 20 x (1+5) (C) 0.45 0.00
Keep It Fast: Measure
Benchmarks as a special kind of tests:
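def benchmark(options = {})
  (0..100).each do |i|
    GC.start
    # run every measurement in a forked child so runs do not affect each other
    pid = fork do
      begin
        out = File.open("values", "a")
        ActiveRecord::Base.transaction do
          elapsed_time = Benchmark.realtime do
            request_method = options[:post] ? :post : :get
            send(request_method, options[:route])
          end
          out.puts elapsed_time if i > 0    # skip the first (warm-up) run
          out.close
          raise CustomTransactionError      # roll back whatever the request changed
        end
      rescue CustomTransactionError
        exit
      end
    end
    Process.waitpid pid
    ActiveRecord::Base.connection.reconnect!
  end
  # one elapsed time per line; mean and sigma are helpers not shown on the slide
  values = File.readlines("values").map { |line| line.to_f }
  print format("%.2f %.2f\n", mean(values), sigma(values))
end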
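The `benchmark` helper calls `mean` and `sigma`, which the slides do not show. A minimal sketch of what they might look like, assuming one elapsed time per line in the "values" file and a population standard deviation; `parse_values` is a hypothetical helper added for the example.

```ruby
# Turn the text of the "values" file (one float per line) into an array of floats.
def parse_values(text)
  text.split.map { |line| line.to_f }
end

# Arithmetic mean of the samples.
def mean(values)
  values.sum / values.size
end

# Population standard deviation of the samples.
def sigma(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / values.size)
end

samples = parse_values("0.44\n0.45\n0.46\n")
puts format("%.2f %.2f", mean(samples), sigma(samples))
```

A small sigma next to the mean is what makes a benchmark usable as a regression test: a change in the mean larger than the noise is then worth investigating.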
Keep It Fast: Query Testing
Losing 10 ms in a benchmark might seem OK
Except that sometimes it's not, because you're running one more SQL query
Keep It Fast: Query Testing
def test_queries
  queries = track_queries do
    get :index
  end
  # expected value first, actual value second
  assert_equal ["Foo Load", "Bar Load", "Event Create"], queries
end
Keep It Fast: Query Testing
module ActiveSupport
  class BufferedLogger
    attr_reader :tracked_queries
    def tracking=(val)
      @tracked_queries = []
      @tracking = val
    end
def debug_with_tracking(message) @tracked_queries