Why measure? / boomerang / data data data
Measuring the web with boomerang
Philip Tellis / [email protected]
NY Performance Meetup / 2010-09-15
$ finger philip
Philip [email protected]
@bluesmoon
yahoo
geek
http://bluesmoon.info/
The slow web / Measurements / Measuring with javascript
Where does all the time go?
Who controls it?
Some of this we control and some of it we don’t
Back end
Measuring and improving back end performance can be done during development
80-20
Turns out that less than 20% of the time is spent on the backend
Front end
It’s what we can’t control that bites us
Too many variations
browsers
plugins
OSes
viruses
antiviruses
microwaves
baby monitors
naughty neighbours
file shares
governments
rodents
Try simulating all that in the lab!
We need to measure real end-user performance
We need to measure it from the real end-user’s box
Ask the user?
Bias
While this might work, it isn’t necessarily representative
A/B testing
You also want to be able to dynamically tune which users get which tests
Phone home
It’s most useful if you can send these measurements back to your server for analysis
Mostly ubiquitous
We know that javascript is available on almost every browser
Rich pages
We really want to measure the performance of rich pages which depend on javascript already
Limited
But javascript can’t measure everything... we get as close as we can
What? / How does it work? / Accuracy
A piece of javascript that you add to your web page, where it measures the end user’s perceived performance of your page and beacons it back to you
How?
<script src="boomerang.js" type="text/javascript"></script>
<script type="text/javascript">
BOOMR.init({
    user_ip: "<user’s ip address>",
    beacon_url: "http://yoursite.com/beacon.php"
});
</script>
What does it do?
About once a week, measures user’s bandwidth and latency to your server
On (almost) every request, measures the time it took to load the current page
Beacons these results back to your server
Other stuff based on plugins
How does it do it?
Let’s take that one at a time
How do we measure latency?
Download a 32 byte gif 10 times in sequence
Measure the time to download each
Discard the first measurement because it’s overpriced
Calculate the arithmetic mean, standard deviation and margin of error of the rest
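The latency statistics can be sketched roughly as follows. This is an illustration, not boomerang's actual code: the name `summarizeLatency`, the sample timings, the use of the sample standard deviation, and the 95% confidence level (z = 1.96) are all assumptions.

```javascript
// Hypothetical helper, not part of boomerang's API.
// timings: download times in ms, one per image request, in request order.
function summarizeLatency(timings) {
  // Discard the first measurement: it may include a DNS lookup and TCP handshake.
  const rest = timings.slice(1);
  const n = rest.length;
  const mean = rest.reduce((sum, t) => sum + t, 0) / n;
  // Sample variance (n - 1 in the denominator); this choice is an assumption.
  const variance = rest.reduce((sum, t) => sum + (t - mean) ** 2, 0) / (n - 1);
  const stddev = Math.sqrt(variance);
  // Margin of error at an assumed 95% confidence level.
  const marginOfError = (1.96 * stddev) / Math.sqrt(n);
  return { mean, stddev, marginOfError };
}

// Made-up timings: the first request is visibly slower than the rest.
const sampleTimings = [180, 52, 48, 50, 51, 49, 50, 53, 47, 50];
console.log(summarizeLatency(sampleTimings));
```

A small margin of error relative to the mean suggests the remaining nine samples agree with each other.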
Wait, did you say overpriced?
The first image might require a DNS lookup and TCP handshake
Slow start is not an issue since 32 bytes fits in 1 packet
How do we measure bandwidth?
After the latency test is done, we download progressively larger images
Stop at the first image that times out
Redownload that image a few more times
Calculate the median, standard deviation and margin of error of the largest images
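The median step might look something like this. It is a simplified sketch that handles repeated downloads of a single image size; the helper names, the image size and the timings are assumptions, not boomerang's code.

```javascript
// Hypothetical helpers, not part of boomerang's API.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// sizeBytes: size of the downloaded image.
// timesMs: download time in ms for each repeated fetch of that image.
// Returns the median throughput in kilobits per second.
function medianBandwidthKbps(sizeBytes, timesMs) {
  // bits per millisecond is numerically the same as kilobits per second
  const kbps = timesMs.map((ms) => (sizeBytes * 8) / ms);
  return median(kbps);
}

// Made-up example: a 250,000 byte image fetched four times.
console.log(medianBandwidthKbps(250000, [1000, 800, 1250, 1000]));
```

The median is less sensitive than the mean to a single unusually slow or fast download.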
Measuring latency before bandwidth helps here
Those 10 latency images do a lot to widen the TCP window size
The bandwidth images make much better use of bandwidth
The image we end with uses the most bandwidth
How do we measure page load time?
In the onbeforeunload event, measure the time and store it in a cookie
In the onload event, check the cookie, and measure the difference with the current time
We also make sure that the page that set the cookie is the referrer of the current page
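The two-page logic can be sketched as below. To keep it runnable outside a browser, a plain object stands in for the cookie jar and the clock is passed in; the function names, the `rt` key and the JSON format are assumptions, not boomerang's own.

```javascript
// Hypothetical sketch; names and cookie format are assumptions.
// `store` stands in for the cookie jar so the logic can run outside a browser.
function onBeforeUnload(store, currentUrl, now) {
  // Record when, and from which page, the user started navigating away.
  store.rt = JSON.stringify({ start: now, url: currentUrl });
}

function onLoad(store, referrer, now) {
  if (!store.rt) return null;          // no cookie: nothing to measure
  const { start, url } = JSON.parse(store.rt);
  delete store.rt;                     // use each start time only once
  if (url !== referrer) return null;   // cookie was set by a different page
  return now - start;                  // page load time in ms
}

// Made-up example: leave page /a at t=1000, next page finishes loading at t=1850.
const cookies = {};
onBeforeUnload(cookies, "http://example.com/a", 1000);
console.log(onLoad(cookies, "http://example.com/a", 1850));
```

In a real page these would presumably be wired to the onbeforeunload and onload events, with `document.cookie`, `document.referrer` and the current time in place of the arguments.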
What? Two pages?
Yes, this needs two pages and cookies. If those aren’t supported, we try to use the WebTiming API.
How accurate is it?
Latency measurements are very accurate (±1%)
Bandwidth is accurate to an order of magnitude; for bad connections the error can be ±30%
Page load time sometimes has outliers; you need post-filtering
The margin of error tells you how good your data is
Filtering / Grouping / Data
What do we do with the data?
Sanity checking to:
  Remove fake data
  Remove abusive data
  Maybe just rate limiting
Statistical analysis to:
  Remove outliers
  Aggregate based on bandwidth blocks
  Measure trends over time and correlate them with code changes
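The slides don't say how outliers are removed. One common choice is the 1.5 × IQR rule, sketched here as an assumption rather than the talk's method; the helper names and the load times are made up.

```javascript
// Hypothetical helpers. The 1.5 * IQR rule is an assumed method,
// not one named in the talk.
function quantile(sorted, q) {
  // Linear interpolation between the two closest ranks.
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

function removeOutliers(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  // Keep only values within 1.5 * IQR of the middle half of the data.
  return values.filter((v) => v >= q1 - 1.5 * iqr && v <= q3 + 1.5 * iqr);
}

// Made-up page load times in ms, with one obviously bogus beacon.
const loadTimes = [1200, 1350, 1100, 1280, 1420, 98000, 1310, 1250];
console.log(removeOutliers(loadTimes));
```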
Bandwidth blocks
0-100 kbps
100-300 kbps
300-2000 kbps
2-6 Mbps
6+ Mbps
Group page load times based on bandwidth block
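Grouping by block could be sketched like this. The block boundaries come from the slide; treating each upper bound as exclusive, and the beacon field names `bwKbps` and `loadMs`, are assumptions.

```javascript
// Boundaries are from the slide; exclusive upper bounds are an assumption.
const BLOCKS = [
  { label: "0-100 kbps", maxKbps: 100 },
  { label: "100-300 kbps", maxKbps: 300 },
  { label: "300-2000 kbps", maxKbps: 2000 },
  { label: "2-6 Mbps", maxKbps: 6000 },
  { label: "6+ Mbps", maxKbps: Infinity },
];

// Returns the label of the first block whose upper bound exceeds kbps.
function blockFor(kbps) {
  return BLOCKS.find((b) => kbps < b.maxKbps).label;
}

// beacons: [{ bwKbps, loadMs }]; these field names are hypothetical.
function groupLoadTimes(beacons) {
  const groups = {};
  for (const { bwKbps, loadMs } of beacons) {
    const label = blockFor(bwKbps);
    (groups[label] = groups[label] || []).push(loadMs);
  }
  return groups;
}

// Made-up beacons.
console.log(groupLoadTimes([
  { bwKbps: 80, loadMs: 9200 },
  { bwKbps: 1500, loadMs: 2100 },
  { bwKbps: 1800, loadMs: 1900 },
  { bwKbps: 8000, loadMs: 900 },
]));
```

Statistics such as the median load time can then be computed per block instead of across all users.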
Data points from some countries may require narrower bands
Geographic data
Looking at latency from different geographic locations can tell you where to put your next CDN
ISPs
Grouping data by ISP can tell you who’s behaving badly
Storing the data
We log all beacon requests to Apache’s access log file
Low traffic sites could write directly to a DB
Others have suggested using CouchDB as the beacon server
Daily summaries can be sent across to ShowSlow
More data
Write plugins to get more performance data
We already have a DNS plugin
I’m thinking of an IPv6 vs IPv4 plugin
What about a full WebTiming plugin?
Can we measure connection setup time?
You decide
Once you have the data, you can do anything with it
Thank you
http://github.com/yahoo/boomerang
http://yahoo.github.com/boomerang/doc/
Photo credits
flickr.com/photos/21233184@N02/4389412851
Contact me
Philip Tellis
yahoo
geek
@bluesmoon
http://bluesmoon.info/
slideshare.net/[email protected]
References
github.com/yahoo/boomerang
More bandwidth doesn’t matter (much) – Mike Belshe
Analysing Bandwidth & Latency – YUI Blog
It’s the latency, stupid – Stuart Cheshire
The statistics of web performance