CSC 2720 Building Web Applications Improving Web Application Performance


Outline
• Introduction
• Locating the bottleneck
• Improvement Methods

Introduction
• To improve the performance of a web application could mean
    • To reduce latency, i.e., to reduce the time delay between sending a request and receiving the corresponding response, or between sending a request and receiving all the page components, such as images, that are needed to render the page.
    • To serve as many concurrent requests as possible without failing or exceeding a response time limit
        • Response time – the time a server takes to serve a request

PHP Performance Optimization
• Obtaining good performance is not merely a matter of writing fast PHP scripts. High performance PHP requires a good understanding of the underlying hardware, the operating system and supporting software such as the web server and database.
    Source: http://phplens.com/lens/php-book/optimizing-debugging-php.php
• Often involves trade-offs among CPU, storage, bandwidth and other resource requirements

Factors that affect the performance
• Running out of resources
    • Processors
    • Memory
    • Storage
    • Network bandwidth
    • Maximum # of database connections
• Poorly designed database schema and queries
• Poorly written PHP code
• Too much disk access

Locating the Bottlenecks – Profiling
• To measure the behavior of a server-side script as it executes, particularly the frequency and duration of function calls (Ref: Wikipedia)
• Helps you detect which parts of your code are worth your attention (e.g., functions that are called the most, functions that take a long time to run)
• Can be used to analyze the performance of a database indirectly by measuring the functions that interact with the database (e.g., the mysqli_* functions)
• Example of a profiling tool
    • XDebug: http://www.xdebug.org/
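Before reaching for a full profiler, a rough picture can be had by timing a suspect function by hand. The following is only a minimal sketch and not a substitute for XDebug: `time_call()` and the sample workload are hypothetical names introduced for illustration, and it assumes PHP 7.3 or later for `hrtime()`.

```php
<?php
/**
 * Hypothetical helper: call $fn $iterations times and report the duration.
 * hrtime(true) returns a monotonic clock reading in nanoseconds.
 */
function time_call(callable $fn, int $iterations = 1000): array
{
    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $elapsedNs = hrtime(true) - $start;

    return [
        'calls'    => $iterations,
        'total_ms' => $elapsedNs / 1e6,               // whole run, milliseconds
        'avg_us'   => $elapsedNs / 1e3 / $iterations, // per call, microseconds
    ];
}

// Sample workload (an assumption for illustration): hash a short string.
$report = time_call(function () {
    return md5('profiling example');
}, 5000);

printf(
    "%d calls, %.3f ms total, %.3f us per call\n",
    $report['calls'],
    $report['total_ms'],
    $report['avg_us']
);
```

Wrapping a database call in the same helper gives the indirect database measurement described above, although a real profiler also shows call counts and call trees that this sketch does not.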

Locating the Bottlenecks – Stress Tests
• A stress test checks a web application's robustness, availability, and error handling capabilities under a heavy load, particularly to ensure the software doesn't crash in conditions of insufficient computational resources (such as memory or disk space), unusually high concurrency, or denial of service attacks. (Ref: Wikipedia)
• Use in conjunction with other tools to find out
    • The maximum # of requests/users a server can handle before failing or slowing down significantly
    • The bottleneck (i.e., which resource runs out first?)
    • The average time a page and its components take to fully load

Locating the Bottlenecks – Stress Tests

• Microsoft Web Application Stress Tool
    • Can emulate HTTP requests (with parameters) from clients
    • Can generate a large number of requests within a short period of time (stress test)
    • Can help you gather info such as the average and the highest latency experienced when accessing a URL

Locating the Bottlenecks – System Info
• Show the usage and availability of various system resources (e.g., CPU, memory, virtual memory, I/O). The usage can be the overall usage or the usage by individual processes.
• Use in conjunction with stress-testing tools to locate the bottleneck of a server
• Examples of tools
    • "Task Manager" on Windows OS
    • The command "top" on most Unix/Linux OS

Improvement Methods
Small changes that could make a big difference.

1. PHP accelerators (Opcode compiler + Opcode cache)
2. Content caching
    • Utilizing client's and proxy's cache
    • Cache output
3. Server-side web proxy
4. Compression
5. Connection pooling

Improvement Methods
Fine tuning your web applications, servers, and OS

6. Reducing number of HTTP requests
7. Query optimization
8. Optimizing PHP code
9. Additional methods that make a page load faster
10. Tuning the server (Apache)

1. PHP Accelerators
• Typically made up of an opcode compiler and an opcode cache
    • Opcode compiler – compiles PHP code into opcode
    • Opcode cache – keeps frequently used compiled PHP scripts in memory
• A PHP accelerator can help reduce the response time of PHP scripts significantly because
    • Interpreting opcode is faster than interpreting PHP code
    • Loading opcode from memory is faster than loading PHP scripts from disk
• List of PHP accelerators: http://en.wikipedia.org/wiki/PHP_accelerator
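On PHP 5.5 and later, the bundled OPcache extension fills the accelerator role described above. The sketch below shows one way a script might check whether an opcode cache is active; `opcode_cache_summary()` is a hypothetical helper name, and the code assumes nothing beyond the OPcache function `opcache_get_status()`, which may be missing or disabled on a given installation.

```php
<?php
/**
 * Hypothetical helper: summarize the state of the OPcache opcode cache.
 * Degrades gracefully when the extension is not loaded or not enabled.
 */
function opcode_cache_summary(): array
{
    if (!function_exists('opcache_get_status')) {
        // Extension not loaded at all.
        return ['available' => false, 'enabled' => false];
    }

    // false: skip the per-script details, we only want the totals.
    $status = opcache_get_status(false);
    if ($status === false) {
        // Loaded but switched off (common for the command-line SAPI).
        return ['available' => true, 'enabled' => false];
    }

    $stats = $status['opcache_statistics'] ?? [];

    return [
        'available'      => true,
        'enabled'        => (bool) ($status['opcache_enabled'] ?? false),
        'cached_scripts' => $stats['num_cached_scripts'] ?? 0,
        'hits'           => $stats['hits'] ?? 0,
        'misses'         => $stats['misses'] ?? 0,
    ];
}

print_r(opcode_cache_summary());
```

A high hit count relative to misses suggests that most requests are being served from cached opcode rather than recompiled from source.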

1.1 How PHP Accelerators Work

Source: http://phplens.com/lens/php-book/optimizing-debugging-php.php

2.1 Content Caching – Utilizing Client and Proxy Caches
• Request clients/proxies to cache reusable components (e.g., images, scripts and stylesheets) in order to avoid retransmitting the same components

[Figure: Client A and Client B reach the Server through a shared Proxy; the Server hosts index.php, x.jpg and y.css.]

To illustrate, suppose
• Clients A and B share a proxy server.
• The HTML page generated by index.php needs x.jpg and y.css.
• Only x.jpg and y.css are cacheable.

The 1st time client A requests index.php from the server, all three files need to be transferred from the server.

After the request, x.jpg and y.css are cached in both client A's cache and the proxy's cache.

In subsequent requests, client A only needs to download the HTML page generated by index.php.

If client B accesses index.php after client A has accessed the same file, then client B could load the page faster because the proxy server only needs to retrieve the HTML content from the server.

In practice, there could be more than one proxy server between the clients and the server.

2.1 Content Caching – Utilizing Client and Proxy Caches
• Use the Expires or Cache-Control header fields to tell the clients and proxies how a component should be cached
    • e.g., When should the component be considered expired? Is the component cacheable?
• Recommendations
    • For static components: implement a "Never expire" policy by setting a far-future Expires header.
    • For dynamic components: use an appropriate Cache-Control header to help clients with conditional requests.

2.1 Content Caching – Utilizing Client and Proxy Caches
• For example
    • To indicate that a component expires at a fixed time and date

      Expires: Sat, 11 Apr 2009 20:00:00 GMT

    • To indicate that a component expires in one hour (relative to the access time) and the client must revalidate the content with the server when the component becomes stale

      Cache-Control: max-age=3600, must-revalidate

• To set the caching policy for static components, you can configure the web server.
    • To set up an expiration policy for different files with Apache, see module mod_expires or these examples.
    • To set up default header values for different files with Apache, see module mod_headers or these examples.
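For dynamically generated pages, the same two headers can be produced from a PHP script. The sketch below only builds the header lines so that it can run anywhere; `expires_header()` and `cache_control_header()` are hypothetical helper names, and in a real script the resulting strings would be passed to PHP's `header()` function before any output is sent.

```php
<?php
/** Hypothetical helper: format a Unix timestamp as an Expires header line. */
function expires_header(int $timestamp): string
{
    // HTTP dates are always expressed in GMT, hence gmdate() rather than date().
    return 'Expires: ' . gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

/** Hypothetical helper: build a Cache-Control line with a relative lifetime. */
function cache_control_header(int $maxAgeSeconds, bool $mustRevalidate = false): string
{
    $value = 'max-age=' . $maxAgeSeconds;
    if ($mustRevalidate) {
        $value .= ', must-revalidate';
    }
    return 'Cache-Control: ' . $value;
}

echo expires_header(1239480000), "\n";        // Expires: Sat, 11 Apr 2009 20:00:00 GMT
echo cache_control_header(3600, true), "\n";  // Cache-Control: max-age=3600, must-revalidate

// In a page script the lines would be sent instead of printed, for example:
// header(cache_control_header(3600, true));
```

A far-future policy for a static component could be approximated with something like `expires_header(time() + 365 * 24 * 3600)`, although for static files the web server configuration is usually the better place to set it.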

2.1 Content Caching – Utilizing Client and Proxy Caches
• References and Reading Materials
    • Caching Tutorial (contains specific info about how, and how not, to cache)
      http://www.mnot.net/cache_docs/
    • Working with cached pages in PHP
      http://www.badpenguin.org/docs/php-cache.html
    • HTTP conditional requests in PHP
      http://alexandre.alapetite.net/doc-alex/php-http-304/index.en.html
    • Use Server Cache Control to Improve Performance
      http://www.websiteoptimization.com/speed/tweak/cache/

2.2. Content Caching – Reusing Generated Output
• If a script only generates new content periodically, cache the generated output to avoid executing code and querying the database for every request.
• Examples of cacheable output
    • List of high scores for an online game
    • List of products on an e-commerce website in which the products are updated daily
• PHP examples about output caching
    • Caching output in PHP
      http://www.addedbytes.com/php/caching-output-in-php/
    • Output Caching with PHP
      http://www.devshed.com/c/a/PHP/Output-Caching-with-PHP/
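One simple way to reuse generated output is to store it in a file and serve that file until it is older than a chosen lifetime. The sketch below is a minimal illustration under that assumption; `cached_output()`, the cache file location and the five-minute lifetime are all hypothetical choices, and a production version would also need to consider concurrent writers and cache invalidation.

```php
<?php
/**
 * Hypothetical helper: return the output of $render, regenerating it only
 * when the cached copy is missing or older than $ttlSeconds.
 */
function cached_output(string $cacheFile, int $ttlSeconds, callable $render): string
{
    clearstatcache(true, $cacheFile); // make sure we see the file's current mtime

    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        return file_get_contents($cacheFile); // fresh enough: skip the expensive work
    }

    ob_start();               // capture whatever $render echoes
    $render();
    $output = ob_get_clean();

    file_put_contents($cacheFile, $output, LOCK_EX);
    return $output;
}

// Example: a high-score list that is rebuilt at most every five minutes.
$cacheFile = sys_get_temp_dir() . '/high_scores_' . getmypid() . '.html';
$html = cached_output($cacheFile, 300, function () {
    // Imagine an expensive database query here.
    echo "<ol><li>Alice 9000</li><li>Bob 8500</li></ol>";
});
echo $html, "\n";
```

The trade-off is staleness: visitors may see data that is up to `$ttlSeconds` old, which is acceptable for content such as a daily product list.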

3. Server-Side Web Proxy
• Use a web proxy at the server side to relieve the web server of serving frequently requested static files
• An example of a web proxy: Squid
  http://www.squid-cache.org/

4. Compression
• Reduce the data size before transmitting
    1. (Online) Use HTTP compression – compress textual data on the fly before sending it to a client
        • Can typically reduce the size of textual data by about 70%
    2. (Offline) Use compression tools to reduce the file size of JavaScript, CSS, images, video, etc.
        • The compression tools must not change the file format or the content of these files. Otherwise the files cannot be referenced from HTML files.
        • e.g., use optipng for PNGs, gifsicle for GIFs and jpegoptim for JPGs

4.1. Compression – HTTP Compression
• A publicly defined way to compress textual content transferred from web servers to browsers
• Compression is done at the server.
• Built into HTTP/1.1 and supported by most browsers
• Drawback: takes time and CPU cycles to compress
• Ref: http://www.websiteoptimization.com/speed/tweak/compress/

Using HTTP Compression in PHP
• Configure php.ini to enable automatic HTTP compression
    • zlib runtime configuration (http://hk2.php.net/manual/en/zlib.configuration.php)
• Perform HTTP compression in PHP scripts programmatically
    • Examples of using ob_gzhandler (http://hk2.php.net/ob_gzhandler)
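To get a feel for how much textual content shrinks, the sketch below gzip-compresses a sample HTML fragment and reports the savings. It assumes the zlib extension is available; `gzip_savings()` and the sample markup are hypothetical, and the very repetitive sample compresses far better than a typical page would.

```php
<?php
/** Hypothetical helper: fraction of bytes saved by gzip-compressing $text. */
function gzip_savings(string $text): float
{
    if ($text === '') {
        return 0.0; // nothing to compress, avoid dividing by zero
    }
    $compressed = gzencode($text, 6); // 6 is a middle-of-the-road compression level
    return 1 - strlen($compressed) / strlen($text);
}

// Sample page content (an assumption for illustration).
$html = str_repeat("<li class=\"item\">Blah Blah Blah</li>\n", 200);

printf(
    "Saved %.0f%% of %d bytes\n",
    gzip_savings($html) * 100,
    strlen($html)
);

// To compress a page on the fly, a script can instead begin with:
// ob_start('ob_gzhandler');
// which compresses the buffered output when the browser accepts it.
```

Measuring the savings on real pages, together with the CPU time spent compressing, helps decide whether on-the-fly compression is worth its cost for a given site.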

5. Connection Pooling (Why?)
• A database connection incurs overhead – it requires resources to create the connection, authenticate it, maintain it, and then release it when it is no longer required.
• The overhead is particularly high for Web-based applications.
    • A server-side script typically opens a connection, performs a few queries, and then closes the connection.
    • Often, more effort is spent connecting and disconnecting than is spent during the interactions themselves.
• Ref: IBM WebSphere App. Server – What is Connection Pooling?

5. Connection Pooling
• A connection pool is a cache of opened database connections.
• When a script needs to establish a connection to the database, a connection is selected from the pool if one is available. Otherwise a new connection is created.
• When a script closes the connection, the connection is not actually closed but returned to the pool so that the connection can be reused by other scripts.
• Note: Implementing a connection pool is not easy. Usually we just use it if it is available.
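The acquire-and-return behaviour described above can be sketched with a toy class. This is an illustration of the idea only, not a usable pool: `ConnectionPool` and its methods are hypothetical, the "connections" are plain placeholder objects rather than real database handles, and concerns such as size limits, health checks and resetting connection state are left out.

```php
<?php
/**
 * Toy connection pool (illustration only). $factory is any callable that
 * opens a new connection; here it is a stand-in, not a real database driver.
 */
final class ConnectionPool
{
    private array $idle = [];   // connections waiting to be reused
    private int $created = 0;   // how many connections were actually opened
    private $factory;

    public function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    /** Hand out an idle connection if one exists, otherwise open a new one. */
    public function acquire(): object
    {
        if ($this->idle) {
            return array_pop($this->idle);
        }
        $this->created++;
        return ($this->factory)();
    }

    /** "Closing" a connection just returns it to the pool. */
    public function release(object $connection): void
    {
        $this->idle[] = $connection;
    }

    public function createdCount(): int
    {
        return $this->created;
    }
}

$pool = new ConnectionPool(function () {
    return new stdClass(); // placeholder for an expensive connect call
});

$first = $pool->acquire();
$pool->release($first);
$second = $pool->acquire();

var_dump($first === $second);        // bool(true): the connection was reused
var_dump($pool->createdCount());     // int(1): only one connection was opened
```

In a typical PHP deployment each request starts with fresh state, so an in-script pool like this only helps inside one long-running process; reuse across requests is what persistent connections and external poolers provide.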

5.1. Connection Pooling in PHP
• PHP's Persistent Connection
    • Use mysql_pconnect() to open a persistent database connection (legacy mysql extension, removed in PHP 7)
    • The MySQL Improved (mysqli) extension had no equivalent before PHP 5.3; since then, prefixing the host name with "p:" requests a persistent connection.
    • Must be used with care because changes made to the database state, such as setting autocommit to "off", will affect the next script that uses the connection.
• Other connection pooling solutions:
    • SQL Relay: http://sqlrelay.sourceforge.net/index.html
    • Apache Module mod_dbd: http://httpd.apache.org/docs/2.1/mod/mod_dbd.html

6. Reducing # of HTTP Requests (Why?)
• A large portion of the total response time to create a fully rendered page is spent on downloading the page components like images, stylesheets, JavaScript, etc.
• Some browsers only allow at most two concurrent requests per server. That means the page components have to take turns to load.
• Ref: http://developer.yahoo.com/performance/rules.html

6. Reducing # of HTTP Requests
• Reducing # of components → reducing # of HTTP requests → page loads faster
• Methods to reduce the # of page components
    • Combine multiple stylesheets into one
    • Combine multiple scripts into one
    • CSS Sprites – tile multiple images into one image and then use CSS to clip the needed image from the combined image.
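Combining stylesheets can be done once at build or deploy time rather than per request. The sketch below shows the idea with a hypothetical `combine_stylesheets()` helper and temporary files standing in for real stylesheets; it simply concatenates the files in the given order, which matters because later CSS rules can override earlier ones.

```php
<?php
/**
 * Hypothetical build step: concatenate several stylesheets into one file so
 * the page needs a single <link>, and so a single HTTP request, for its CSS.
 * Returns the number of bytes written.
 */
function combine_stylesheets(array $sourceFiles, string $targetFile): int
{
    $combined = '';
    foreach ($sourceFiles as $file) {
        $css = file_get_contents($file);
        if ($css === false) {
            throw new RuntimeException("Cannot read $file");
        }
        // Keep a comment marking where each original file starts.
        $combined .= '/* ' . basename($file) . " */\n" . rtrim($css) . "\n";
    }

    $written = file_put_contents($targetFile, $combined);
    if ($written === false) {
        throw new RuntimeException("Cannot write $targetFile");
    }
    return $written;
}

// Demo with temporary files standing in for real stylesheets.
$dir = sys_get_temp_dir();
$a = tempnam($dir, 'css');
$b = tempnam($dir, 'css');
$all = tempnam($dir, 'css');
file_put_contents($a, "body { margin: 0; }\n");
file_put_contents($b, "h1 { color: navy; }\n");

combine_stylesheets([$a, $b], $all);
echo file_get_contents($all);
```

The same approach works for JavaScript files, with the same caveat that the order of concatenation must respect dependencies between the scripts.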

7. Query Optimization
• Tune DB schema
    • The first three normal forms help ensure data integrity
    • Denormalization – a process that attempts to optimize the performance of a database by adding redundant data or by grouping data (but makes maintaining data integrity difficult).
• Query only what you really need
    • e.g., instead of using "SELECT *", select only the columns you need and use LIMIT to limit the number of rows retrieved from the DB
• Make use of indexes to improve the performance of data retrieval
• Take a database course …
• Ref: How to Optimize Queries (Theory and Practice)
  http://www.serverwatch.com/tutorials/article.php/2175621
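The three query tips above (name the columns, use LIMIT, index what you filter and sort on) can be seen together in a small example. The sketch assumes the pdo_sqlite extension and uses an in-memory SQLite database purely so that it is self-contained; the `scores` table, its columns and the sample rows are made up for illustration, and the same SQL ideas apply to MySQL.

```php
<?php
// Assumes the pdo_sqlite extension; table, columns and data are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec(
    'CREATE TABLE scores (
        id INTEGER PRIMARY KEY,
        player TEXT, game TEXT, score INTEGER, notes TEXT
    )'
);
// Index the columns used for filtering (game) and sorting (score).
$db->exec('CREATE INDEX idx_scores_game_score ON scores (game, score)');

$insert = $db->prepare(
    'INSERT INTO scores (player, game, score, notes) VALUES (?, ?, ?, ?)'
);
$samples = [
    ['ann', 'chess', 90], ['bob', 'chess', 70], ['cat', 'chess', 80],
    ['dan', 'chess', 60], ['eve', 'go', 99],
];
foreach ($samples as [$player, $game, $score]) {
    // The bulky notes column is exactly what "SELECT *" would drag along.
    $insert->execute([$player, $game, $score, str_repeat('x', 100)]);
}

// Select only the needed columns and cap the row count with LIMIT.
$top = $db->prepare(
    'SELECT player, score FROM scores
     WHERE game = ? ORDER BY score DESC LIMIT 3'
);
$top->execute(['chess']);
$rows = $top->fetchAll(PDO::FETCH_ASSOC);

print_r($rows);
```

Most databases also offer an EXPLAIN-style command that reports whether a query actually uses an index, which is a useful check before and after adding one.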

8. Optimizing PHP Code
• Make use of the output buffer
    • See PHP Output Buffering Control: http://hk2.php.net/ob_start

<?php
ob_start(); // Start output buffering.
            // All the output is kept in memory instead
            // of being sent to the client.
?>
<html>
<head><title>Foo</title></head>
<body>
<?php echo "Blah Blah Blah"; ?>
</body>
</html>
<?php
ob_end_flush(); // Flush everything in the output buffer
                // to the client at once.
?>

8. Optimizing PHP Code
• Pass several arguments to echo instead of concatenating strings

echo $str1, " ", $str2, " ", $str3;

can execute slightly faster than

echo $str1 . " " . $str2 . " " . $str3;

because no temporary concatenated string has to be built.

• Note: This only works with echo, a language construct that accepts several comma-separated arguments (print, by contrast, takes exactly one).
• If you have CPU-intensive tasks to perform, consider implementing them as C extensions.
• Use the predefined functions (as opposed to writing your own functions) whenever possible

8. Optimizing PHP Code
• Instead of writing

for ($i = 0; $i < count($array); $i++)

use a variable to store the array size, so that count() is not called on every iteration, and rewrite the loop as

$array_size = count($array);
for ($i = 0; $i < $array_size; $i++)

• More tips about optimizing PHP code can be found at http://reinholdweber.com/?p=3

9. Additional Methods that Make a Page Load Faster
• Post-load Components
    • Load the less important components in the background
• Preload Components
    • Anticipate what components are needed in the future and pre-load them (i.e., utilize the browser's idle time)
• Split Components Across Domains
    • Maximize parallel downloads (a browser may only issue a few HTTP requests in parallel to the same server)
    • Make sure you're using not more than 2-4 domains because of the DNS lookup penalty.
• Ref: Best Practices for Speeding Up Your Web Site (http://developer.yahoo.com/performance/rules.html)

10. Tuning the Server (Apache)
• SendBufferSize – size of the output buffer
• MaxClients – maximum # of clients
• StartServers – the number of child processes to create at start up
• MinSpareServers, MaxSpareServers – the number of idle child processes to keep alive
• KeepAlive – tells the server to reuse the same socket connection for multiple HTTP requests to reduce the overhead of frequent connects
• Source: http://phplens.com/lens/php-book/optimizing-debugging-php.php

Scaling
• To improve the performance by introducing more machines (to host more servers)
    • Server clusters
• Database Replication
    • Improve performance or availability of the whole database system
    • MySQL Replication (http://dev.mysql.com/doc/refman/5.0/en/replication.html)

References
• Best Practices for Speeding Up Your Web Site
  http://developer.yahoo.com/performance/rules.html
• Performance Research, Part 1: What the 80/20 Rule Tells Us about Reducing HTTP Requests
  http://yuiblog.com/blog/2006/11/28/performance-research-part-1/
• Performance Research, Part 2: Browser Cache Usage - Exposed!
  http://yuiblog.com/blog/2007/01/04/performance-research-part-2/
• Practical PHP Performance
  http://www.developertutorials.com/tutorials/php/practical-php-performance-8-02-07/page3.html