mike-willbanks
An overview of scaling out your system, starting from a single server and covering many of the options you may face.
2. Scalability?
3. Web Servers
4. Database Servers
5. Cache Servers
7. CDN Servers
8. Front-End Performance
9. The Beginning...
How we know it's time
10. The Next Step...
However, we can't handle our current I/O, CPU, or number of requests on our web server.
11. Load Balancing
12. Load Balancing Our Environment
13. Several Options
Software Based (Commodity Server Cost)
Hardware Based (High Cost Appliance)
14. Routing Types of Load Balancers
15. Static
16. Least Connections
17. Source
18. IP
19. Basic Authentication
20. URI Parameter
21. Header
22. Cookie
23. Regular Expression
24. Open Source Software Options
25. Pound: said to be great for medium-traffic sites.
26. Varnish: a caching solution that also does load balancing.
27. HAProxy
28. Very well known
29. Handles just about every type of routing
30. Several examples online
31. Has a web-based GUI
Cons
32. Setup can be complex and take a lot of time
33. Sample HAProxy Configuration
    global
        log 127.0.0.1 local0
        log 127.0.0.1 local1 notice
        maxconn 4096
        user haproxy
        group haproxy
        daemon

    defaults
        log global
        mode http
        option httplog
        option dontlognull
        retries 3
        option redispatch
        maxconn 2000
        contimeout 5000
        clitimeout 50000
        srvtimeout 50000

    listen localhost 0.0.0.0:80
        option httpchk GET /
        balance roundrobin
        cookie SERVERID
        server serv1 0.0.0.0:8080 check inter 2000 rise 2 fall 5
        server serv2 0.0.0.0:8080 check inter 2000 rise 2 fall 5
        option httpclose
        stats enable
        stats uri /lb?stats
        stats realm haproxy
        stats auth test:test
34. Pound
35. Native SSL support
36. Insanely simple setup
37. Supports virtually all types of routing
38. Many online tutorials
Cons
39. Setup can be complex and take a lot of time
40. Sample Pound Configuration
    User "www-data"
    Group "www-data"
    LogLevel 1
    Alive 30
    Control "/var/run/pound/poundctl.socket"

    ListenHTTP
        Address 127.0.0.1
        Port 80
        xHTTP 0
        Service
            BackEnd
                Address 127.0.0.1
                Port 8080
            End
            BackEnd
                Address 127.0.0.1
                Port 8080
            End
        End
    End
41. Varnish
42. Fairly simple setup
43. Extremely well known
44. Many online tutorials
45. Large suite of tools (varnishstat, varnishtop, varnishlog, varnishreplay, varnishncsa)
Cons
46. If you want a WebGUI you must PAY
47. Sample Varnish Configuration
    backend default1 {
        .host = "127.0.0.1";
        .port = "8080";
        .probe = {
            .url = "/";
            .interval = 5s;
            .timeout = 1s;
            .window = 5;
            .threshold = 3;
        }
    }

    backend default2 {
        .host = "127.0.0.1";
        .port = "8080";
        .probe = {
            .url = "/";
            .interval = 5s;
            .timeout = 1s;
            .window = 5;
            .threshold = 3;
        }
    }

    director default round-robin {
        { .backend = default1; }
        { .backend = default2; }
    }

    sub vcl_recv {
        if (req.http.host ~ "^127.0.0.1$") {
            set req.backend = default;
        }
    }
48. What We Need to Remember
49. Don't use SSL at the web server level!
Headers
50. Client IP is likely in X-Forwarded-For
51. If using Virtual Hosts, pass the Host header
Sessions
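The header advice above can be illustrated with a minimal Python sketch of how an application behind a load balancer might recover the client IP. The `TRUSTED_PROXIES` set, the `client_ip` function, and the plain-dict `headers` argument are assumptions made for this example, not something taken from the slides.

```python
import ipaddress

# Hypothetical addresses of the load balancers we trust to set the header.
TRUSTED_PROXIES = {"10.0.0.1"}


def client_ip(remote_addr, headers):
    """Return the best guess at the real client IP behind a load balancer.

    `headers` is assumed to be a plain dict keyed by the exact header name.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    # Only believe the header when the request really came from our balancer.
    if remote_addr not in TRUSTED_PROXIES or not forwarded:
        return remote_addr
    # The header is a comma-separated chain; walk it right to left and
    # return the first address that is not one of our own proxies.
    for candidate in reversed([part.strip() for part in forwarded.split(",")]):
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            return remote_addr  # malformed header, fall back to the socket peer
        if candidate not in TRUSTED_PROXIES:
            return candidate
    return remote_addr
```

Ignoring the header for requests that did not arrive through a trusted balancer is a design choice in this sketch: otherwise any client could spoof its own address.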
52. Web Servers
53. Several Options
54. IIS
55. Nginx
56. Lighttpd
57. etc.
58. Configuration
Each configuration SHOULD or MUST be the same.
59. Client IP will likely be in X-Forwarded-For.
60. SSL status will not be in $_SERVER['HTTPS']; check an HTTP_ header instead.
61. What We Need to Remember
62. Static content could be tagged in version control.
63. Static content may need a file server / CDN / etc.
64. User-generated content on an NFS mount, or served from the cloud or a CDN.
Sessions
65. Remember disk is slow and the database will be a bottleneck. How about distributed caching?
66. Other Thoughts
67. Database Servers
68. Where We All Start
69. Replication
70. Multiple Slaves
71. Multiple Masters
72. Be warned: auto-increment settings must now change so that keys generated on different masters do not conflict.
73. Partitioning
Horizontal Partitioning
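As a rough illustration of horizontal partitioning, the Python sketch below routes rows to a shard by user ID. The shard names, the modulo scheme, and both function names are assumptions chosen for this example; the slides do not specify a partitioning key or strategy.

```python
# Hypothetical shard names; each would be a separate database server.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def shard_for(user_id):
    """Pick the shard holding this user's rows with a simple modulo scheme."""
    return SHARDS[user_id % len(SHARDS)]


def group_by_shard(user_ids):
    """Group IDs by shard so a multi-user query hits each shard only once."""
    groups = {}
    for user_id in user_ids:
        groups.setdefault(shard_for(user_id), []).append(user_id)
    return groups
```

A modulo scheme is easy to reason about, but changing the number of shards moves most rows, so it is shown here only as the simplest possible starting point.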
74. What We Need to Remember
75. All reports / read queries should go here
76. Don't read here directly after a write
Sessions
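The replication advice above (send reads to the slaves, but not directly after a write) could be sketched in Python as a small connection router. The `ConnectionRouter` class, its `sticky_seconds` window, and the string connection names are assumptions for illustration; a real setup would size the window to the observed replication lag.

```python
import time


class ConnectionRouter:
    """Send writes to the master and reads to replicas, except just after a write."""

    def __init__(self, master, replicas, sticky_seconds=2.0, clock=time.monotonic):
        self.master = master
        self.replicas = list(replicas)
        self.sticky_seconds = sticky_seconds  # assumed upper bound on replication lag
        self.clock = clock
        self._last_write = {}  # session id -> time of that session's last write
        self._next = 0         # round-robin position over the replicas

    def for_write(self, session_id):
        self._last_write[session_id] = self.clock()
        return self.master

    def for_read(self, session_id):
        last = self._last_write.get(session_id)
        # A session that wrote recently reads from the master, because the
        # replicas may not have received the change yet.
        if last is not None and self.clock() - last < self.sticky_seconds:
            return self.master
        if not self.replicas:
            return self.master
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica
```

Tracking the last write per session keeps other users' reads on the replicas, so only the session that just wrote pays the cost of reading from the master.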
77. Cache Servers (not full page)
78. Caching
80. Not highly scalable, great for configuration files.
Distributed
81. Set up consistent hashing. Do not cache what cannot be re-created.
82. Caching
83. Start caching fetches; on write, invalidate the cache and write the new value, so reads always come from the cache.
84. Distributed Caching
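The consistent hashing recommended for distributed caches can be sketched in Python as a hash ring. This is a minimal illustration, not how any particular memcached client implements it; the `HashRing` class, the SHA-256 hash, and the 100 points per server are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Map cache keys to servers so that removing a server moves few keys."""

    def __init__(self, servers, points_per_server=100):
        ring = []
        for server in servers:
            # Several points per server spread the load more evenly.
            for i in range(points_per_server):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._ring = ring
        self._hashes = [point for point, _ in ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def server_for(self, key):
        # The first ring point after the key's hash owns it, wrapping at the end.
        index = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[index][1]
```

With this layout, dropping one server only remaps the keys that lived on it; keys owned by the remaining servers stay where they were, which avoids flushing the whole cache when the pool changes.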
85. Server depends on the hash.
86. Hint: use the memcached PECL extension.
87. The Read / Write Process
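The read and write flow described for caching (read from the cache, fall back to the database on a miss, refresh the cache on write) might look like the Python sketch below. Plain dicts stand in for both the database and memcached, and the `CachedStore` class is a hypothetical name for this example.

```python
class CachedStore:
    """Read through the cache; on write, invalidate and store the new value."""

    def __init__(self, database, cache):
        self.database = database  # dict standing in for the database
        self.cache = cache        # dict standing in for a memcached client
        self.db_reads = 0         # counts how often a read reached the database

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fetch from the database and remember the result.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        self.database[key] = value
        self.cache.pop(key, None)  # invalidate the stale entry
        self.cache[key] = value    # write the new cache entry


store = CachedStore({"a": 1}, {})
first = store.read("a")   # miss, goes to the database
second = store.read("a")  # hit, served from the cache
```

Because every cached value here can be rebuilt from the database, losing a cache server costs only extra database reads, never data.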
88. What We Need to Remember
89. Elasticity
Sessions
Memory Caches
90. Ensure dedicated memory!
91. If you run out of memory, does the cache evict an old entry to add the new one, or does it refuse new entries?
92. Job Servers
93. "Message queues and mailboxes are software-engineering components used for interprocess communication, or for inter-thread communication within the same process. They use a queue for messaging: the passing of control or of content." http://en.wikipedia.org/wiki/Message_queue
94. Messages are Everywhere
95. What are Message Queues
96. Asynchronous push / pull
97. An application framework for sending and receiving messages.
98. A way to communicate between applications / systems.
99. A way to decouple components.
100. A way to offload work.
101. Where We All Start
[Diagram: Producer (queue) -> Message Queue Server -> Consumer (receive)]
102. Distributed Job Servers
103. Can continue to create more workers
[Diagram: three Producers, three Message Queue Servers, five Consumers]
104. Why are Message Queues Useful?
105. Communication between Applications / Systems
106. Image Resizing
107. Video Processing
108. Sending out Emails
109. Auto-Scaling Virtual Instances
110. Log Analysis
111. The list goes on...
112. What We Need to Remember
113. You need to keep your workers running
Don't offload things just to offload
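The producer / queue / consumer pattern from the job server slides can be sketched in Python with an in-process queue and worker threads. The standard-library `queue.Queue` stands in for a real message queue server here, and `run_jobs`, `resize`, and the file names are hypothetical. Workers catch handler errors so that one bad job does not stop them, in the spirit of "keep your workers running".

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to shut down


def run_jobs(jobs, handler, worker_count=3):
    """Push jobs onto a queue and let a pool of workers consume them."""
    job_queue = queue.Queue()
    results, failures = [], []
    lock = threading.Lock()

    def worker():
        while True:
            job = job_queue.get()
            try:
                if job is STOP:
                    return
                outcome = handler(job)
                with lock:
                    results.append((job, outcome))
            except Exception as exc:
                # Record the failure and keep consuming instead of dying.
                with lock:
                    failures.append((job, exc))
            finally:
                job_queue.task_done()

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in workers:
        thread.start()
    for job in jobs:               # the producer side
        job_queue.put(job)
    for _ in workers:              # one stop signal per worker
        job_queue.put(STOP)
    job_queue.join()
    for thread in workers:
        thread.join()
    return results, failures


def resize(name):
    """Hypothetical job handler standing in for image resizing."""
    if name == "broken.png":
        raise ValueError("cannot resize")
    return name.replace(".png", "_thumb.png")


done, failed = run_jobs(["a.png", "broken.png", "b.png"], resize)
```

Adding capacity is then a matter of raising `worker_count`, which mirrors the slide's point that more workers can keep being created.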
114. DNS Servers
115. What to do
Anycast DNS
116. It's sexy, it's sweet and it is FAST!
117. A cheaper provider is DNS Made Easy.
118. What to look for...
119. Failover / Distributed
120. CNAME support
121. TXT support
122. Name Server support
123. CDN Servers
124. Why Use a CDN
125. Free your server from serving basic files
126. Distributed servers around the globe
127. What you need to know
PoP Pull
128. What's the best?
129. Origin Pull is great if you want to maintain all of the content on your web server.
130. PoP Push is great for storing things like user-generated content.
131. Front-End Performance
132. Discussion Points
133. CSS Sprites
134. GZIP
135. Cookies are evil
136. Parallel downloads (using subdomains for serving
137. HTTP Expires
138. Discussion Points
139. Firebug
140. Google Page Speed
141. Google Webmaster Tools
142. Mike Willbanks
Blog: http://blog.digitalstruct.com
Twitter: mwillbanks
IRC: lubs on freenode
Questions?