
Adapting to Network and Client Variation Using Active Proxies




Adapting to Network and Client Variation Using Active Proxies: Lessons and Perspectives

Armando Fox    Steven D. Gribble    Yatin Chawathe    Eric A. Brewer
{fox, gribble, yatin, brewer}@cs.berkeley.edu

Abstract

Today's Internet clients vary widely with respect to both hardware and software properties: screen size, color depth, effective bandwidth, processing power, and the ability to handle different data formats. The order-of-magnitude span of this variation is too large to hide at the network level, making application-level techniques necessary. We present evidence that on-the-fly adaptation by active transformational proxies is a widely applicable, cost-effective, and flexible technique for addressing all these types of variation. To support this claim, we describe our experience with datatype-specific distillation (lossy compression) in a variety of applications. We also argue that there are important advantages (such as incremental deployment and amortization of cost) to placing adaptation machinery in the network infrastructure rather than inserting it into end servers. To this end, we describe a programming model for large-scale interactive Internet services and a scalable cluster-based framework that has been in production use at UC Berkeley since April 1997. We present a detailed examination of TranSend, a scalable transformational Web proxy deployed on our cluster framework, and give descriptions of several handheld-device applications that push the proxy-adaptation philosophy to its limits.

1 Infrastructural On-the-Fly Adaptation Services

1.1 Heterogeneity: Thin Clients and Slow Networks

The current Internet infrastructure includes an extensive range and number of clients and servers. Clients vary along many axes, including screen size, color depth, effective bandwidth, processing power, and ability to handle specific data encodings, e.g., GIF, PostScript, or MPEG. As shown in Tables 1 and 2, each type of variation often spans orders of magnitude. High-volume devices such as smart phones [12] and smart two-way pagers will soon constitute an increasing fraction of Internet clients, making the variation even more pronounced.

These conditions make it difficult for servers to provide a level of service that is appropriate for every client. Application-level adaptation is required to provide a meaningful Internet experience across the range of client capabilities. Although we expect clients to improve over time, there will always be older systems still in use that represent relatively obsolete clients, and the high end will advance roughly in parallel with the low end, effectively maintaining a gap between the two: there will always be a large difference between the very best laptop and the very best smart phone.

Platform      SPEC92/Memory   Screen size   Bits/pixel
High-end PC   200/48M         1280x1024     24
Midrange PC   160/8M          1024x768      16
Laptop        110/16M         800x600       8
Typical PDA   low/2M          320x200       2

Table 1: Physical variation among clients

Network          Bandwidth (bits/s)   Round-trip time
Local Ethernet   10-100 M             0.5-2.0 ms
ISDN             128 K                10-20 ms
Wireline modem   14.4-28.8 K          350 ms
Cellular/CDPD    9.6-19.2 K           0.1-0.5 s

Table 2: Typical network variation

1.2 Approach: Infrastructural Proxy Services

We argue for a proxy-based approach to adaptation, in which proxy agents placed between clients and servers perform aggressive computation and storage on behalf of clients.


The proxy approach stands in contrast to the client-based approach, which attempts to bring all clients up to a least-common-denominator level of functionality (e.g., text-only, HTML-subset compatibility for thin-client Web browsers), and the server-based approach, which attempts to insert adaptation machinery at each end server. We believe the proxy approach directly confers three advantages over the client and server approaches.

- Leveraging the installed infrastructure. The enormous installed infrastructure, and its attendant base of existing content, is too valuable to waste. A compelling solution to the problem of client and network heterogeneity should allow interoperability with existing servers, thus enabling incremental deployment while content formats and protocols are tuned and standardized for different target platforms. A proxy-based approach lends itself naturally to the goal of transparent incremental deployment, since an application-level proxy appears as a server to existing clients and as a client to existing servers.

- Rapid prototyping during turbulent standardization cycles. Software development on "Internet time" does not allow for long deployment cycles. Proxy-based adaptation provides a smooth path for rapid prototyping of new services, formats, and protocols, which can be standardized later if the prototypes succeed.

- Economy of scale. Basic queueing theory shows that a large central (virtual) server is more efficient in both cost and utilization than a collection of smaller servers; standalone desktop systems represent the degenerate case of one "server" per user. This supports the argument for Network Computers [29] and suggests that collocating proxy services with infrastructural elements such as Internet points-of-presence (POPs) is one way to achieve effective economies of scale. (A concrete queueing comparison appears after this list.)

Large-scale network services remain difficult to deploy because of three fundamental challenges: scalability, availability, and cost effectiveness. By scalability, we mean that when the load offered to the service increases, an incremental and linear increase in hardware can maintain the same per-user level of service. By availability, we mean that the service as a whole must be available 24x7, despite transient partial hardware or software failures. By cost effectiveness, we mean that the service must be economical to administer and expand, even though it potentially comprises many workstation nodes.
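The economy-of-scale point can be made concrete with the textbook M/M/1 comparison sketched below in Python: k independent servers, each at a given utilization, versus one consolidated server that is k times faster and receives k times the load. The numbers are illustrative assumptions, not figures from the paper.

    # k separate M/M/1 queues versus one pooled queue with k times the
    # service rate and k times the arrival rate (same utilization).
    def mm1_response_time(arrival_rate, service_rate):
        assert arrival_rate < service_rate, "queue must be stable"
        return 1.0 / (service_rate - arrival_rate)   # mean time in system (s)

    k, lam, mu = 10, 8.0, 10.0                       # 10 small servers at 80% utilization
    separate = mm1_response_time(lam, mu)            # 0.5 s on each small server
    pooled   = mm1_response_time(k * lam, k * mu)    # 0.05 s on the consolidated server
    print(separate, pooled)

Pooling cuts mean response time by a factor of k at the same utilization, which is the sense in which the large virtual server is more efficient than many small ones.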

In section 3 we describe how we have addressed these challenges in our cluster-based proxy application server architecture.

1.3 Contributions and Map of Paper

In section 2 we describe our measurements and experience with datatype-specific distillation and refinement, a mechanism that has been central to our proxy-based approach to network and client adaptation. In section 3 we introduce a generalized "building block" programming model for designing and implementing adaptive applications, describe our implemented cluster-based application server that instantiates the model, and present detailed measurements of a particular production application: TranSend, a transformational Web proxy service. In section 4 we present case studies of other services we have built using our programming model, some of which are in daily use by thousands of users, including the Top Gun Wingman graphical Web browser for the USR PalmPilot handheld device. We discuss related work in section 5, and attempt to draw some lessons from our experience and guidelines for future research in section 6.

2 Adaptation via Datatype-Specific Distillation

We propose three design principles that we believe are fundamental for addressing client variation most effectively.

1. Adapt to client variation via datatype-specific lossy compression. Datatype-specific lossy compression mechanisms can achieve much better compression than "generic" compressors, because they can make intelligent decisions about what information to throw away based on the semantic type of the data. For example, lossy compression of an image requires discarding color information, high-frequency components, or pixel resolution. Lossy compression of video can additionally include frame rate reduction. Less obviously, lossy compression of formatted text requires discarding some formatting information but preserving the actual prose. In all cases, the goal is to preserve the information that has the highest semantic value.


We refer to this process generically as distillation. A distilled object allows the user to decide whether it is worth asking for a refinement: for instance, zooming in on a section of a graphic or video frame, or rendering a particular page containing PostScript text and figures without having to render the preceding pages.

2. Perform adaptation on the fly. To reap the maximum benefit from distillation and refinement, a distilled representation must target specific attributes of the client. Fortunately, it is often computationally feasible on today's hardware to compute a desired representation of a typed object on demand rather than relying on a set of precomputed representations. Our (non-intuitive) observation, resulting from the extensive implementation and measurement described in section 2.1, is that distillation time is small in practice and more than compensated by the resulting reduction in transmission time over low-bandwidth links. We have successfully implemented useful distillation "workers" that serve clients spanning an order of magnitude in each area of variation, and we have generalized our approach into a common framework, which we discuss in section 3.

3. Move complexity away from both clients and servers. Application partitioning arguments have long been used to keep clients simple (as in [39]). However, by performing adaptation at a shared infrastructural proxy, we can also realize incremental deployment through transparent interposition, and avoid inserting adaptation machinery at each server, thereby improving economy of scale. Therefore, on-demand distillation and refinement should be done at an intermediate proxy that has access to substantial computing resources and is well connected to the rest of the Internet.

Table 3 lists the "axes" of compression corresponding to three important datatypes: formatted text, images, and video streams. Of course there are limits to how severe a degradation of quality is possible before the source object becomes unrecognizable, but we have found that order-of-magnitude size reductions are often possible without significantly compromising semantic usefulness.

Semantic type   Specific encodings             Distillation axes or quality levels
Image           GIF, JPEG, PPM, PostScript     Resolution, color depth, color palette
Text            Plain, HTML, PostScript, PDF   Heavy formatting, simple markup, plain text
Video           NV, H.261, VQ, MPEG            Resolution, frame rate, color depth,
                                               progression limit (for progressive encodings)

Table 3: Three important types and the distillation axes corresponding to each

2.1 Performance of Distillation and Refinement On Demand

We now describe and evaluate datatype-specific distillers for images and rich text. (A distiller for real-time network video streams is described separately, in [1].) The goal of this section is to support our claim that in the majority of cases end-to-end latency is reduced by distillation; that is, the time to produce a useful distilled object on today's workstation hardware is small enough to be more than compensated by the savings in transmission time for the distilled object relative to the original.

2.1.1 Images

We have implemented an image distiller called gifmunch, which implements distillation and refinement for GIF [26] images and consists largely of source code from the NetPBM Toolkit [35]. Gifmunch makes simple predictions about the size of its output by measuring the achieved bits per pixel of the original image relative to a "raw" bitmap. Figure 1 shows the result of running gifmunch on a large color GIF image of the Berkeley Computer Science Division's home building, Soda Hall. The image of Figure 1a measures 320x200 pixels (about 1/8 the total area of the original 880x610) and uses 16 grays, making it suitable for display on a typical handheld device. Due to the degradation of quality, the writing on the building is unreadable, but the user can request a refinement of the subregion containing the writing, which can then be viewed at full resolution.

Figure 1: Distillation example. Left (a) is a distilled image of Soda Hall, and above (b) illustrates refinement. (a) occupies 17 KB at 320x200 pixels in 16 grays, compared with the 492 KB, 880x600 pixel, 249 color original (not shown). The refinement (b) occupies 12 KB. Distillation took 6 seconds on a SPARCstation 20/71, and refinement took less than a second.
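To make the two operations concrete, the following minimal sketch reproduces gifmunch-style distillation and refinement in Python using the Pillow imaging library rather than the paper's NetPBM-based code. The target width, gray depth, and crop box are hypothetical client parameters, not values from the deployed distiller.

    from PIL import Image

    def distill(src, dst, max_width=320, n_grays=16):
        img = Image.open(src).convert("L")           # discard color information
        if img.width > max_width:                    # discard pixel resolution
            h = max(1, img.height * max_width // img.width)
            img = img.resize((max_width, h))
        img.quantize(colors=n_grays).save(dst)       # shrink the gray palette

    def refine(src, dst, box):
        # Re-extract one subregion of the original at full resolution,
        # e.g. the part of the building whose lettering was unreadable.
        Image.open(src).crop(box).save(dst)

    # distill("soda-hall.gif", "soda-distilled.gif")
    # refine("soda-hall.gif", "soda-sign.gif", box=(400, 250, 650, 360))

Refinement simply re-crops the original, so nothing is lost by sending the distilled version first and fetching detail only where the user asks for it.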


Image distillation can be used to address all three areas of client variation:

- Network variation: The graphs in Figure 2 depict end-to-end client latency for retrieving the original and each of four distilled versions of a selection of GIF images: the top set of bars is for a cartoon found on a popular Web page, the middle set corresponds to a large photographic image, and the bottom to a computer-rendered image. The images were fetched using a 14.4 kb/s modem with standard compression (V.42bis and MNP-5) through the UC Berkeley PPP gateway, via a process that runs each image through gifmunch. Each bar is segmented to show the distillation latency and transmission latency separately. Clearly, even though distillation adds latency at the proxy, it can result in greatly reduced end-to-end latency. This supports design principle #2.

- Hardware variation: A "map to 16 grays" operation would be appropriate for PDA-class clients with shallow grayscale displays. We can identify this operation as an effective lossy compression technique precisely because we know we are operating on an image, regardless of the particular encoding, and the compression achieved is significantly better than the 2x-4x typically achieved by "generic" lossless compression (design principle #1).

- Software variation: Handheld devices such as the USR PalmPilot frequently have built-in support for proprietary image encodings only. The ability to convert to this format saves code space and latency on the client (design principle #3).
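A back-of-the-envelope check of the network-variation claim, using the Figure 1 numbers (492 KB original, 17 KB distilled, roughly 6 seconds to distill) and a 14.4 kb/s modem. Modem compression and protocol overhead are ignored, so this is an illustration of the break-even reasoning rather than a measurement from our testbed.

    def end_to_end_latency(size_kb, link_kbps, distill_s=0.0):
        return distill_s + (size_kb * 8) / link_kbps        # seconds

    original  = end_to_end_latency(492, 14.4)               # ~273 s, no distillation
    distilled = end_to_end_latency(17, 14.4, distill_s=6)   # ~15.4 s including distillation
    print(original, distilled)

Even with the 6-second distillation penalty, the distilled transfer completes more than an order of magnitude sooner, consistent with the Soda Hall bar in Figure 2 being off the scale.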

2.1.2 Rich Text

We have also implemented a rich-text distiller [34], which performs lossy compression of PostScript-encoded text. The distiller replaces PostScript formatting information with HTML markup tags or with a custom rich-text format that preserves the position information of the words. PostScript is an excellent target for a distiller because of its complexity and verbosity: both rendering and transmission are resource intensive. Table 4 compares the features available in each format.

Feature                HTML       Rich text   PostScript
Different fonts        Y          Y           Y
Bold and italics       Y          Y           Y
Preserves font size    headings   Y           Y
Preserves paragraphs   Y          Y           Y
Preserves layout       N          Y           Y
Handles equations      N          some        Y
Preserves tables       N          Y           Y
Preserves graphs       N          N           Y

Table 4: Features for PostScript distillation

Figure 3 shows the advantage of rich text over PostScript for screen viewing. As with image distillation, PostScript distillation yields advantages in all three categories of client variation:

- Network variation: Again, distillation reduces the required bandwidth and thus the end-to-end latency. We achieved an average size reduction of 5x when going from compressed PostScript to compressed HTML.


Figure 2: End-to-end latency for images with and without distillation. Each group of bars represents one image with 5 levels of distillation; the top bar represents no distillation at all. The y-axis number is the distilled size in kilobytes (so the top bar gives the original size). Note that two of the undistilled images are off the scale; the Soda Hall image is off by an order of magnitude.

Second, the pages of a PostScript document are pipelined through the distiller, so that the second page is distilled while the user views the first page. In practice, users only experience the latency of the first page, so the difference in perceived latency is about 8x for a 28.8K modem. Distillation typically took about 5 seconds for the first page and about 2 seconds for subsequent pages.

- Hardware variation: Distillation reduces decoding time by delivering data in an easy-to-parse format, and results in better-looking documents on clients with lower quality displays.

- Software variation: PostScript distillation allows clients that do not directly support PostScript, such as handhelds or Network Computers, to view these documents in HTML or our rich-text format. The rich-text viewer could be an external viewer similar to ghostscript, a module for a Java-capable browser, or a browser plug-in rendering module.

Figure 3: Screen snapshots of our rich-text viewer (top) versus ghostview (bottom). The rich text is easier to read because it uses screen fonts.

Overall, rich-text distillation reduces end-to-end latency, results in more readable presentation, and adds new abilities to low-end clients, such as PostScript viewing. The latency for the appearance of the first page was reduced by an average of 8x using the proxy and PostScript distiller. Both HTML and our rich-text format are significantly easier to read on screen than rendered PostScript, although they sacrifice some layout and graphics accuracy compared to the original PostScript.

2.2 Summary

High client variability is an area of increasing concern that existing servers do not handle well. We have proposed three design principles we believe to be fundamental to addressing variation:

- Datatype-specific distillation and refinement achieve better compression than does lossless compression, while retaining useful semantic content and allowing network resources to be managed at the application level.

- On-demand distillation and refinement reduce the end-to-end latency perceived by the client (sometimes by almost an order of magnitude), are more flexible than reliance on precomputed static representations, and give low-end clients new abilities such as PostScript viewing.


- Performing distillation and refinement in the network infrastructure rather than at the endpoints separates technical as well as economic concerns of clients and servers.

3 Scalable Internet Application Servers

In this section, we focus on the problem of deploying adaptation-based proxy services to large communities (tens of thousands of users, representative of the subscription size of a medium-sized ISP). In particular, we discuss a cluster-friendly programming model for building interactive and adaptive Internet services, and measurements of our implemented prototype of a scalable, cluster-based server that instantiates the model. Our framework reflects the implementation of three real services in use today: TranSend, a scalable transformation proxy for the 25,000 UC Berkeley dialup users (connecting through a bank of 600 modems); Top Gun Wingman, the only graphical Web browser for the USR PalmPilot handheld device; and the Inktomi search engine (commercialized as HotBot), which performs millions of queries per day against a database of over 50 million Web pages. Although HotBot does not demonstrate client adaptation, we use it to validate particular design decisions in the implementation of our server platform, since it pioneered many of the cluster-based scalability techniques generalized in our scalable server prototype.

We focus our detailed discussion and measurements on TranSend, a transformational proxy service that performs on-the-fly lossy image compression. TranSend applies the ideas explored in the preceding section to the World Wide Web.

3.1 TACC: A Programming Model for Internet Services

We focus on a particular subset of Internet services, based on transformation (distillation, filtering, format conversion, etc.), aggregation (collecting and collating data from various sources, as search engines do), caching (of both original and transformed content), and customization (maintenance of a per-user preferences database that allows workers to tailor their output to the user's needs or device characteristics). We refer to this model as TACC, from the initials of the four elements above. In the TACC model, applications are built from building blocks interconnected with simple APIs.

Each building block, or worker, specializes in a particular task, for example, scaling/dithering of images in a particular format, conversion between specific data formats, extracting "landmark" information from specific Web pages, etc. Complete applications are built by composing workers; roughly speaking, one worker can chain to another (similar to processes in a Unix pipeline), or a worker can call another as a subroutine or coroutine. This model of composition results in a very general programming model that subsumes transformation proxies [16], proxy filters [42], customized information aggregators, and search engines.

A TACC server is a platform that instantiates TACC workers, provides dispatch rules for routing network data traffic to and from them, and provides support for the inter-worker calling and chaining APIs. Similar to a Unix shell, a TACC server provides the mechanisms that insulate workers from having to deal directly with low-level concerns such as data routing and exception handling, and gives workers a clean set of APIs for communicating with each other, the caches, and the customization database (described below). We describe our prototype implementation of a scalable, commodity-PC, cluster-based TACC server later in this section.
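The building-block idea can be sketched in a few lines of Python: workers share a narrow calling convention and are selected and chained by dispatch rules, here keyed by MIME type. The Worker signature and the dispatch table are illustrative assumptions, not the actual Berkeley TACC API.

    from typing import Callable, Dict, List

    Worker = Callable[[bytes, dict], bytes]   # (input data, user profile) -> output data

    def html_layout(data: bytes, profile: dict) -> bytes:
        return data          # stub: simplify HTML for a thin client

    def image_distill(data: bytes, profile: dict) -> bytes:
        return data          # stub: scale/dither an image for the client's screen

    DISPATCH: Dict[str, List[Worker]] = {
        "text/html":  [html_layout],
        "image/gif":  [image_distill],
        "image/jpeg": [image_distill],
    }

    def serve(mime_type: str, data: bytes, profile: dict) -> bytes:
        # Chain the workers registered for this type; unsupported types pass through.
        for worker in DISPATCH.get(mime_type, []):
            data = worker(data, profile)
        return data

Chaining corresponds to the pipeline composition described above; a worker calling serve() on an intermediate result would correspond to the subroutine/coroutine style.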


3.2 Cluster-Based TACC Server Architecture

We observe that clusters of workstations have some fundamental properties that can be exploited to meet the requirements of large-scale network services (scalability, high availability, and cost effectiveness). Using commodity PCs as the unit of scaling allows the service to ride the leading edge of the cost/performance curve; the inherent redundancy of clusters can be used to mask transient failures; and "embarrassingly parallel" network service workloads map well onto networks of commodity workstations.

However, developing cluster software and administering a running cluster remain complex. A primary contribution of our work is the design, analysis, and implementation of a layered framework for building adaptive network services that addresses this complexity while realizing the sought-after economies of scale. New services can use this framework as an off-the-shelf solution to scalability, availability, and several other problems, and focus instead on the content of the service being developed.

We now describe our proposed system architecture and service-programming model for building scalable TACC servers using clusters of PCs. The architecture attempts to address the challenges of cluster computing (unwieldy administration, managing partial failures, and the lack of shared state across components) while exploiting its strengths (support for incremental scalability, high availability through redundancy, and the ability to use commodity building blocks). A more detailed discussion of the architecture can be found in [24].

The goal of our architecture is to separate the content of network services (i.e., what the services do) from their implementation, by encapsulating the "scalable network service" (SNS) requirements of high availability, scalability, and fault tolerance in a reusable layer with narrow interfaces. Application writers program to the TACC APIs alluded to in the previous section, without regard to the underlying TACC server implementation; the resulting TACC applications automatically receive the benefits of linear scaling, high availability, and failure management when run on our cluster-based TACC server.

Figure 4: Architecture of a cluster-based TACC server. Components include front ends (FE), a pool of TACC workers (W), some of which may be caches ($), a user profile database, a graphical monitor, and a fault-tolerant load manager, whose functionality logically extends into the manager stubs (MS) and worker stubs (WS).

The software-component block diagram of a scalable TACC server is shown in Figure 4. Each physical workstation in a network of workstations (NOW [2]) supports one or more of the software components in the figure, but each component in the diagram is confined to one node. In general, the components whose tasks are naturally parallelizable are replicated for scalability, fault tolerance, or both.

Front Ends provide the interface to the TACC server as seen by the outside world (e.g., an HTTP server). They "shepherd" incoming requests by matching them up with the appropriate user profile from the customization database and queueing them for service by one or more workers. Front ends maximize system throughput by maintaining state for many simultaneous outstanding requests, and can be replicated for both scalability and availability.

The Worker Pool consists of caches (currently Harvest [8]) and service-specific modules that implement the actual service (data transformation/filtering, content aggregation, etc.). Each type of module may be instantiated zero or more times, depending on offered load. The TACC API allows all cache workers to be managed as a single virtual cache, by providing URL hashing, automatic failover, and dynamic growth of the cache pool.

The Customization Database stores user profiles that allow mass customization of request processing.

The Manager balances load across workers and spawns additional workers as offered load fluctuates or faults occur. When necessary, it may assign work to machines in the overflow pool, a set of backup machines (perhaps on desktops) that can be harnessed to handle load bursts and provide a smooth transition during incremental growth.

The Load Balancing/Fault Tolerance Manager keeps track of which workers are running where, autostarts new workers as needed, and balances load across workers. Its detailed operation is described in section 3.3, in the context of the TranSend implementation. Although it is a centralized agent, [24] describes the various mechanisms, including multicast heartbeat and process-peer fault tolerance, that keep this and other system components running and allow the system to survive transient component failures.

The Graphical Monitor for system management supports asynchronous error notification via email or pager, temporary disabling of system components for hot upgrades, and visualization of the system's behavior using Tcl/Tk [33]. The benefits of visualization are not just cosmetic: by looking at the visualization panel we can immediately see what state the system as a whole is in, whether any component is currently causing a bottleneck (such as cache-miss time, distillation queueing delay, or the interconnect), what resources the system is using, and similar behaviors of interest.

The System-Area Network provides a low-latency, high-bandwidth, scalable interconnect, such as switched 100 Mb/s Ethernet or Myrinet [32]. Its main goal is to prevent the interconnect from becoming the bottleneck as the system scales.
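The worker pool's "single virtual cache" is easy to see in a small sketch: hash the URL to pick a cache partition, and probe past dead partitions for automatic failover. The Partition class and the probing policy below are invented for illustration; the production system manages Harvest caches [8] through the TACC API.

    import hashlib
    from dataclasses import dataclass

    @dataclass
    class Partition:
        name: str
        alive: bool = True

    def pick_partition(url: str, partitions: list) -> Partition:
        h = int(hashlib.md5(url.encode()).hexdigest(), 16)
        n = len(partitions)
        for probe in range(n):                  # step past dead partitions (failover)
            p = partitions[(h + probe) % n]
            if p.alive:
                return p
        raise RuntimeError("no live cache partitions")

    caches = [Partition("cache0"), Partition("cache1", alive=False), Partition("cache2")]
    print(pick_partition("http://www.cs.berkeley.edu/index.html", caches).name)

Growing the pool simply changes the modulus, and a dead node's URLs spill over to its neighbors until it recovers.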


Components in our TACC server architecture may be replicated for fault tolerance or high availability, but we also use replication to achieve scalability. When the offered load to the system saturates the capacity of some component class, more instances of that component can be launched on incrementally added nodes. The duties of our replicated components are largely independent of each other (because of the nature of the Internet services' workload), which means the amount of additional resources required is a linear function of the increase in offered load.

3.3 Analysis of the TranSend Implementation

TranSend [16], a TACC reimplementation of our earlier prototype called Pythia [21], performs lossy Web image compression on the fly. Each TranSend worker handles compression or markup for a specific MIME type; objects of unsupported types are passed through to the user unaltered. (The PostScript-to-rich-text worker described in section 2 has not yet been added to TranSend.)

We took measurements of TranSend using a cluster of 15 Sun SPARC Ultra-1 workstations connected by 100 Mb/s switched Ethernet and isolated from external load or network traffic. For measurements requiring Internet access, the access was via a 10 Mb/s switched Ethernet network connecting our workstation to the outside world. Many of the performance tests are based on HTTP trace data from the 25,000 UC Berkeley dialup IP users [28], played back using a high-performance playback engine of our own design that can either generate requests at a constant rate or faithfully play back a trace according to the timestamps in the trace file. In the following subsections we report on experiments that stress TranSend's fault tolerance, responsiveness, and scalability.

3.3.1 Self-Tuning and Load Balancing

As mentioned previously, the load balancing and fault tolerance manager is charged with spawning and reaping workers and distributing internal load across them. The mechanisms by which this is accomplished, which include monitoring worker queue lengths and applying some simple hysteresis, are described in [24].
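A toy version of that spawn/reap policy: the manager watches average worker queue length and requires the overload (or underload) condition to persist before acting, which is the hysteresis that prevents thrashing on short bursts. The Worker and Cluster classes and the thresholds are invented for illustration; the real mechanism is described in [24].

    from dataclasses import dataclass, field

    @dataclass
    class Worker:
        queue_length: int = 0

    @dataclass
    class Cluster:
        workers: list = field(default_factory=lambda: [Worker()])
        def spawn_worker(self):
            self.workers.append(Worker())
        def reap_worker(self):
            self.workers.pop()

    HIGH_WATER, LOW_WATER, SUSTAIN = 15, 2, 5   # queue-length thresholds, observations

    def balance(cluster, history):
        avg = sum(w.queue_length for w in cluster.workers) / len(cluster.workers)
        history.append(avg)
        recent = history[-SUSTAIN:]
        if len(recent) == SUSTAIN and min(recent) > HIGH_WATER:
            cluster.spawn_worker()      # sustained overload: add a worker
            history.clear()
        elif len(recent) == SUSTAIN and max(recent) < LOW_WATER and len(cluster.workers) > 1:
            cluster.reap_worker()       # sustained underload: reclaim a node
            history.clear()

Calling balance() periodically reproduces the qualitative behavior of Figure 5: queues grow under rising load until a new worker is spawned, and killed workers are replaced as their load shifts to the survivors.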


Figure 5: Worker queue lengths observed over time as the load presented to the system fluctuates and as workers are manually brought down.

Figure 5 shows the variation in worker queue lengths over time. The system was bootstrapped with one front end, the manager, and a single demand-spawned worker. Continuously increasing the load caused the manager to spawn a second and later a third worker. We then manually killed the first two workers; the sudden load increase on the remaining worker caused the manager to spawn one and later another new worker, to stabilize the queue lengths.

3.3.2 Scalability

To demonstrate the scalability of the system, we performed the following experiment:

1. We began with a minimal instance of the system: one front end, one worker, the manager, and a fixed number of cache partitions. (Since for these experiments we repeatedly requested the same subset of images, the cache was effectively not tested.)

2. We increased the offered load until some system component saturated (e.g., worker queues growing too long, front ends no longer accepting additional connections, etc.).

3. We then added more resources to the system to eliminate this saturation (in many cases the system does this automatically, as when it recruits overflow nodes to run more workers), and we recorded the amount of resources added as a function of the increase in offered load, measured in requests per second.

4. We continued until the saturated resource could not be replenished (i.e., we ran out of hardware), or until adding more of the saturated resource no longer resulted in a linear or close-to-linear improvement in performance.


Requests/second   # FEs   # Workers   Element that saturated
0-24              1       1           workers
25-47             1       2           workers
48-72             1       3           workers
73-87             1       4           FE Ethernet
88-91             2       4           workers
92-112            2       5           workers
113-135           2       6           workers + FE Ethernet
136-159           3       7           workers

Table 5: Results of the scalability experiment. "FE" refers to front end.

Table 5 presents the results of this experiment. At 24 requests per second, as the offered load exceeded the capacity of the single available worker, the manager automatically spawned one additional worker, and then subsequent workers as necessary. At 87 requests per second, the Ethernet segment leading into the front end saturated, requiring a new front end to be spawned. We were unable to test the system at rates higher than 159 requests per second, as all of our cluster's machines were hosting workers, front ends, or playback engines. We did observe nearly perfectly linear growth of the system over the scaled range: a worker can handle approximately 23 requests per second, and a 100 Mb/s Ethernet segment into a front end can handle approximately 70 requests per second. We were unable to saturate the front end or the cache partitions, or to fully saturate the interior SAN, during this experiment. We draw two conclusions from this result:

- Even with a commodity 100 Mb/s SAN, linear scaling is limited primarily by bandwidth into the system rather than bandwidth inside the system.

- Although we run TranSend on four SPARC 10's, a single Ultra-1-class machine would suffice to serve the entire dialup IP population of UC Berkeley (25,000 users officially, over 8000 of whom surfed during the trace).
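The linear-scaling numbers imply a simple provisioning rule of thumb: roughly 23 requests per second per worker and 70 requests per second per front-end Ethernet segment. The arithmetic below just restates Table 5; it is not an additional measurement.

    import math

    def provision(target_rps, worker_rps=23, frontend_rps=70):
        return {"workers":    math.ceil(target_rps / worker_rps),
                "front_ends": math.ceil(target_rps / frontend_rps)}

    print(provision(159))   # {'workers': 7, 'front_ends': 3}, matching Table 5's last row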

4 Other TACC Applications

We now discuss several examples of new services in various stages of deployment, showing how each maps onto the TACC model and discussing some of our experiences with the applications. Rather than providing detailed measurements as we did for TranSend in the previous section, the goal here is to demonstrate the flexibility of the TACC framework in accommodating an interesting range of applications, while providing consistent guidelines for approaching application partitioning decisions.

We restrict our discussion to services that can be implemented using the proxy model, i.e., transparent interposition of computation between clients and servers. (Some of our services do not communicate via HTTP but are conceptually similar.) Also, although we have developed a wider range of applications using the TACC model as part of a graduate seminar [20], we concentrate on those applications that enable adaptation to network and client variation. These services share the following common characteristics, which make them amenable to implementation on our cluster-based framework:

- Compute-intensive transformation or aggregation.
- Computation that is parallelizable with a granularity of a few CPU seconds.
- Substantial value added by mass customization.

4.1 TranSend as a TACC Application

TranSend is one of the simplest TACC applications we have produced. The dispatch rules simply match the MIME type of the object returned from the origin server against the list of known workers, which (as in all TACC applications) can be updated dynamically. In particular, TranSend does not exploit TACC's ability to compose workers by chaining them into a "pipeline" or having one worker call others as coroutines.

Transformed objects are stored in the cache with "fat URLs" that encode a subset of the transformation parameters, saving the work of re-transforming an original should another user ask for the same degraded version later. Each user can select a desired level of aggressiveness for the lossy compression and turn header sanitizing on and off.

The main difference between TranSend and commercial products based on its ideas (such as Intel's QuickWeb [10], currently in market test) is extensibility: adding support for new datatypes to TranSend is as simple as adding a new worker, and composing workers is as simple as modifying the dispatch rules (or modifying existing workers to hint to the TACC server that they should fall through to new workers).
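The "fat URL" caching mentioned above amounts to folding the transformation parameters into the cache key, so a later request for the same degraded version finds the already-distilled variant. The key format and parameter names below are illustrative, not TranSend's actual encoding.

    from urllib.parse import urlencode

    def fat_url(original_url: str, params: dict) -> str:
        # e.g. params = {"colors": 16, "scale": 0.5, "quality": "med"}
        return original_url + "#distilled?" + urlencode(sorted(params.items()))

    key = fat_url("http://www.cs.berkeley.edu/soda-hall.gif",
                  {"colors": 16, "scale": 0.5})
    # Cache lookups on `key` hit only if the same degraded variant was produced before.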


In fact, we have generalized TranSend into a "lazy fixations" system [22] in which users could select from among a variety of available formats for viewing an object; this was implemented by a "graph search" worker that treated all the transformation workers as edges in a directed graph and performed a shortest-paths search to determine what sequence of workers should be run to satisfy a particular request.

One of the goals of TACC is to exploit modularity and composition to make new services easy to prototype by reusing existing building blocks. TranSend's HTML and JPEG workers consist almost entirely of off-the-shelf code, and each took an afternoon to write. A pair of anecdotes illustrates the flexibility of the TACC APIs in constructing responsive services. Our original HTML parser was a fast C-language implementation from the W3C. Debugging the pathological cases for this parser was spread out over a period of days: since our prototype TACC server masks transient faults by bypassing original content "around" the faulting worker, we could only deduce the existence of bugs by noticing (on the Graphical Monitor display) that the HTML worker had been restarted several times over a period of hours, although the service as a whole was continuously available.

We later wrote a much slower but more robust parser in Perl to handle proprietary HTML extensions such as inline JavaScript. All HTML pages are initially passed to the slower Perl parser, but if it believes (based on page length and tag density) that processing the page will introduce a delay longer than one or two seconds, it immediately throws an exception and indicates that the C parser should take over. Because the majority of long pages tend to be papers published in HTML rather than complex pages with weird tags, this scheme exploits TACC composition and dispatch to handle common cases well while keeping HTML processing latency barely noticeable.
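The two-parser arrangement can be sketched as follows: the robust parser estimates its own cost from page length and tag density and raises an exception when the estimate exceeds a threshold, letting the fast parser take over. The cost model, the threshold, and both parser stubs are invented for illustration; in the deployed service the robust parser is a Perl worker and the fast one is the W3C C parser.

    class TooSlow(Exception):
        pass

    def slow_but_robust_parser(html: str):
        return ("robust-parse", html)     # stand-in for the Perl worker's output

    def fast_c_parser(html: str):
        return ("fast-parse", html)       # stand-in for the C parser's output

    def robust_parse(html: str):
        est_seconds = len(html) / 50_000 + html.count("<") / 5_000   # crude cost model
        if est_seconds > 1.5:
            raise TooSlow                 # hint that dispatch should fall through
        return slow_but_robust_parser(html)

    def parse(html: str):
        try:
            return robust_parse(html)
        except TooSlow:
            return fast_c_parser(html)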

4.2 Top Gun Wingman

Top Gun Wingman is the only graphical Web browser available for the USR PalmPilot, a typical "thin client" device. Based on file downloads, we estimate our current user base at about 8000, all of whom are using the experimental cluster at UC Berkeley. Figure 6 shows a screenshot of the browser.

Figure 6: Screenshot of the Top Gun Wingman browser

Previous attempts to provide graphical Web browsing on such small devices have foundered on the severe limitations imposed by small screens, limited computing capability, and austere programming environments, and virtually all have fallen back to simple text-only browsing. Our adaptation approach, combined with the composable-workers model provided by TACC, allows us to approach this problem from a different perspective. The core of Top Gun Wingman consists of three TACC workers: HTML layout, image conversion, and conversion from an intermediate-form layout to a device-specific data format. These three workers address the three areas of variation introduced in section 2:

- Hardware and software adaptation: We have built TACC workers that output simplified binary markup and scaled-down images ready to be "spoon fed" to a thin-client device, given knowledge of the client's screen dimensions, image format, and font metrics. This greatly simplifies client-side code, since no HTML parsing, layout, or image processing is necessary; as a side benefit, the smaller and more efficient data representation reduces transmission time to the client. The image worker delivers 2-bit-per-pixel images, since that is what the PalmPilot hardware supports, and the HTML parsing and layout worker ensures that no page description larger than about 32 KB is delivered to the client, since that is the approximate heap space limit imposed by the PalmPilot's programming environment.


We have also added three "software upgrades" at the proxy since Wingman was first deployed: a worker that delivers data in AportisDoc [30] format (a popular PalmPilot e-book format), a worker that extracts and displays the contents of software archives for download directly to the PalmPilot, and an improved image-processing module contributed by a senior graphics hacker. In terms of code footprint, Wingman weighs in at 40 KB of code, compared with 74 KB and 109 KB for HandWeb and Palmscape 5.0 respectively, neither of which currently supports image viewing.

- Network protocol adaptation: In addition to delivering data in a more compact format and exploiting datatype-specific distillation, we have replaced HTTP with a simpler, datagram-oriented protocol based on Application Level Framing [9]. The combined effect of these optimizations makes Wingman two to four times faster than a desktop browser loading the same Web pages over the same bandwidth, and Wingman's performance on text-only pages often exceeds that of HTML/HTTP-compliant browsers on the same platform, especially on slow (< 56 kb/s) links.
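A sketch of what the Wingman image worker does for the client: scale the image down and deliver it already packed at the PalmPilot's native 2 bits per pixel, so the handheld performs no image processing at all. The use of Pillow and the packing order are illustrative assumptions; the actual Wingman wire format is not reproduced here.

    from PIL import Image

    def to_2bpp(src_path, max_width=160):
        img = Image.open(src_path).convert("L")
        if img.width > max_width:
            img = img.resize((max_width, max(1, img.height * max_width // img.width)))
        pixels = [p >> 6 for p in img.getdata()]        # 256 gray levels -> 4 levels
        pixels += [0] * (-len(pixels) % 4)              # pad to a multiple of 4
        packed = bytearray()
        for i in range(0, len(pixels), 4):              # pack 4 pixels per byte
            a, b, c, d = pixels[i:i + 4]
            packed.append((a << 6) | (b << 4) | (c << 2) | d)
        return img.width, img.height, bytes(packed)

Doing this at the proxy is what keeps the client's code footprint small while still supporting images.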

4.3 Top Gun MediaBoard

Top Gun MediaBoard is an electronic shared whiteboard application for the PalmPilot. It is a derivative of the desktop mediaboard application, which uses SRM (Scalable Reliable Multicast) as the underlying communication protocol. A reliable multicast proxy (RMX) TACC worker participates in the SRM session on behalf of the PDA clients, performing four main types of client adaptation:

- Transport protocol conversion: The PalmPilot's network stack does not support IP multicast. The RMX converts the SRM data into a unicast TCP stream that the client can handle.

- Application protocol adaptation: To keep the client implementation simple, all the complexities of the mediaboard command protocol are handled by the RMX. The protocol adapter transforms the entire sequence of mediaboard commands into a "pseudo-canvas" by executing each command and storing its result in the canvas, transmitting only a sequence of simple draw-ops to the client. The protocol and data format for transmitting the draw-ops are a direct extension of the Top Gun Wingman datagram protocol.

- On-demand distillation: The RMX converts specific data objects according to the client's needs. For example, it transforms the GIF and JPEG images that may be placed on the mediaboard into simpler image representations that the PalmPilot can understand, using the same worker that is part of Wingman. The client application can refine (zoom in on) specific portions of the canvas.

- Intelligent rate limiting: Since the proxy has complete knowledge of the client's state, the RMX can perform intelligent forwarding of data from the mediaboard session to the client. By eliminating redundant draw-ops (for example, a create followed by a delete on the same object) before sending data to the client, the RMX reduces the number of bytes that must be sent over the low-bandwidth link. Moreover, although a whiteboard session can consist of a number of distinct pages, the RMX forwards only the data associated with the page currently being viewed on the client.

Top Gun MediaBoard is in pre-alpha use at UC Berkeley, and performs satisfactorily even over slow links such as the Metricom Ricochet wireless packet radio modem [11].

4.4 Charon: Indirect Authentication for Thin Clients

Although not yet rewritten as a TACC application, Charon [23] illustrates a similar use of adaptation by proxy, for performing indirect authentication. In particular, Charon mediates between thin clients and a Kerberos [38] infrastructure. Charon is necessary because, as we describe in [23], the computing resources required for a direct port of Kerberos to thin clients are forbidding. With Charon, Kerberos can be used both to authenticate clients to the proxy service and to authenticate the proxied clients to Kerberized servers. Charon relieves the client of a significant amount of Kerberos protocol processing, while limiting the amount of trust that must be placed in the proxy; in particular, if the proxy is compromised, existing user sessions may be hijacked but no new sessions can be initiated, since new sessions require cooperation between the client and the proxy. Our Charon prototype client for the Sony MagicLink [13], a then-popular PDA, had a client footprint of only 45 KB, including stack and heap usage.


5 Related Work

At the network level, various approaches have been used to shield clients from the effects of poor (especially wireless) networks [17, 4]. At the application level, data transformation by proxy interposition has become particularly popular for HTTP, whose proxy mechanism was originally intended for users behind security firewalls. The mechanism has been harnessed for anonymization [7], Kanji transcoding [36, 40], application-specific stream transformation [6], and personalized "associates" for Web browsing [5, 37]. Some projects provide an integrated solution with both network-level and application-level mechanisms [18, 19, 42], though none proposes a uniform application-development model analogous to TACC.

Rover [15], Coda [31], and Wit [39] differ in their respective approaches to partitioning applications between a thin or poorly connected client and a more powerful server. In particular, Rover and Coda provide explicit support for disconnected operation, unlike our TACC work. We find that Rover's application-specific, toolkit-based approach is a particularly good complement to our own; although the TACC model provides a reasonable set of guidelines for thinking about partitioning (leave the client to do what it does well, and move as much as possible of the remaining functionality to the back end), we are working on integrating Rover into TACC to provide a rich abstraction for dealing with disconnection in TACC applications.

SmartClients [41] and SWEB++ [3] have exploited the extensibility of client browsers via Java and JavaScript to enhance the scalability of network-based services by dividing labor between the client and server. We note that our system does not preclude, and in fact benefits from, exploiting intelligence and computational resources at the client; we discuss various approaches we have tried in [25].

6 Lessons and Conclusions

We proposed three design principles for adapting to network and client variation and delivering a meaningful Internet experience to impoverished clients: datatype-specific distillation and refinement, adaptation on demand, and moving complexity into the infrastructure. We also offered a high-level description of the TACC programming model (transformation, aggregation, caching, customization) that we have evolved for building adaptive applications, and presented measurements of our scalable, highly available, cluster-based TACC server architecture, focusing on the TranSend Web accelerator application.

Finally, we described other applications we have built that are in daily use, including some that push the limits of client adaptation (such as Top Gun Wingman and Top Gun MediaBoard). In this section we try to draw some lessons from building these and similar applications and from experimenting with our framework.

Aggressively pushing the adaptation-by-proxy model to its limits, as we have tried to do with Top Gun Wingman and Top Gun MediaBoard, has helped us validate the proxy-interposition approach for serving thin clients. Our variation on the theme of application partitioning has been to split the application between the client and the proxy, rather than between the client and the server. This has allowed our clients to access existing content with no server modifications. Our guideline for partitioning applications has been to allow the client to perform those tasks it does well in native code, and to move as much as possible of the remaining work to the proxy. For example, since most thin clients support some form of toolkit for building graphical interfaces, sending HTML markup is too cumbersome for the client, but sending screen-sized bitmaps is unnecessarily cumbersome for the proxy.

A frequent objection raised against our partitioning approach is that it requires the proxy service to be available at all times, which is more difficult than simply maintaining the reliability of a bank of modems and routers. This observation motivated our work on the cluster-based scalable and highly available server platform described in section 3, and in fact the TranSend and Wingman proxy services have been running for several months at UC Berkeley with high stability, except for a two-week period in February 1998 when the cluster was affected by an OS upgrade. Other than one part-time undergraduate assistant, the cluster manages itself, yet thousands of users have come to rely on its stability for using Top Gun Wingman, validating the efficacy of our cluster platform. This observation, combined with the current trends toward massive cluster-based applications such as HotBot [14], suggests to us that this style of proxy-based adaptation will be of major importance in serving convergent "smart phone"-like devices.


7 Acknowledgments

This project has benefited from the detailed and perceptive comments of countless anonymous reviewers, users, and collaborators. Ken Lutz and Eric Fraser configured and administered the test network on which the TranSend scaling experiments were performed. Cliff Frost of the UC Berkeley Data Communications and Networks Services group allowed us to collect traces on the Berkeley dialup IP network and has worked with us to deploy and promote TranSend within UC Berkeley. Undergraduate researchers Anthony Polito, Benjamin Ling, Andrew Huang, David Lee, and Tim Kimball helped implement various parts of TranSend and Top Gun Wingman. Ian Goldberg and David Wagner helped us debug TranSend, especially through their implementation of the Anonymous Rewebber [27], and Ian implemented major parts of the client side of Top Gun Wingman, especially the 2-bit-per-pixel hacks. Paul Haeberli of Silicon Graphics contributed image processing code for Top Gun Wingman. Murray Mazer at the Open Group Research Institute has provided much useful insight on the structure of Internet applications and future extensions of this work. We also thank the patient students of UC Berkeley Computer Science 294-6, Internet Services, Fall 1997, for being the first real outside developers on our TACC platform and greatly improving the quality of the software and documentation.

We have received much valuable feedback from our UC Berkeley colleagues, especially David Culler, Eric Anderson, Trevor Pering, Hari Balakrishnan, Mark Stemm, and Randy Katz. This research is supported by DARPA contracts #DAAB07-95-C-D154 and #J-FBI-93-153, the California MICRO program, the UC Berkeley Chancellor's Opportunity Fellowship, the NSERC PGS-A fellowship, Hughes Aircraft Corp., and Metricom Inc.

References

[1] Elan Amir, Steve McCanne, and Hui Zhang. An application level video gateway. In Proceedings of ACM Multimedia 1995, 1995. Available at http://http.cs.berkeley.edu/~elan/articles/pub/vgw.ps.

[2] Thomas E. Anderson, David E. Culler, and David Patterson. A case for NOW (networks of workstations). IEEE Micro, 12(1):54-64, February 1995.

[3] D. Andresen, T. Yang, O. Egecioglu, O. H. Ibarra, and T. R. Smith. Scalability issues for high performance digital libraries on the World Wide Web. In Proceedings of IEEE ADL '96, Forum on Research and Technology Advances in Digital Libraries, Washington D.C., May 1996.

[4] H. Balakrishnan, S. Seshan, E. Amir, and R. Katz. Improving TCP/IP performance over wireless networks. In Proc. of the 1st ACM Conference on Mobile Computing and Networking (MOBICOM '95), Berkeley, CA, November 1995.

[5] Rob Barrett, Paul P. Maglio, and Daniel C. Kellem. How to personalize the Web, 1996. WBI, developed at IBM Almaden; see http://www.raleigh.ibm.com/wbi/wbisoft.htm.

[6] Charles Brooks, Murray S. Mazer, Scott Meeks, and Jim Miller. Application-specific proxy servers as HTTP stream transducers. In Proceedings of the Fourth International World Wide Web Conference, December 1995.

[7] C2Net. Web anonymizer.

[8] A. Chankhunthod, P. B. Danzig, C. Neerdaels, M. F. Schwartz, and K. J. Worrell. A hierarchical Internet object cache. In Proceedings of the 1996 USENIX Annual Technical Conference, pages 153-163, January 1996.

[9] D. D. Clark and D. L. Tennenhouse. Architectural considerations for a new generation of protocols. Computer Communication Review, 20(4):200-208, September 1990.

[10] Intel Corp. QuickWeb Web accelerator.

[11] Metricom Corp. Ricochet wireless modem.

[12] Nokia Corp. and Geoworks. Nokia 9000 Communicator. http://www.geoworks.com/htmpages/9000.htm.

[13] Sony Corp. Sony MagicLink PDA. http://www.sel.sony.com/SEL/Magic/.

[14] Inktomi Corporation. The HotBot search engine.

[15] Anthony D. Joseph et al. Rover: A toolkit for mobile information access. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, Copper Mountain Resort, CO, USA, December 1995.

[16] Armando Fox et al. TranSend Web accelerator proxy. Free service deployed by UC Berkeley.

[17] H. Balakrishnan et al. A comparison of mechanisms for improving TCP performance over wireless links. In Proc. ACM SIGCOMM '96, Stanford, CA, August 1996.

[18] M. Liljeberg et al. Enhanced services for World Wide Web in mobile WAN environment. Technical Report C-1996-28, University of Helsinki CS Department, April 1996.

[19] WAP Forum. Wireless Application Protocol (WAP) forum. http://www.wapforum.org.

[20] Armando Fox and Eric A. Brewer. CS 294-6: Internet Services, class proceedings, Fall 1997. http://www.cs.berkeley.edu/~fox/cs294.


[21] Armando Fox and Eric A. Brewer. Reducing WWW latency and bandwidth requirements via real-time distillation. In Proceedings of the Fifth International World Wide Web Conference, Paris, France, May 1996. World Wide Web Consortium.

[22] Armando Fox and Steven D. Gribble. DOLF: Digital objects with lazy fixations. Unpublished manuscript, CS 294-5 Digital Libraries Seminar, Spring 1996.

[23] Armando Fox and Steven D. Gribble. Security on the move: Indirect authentication using Kerberos. In Proc. Second International Conference on Wireless Networking and Mobile Computing (MobiCom '96), Rye, NY, November 1996.

[24] Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, and Paul Gauthier. Cluster-based scalable network services. In Proceedings of the 16th ACM Symposium on Operating Systems Principles, St.-Malo, France, October 1997.

[25] Armando Fox, Steven D. Gribble, Yatin Chawathe, Anthony Polito, Benjamin Ling, Andrew C. Huang, and Eric A. Brewer. Orthogonal extensions to the WWW user interface using client-side technologies. In User Interface Software and Technology (UIST) '97, Banff, Canada, October 1997.

[26] Graphics Interchange Format version 89a (GIF). CompuServe Incorporated, Columbus, Ohio, July 1990.

[27] Ian Goldberg and David Wagner. Taz servers and the rewebber network: Enabling anonymous publishing on the World Wide Web. Unpublished manuscript available at http://www.cs.berkeley.edu/~daw/cs268/, May 1997.

[28] Steven D. Gribble and Eric A. Brewer. System design issues for Internet middleware services: Deductions from a large client trace. In Proceedings of the 1997 USENIX Symposium on Internet Technologies and Systems, Monterey, California, USA, December 1997.

[29] Tom R. Halfhill. Inside the Web PC. Byte Magazine, pages 44-56, March 1996.

[30] Aportis Inc. AportisDoc overview.

[31] James J. Kistler and M. Satyanarayanan. Disconnected operation in the Coda file system. ACM Transactions on Computer Systems, 10:3-25, February 1992.

[32] Myricom. Myrinet: A gigabit per second local area network. IEEE Micro, February 1995.

[33] John K. Ousterhout. Tcl and the Tk Toolkit. Addison-Wesley, 1994.

[34] Paul MacJones, DEC SRC. PostScript to text converter. Personal communication.

[35] Jef Poskanzer. NetPBM release 7. ftp://wuarchive.wustl.edu/graphics/graphics/packages/NetPBM, 1993.

[36] Y. Sato. DeleGate server.

[37] M. A. Schickler, M. S. Mazer, and C. Brooks. Pan-browser support for annotations and other meta-information on the World Wide Web. In Proc. Fifth International World Wide Web Conference (WWW-5), May 1996.

[38] Jennifer G. Steiner, Clifford Neuman, and Jeffrey I. Schiller. Kerberos: An authentication service for open network systems. In Proceedings of the USENIX Winter Conference 1988, pages 191-202, Dallas, Texas, USA, February 1988.

[39] Terri Watson. Application design for wireless computing. In Mobile Computing Systems and Applications Workshop, August 1994.

[40] Kao-Ping Yee. Shoduoka mediator service.

[41] C. Yoshikawa, B. Chun, P. Eastham, A. Vahdat, T. Anderson, and D. Culler. Using smart clients to build scalable services. In Proc. Winter 1997 USENIX Technical Conference, January 1997.

[42] Bruce Zenel. A proxy-based filtering mechanism for the mobile environment. Thesis proposal, March 1996.
