JSZap: Compressing JavaScript Code


Martin Burtscher, UT Austin
Ben Livshits & Ben Zorn, Microsoft Research
Gaurav Sinha, IIT Kanpur

Introduction: Thank you. Today I'll be talking about compressing JavaScript using an AST-based technique. We make several optimizations in our strategy and then compare it with the existing technique. The results we get are promising.


A Web 2.0 Application Dissected
  • 70,000+ lines of JavaScript code downloaded
  • 2,855 functions
  • 1+ MB code
  • Talks to 14 backend services (traffic, images, directions, ads, ...)
  • Update the map

Lots of JavaScript being Transmitted
  • Up to 85% of a Web 2.0 app is JavaScript code!
  • Human body is mostly water

AJAX: Tension Headaches

JavaScript on the Wire
[Diagram labels: JavaScript, crunch, gzip, gzip -d, parser, AST, JSZap]

JSZap Approach
  • Represent JavaScript as AST instead of source
  • Serialize the compressed AST
  • Decompress directly into AST on client
  • Use gzip as 2nd-level (de-)compressor
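The two-level idea in the approach slide can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the nested-list tree and the JSON serialization are hypothetical stand-ins (the slides do not describe JSZap's actual format), and `zlib` is used because it implements the same DEFLATE algorithm as gzip.

```python
import json
import zlib

# Hypothetical miniature AST for `z = y + y;` as nested lists:
# [node_kind, children...]. This is NOT JSZap's real representation.
ast = ["Assign", ["Id", "z"], ["Add", ["Id", "y"], ["Id", "y"]]]


def compress_ast(tree):
    """Level 1: serialize the tree compactly. Level 2: DEFLATE the bytes."""
    serialized = json.dumps(tree, separators=(",", ":")).encode("utf-8")
    return zlib.compress(serialized, 9)


def decompress_ast(blob):
    """Inverse: inflate, then rebuild the tree without parsing JavaScript."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


blob = compress_ast(ast)
restored = decompress_ast(blob)
```

The point of the sketch is the shape of the pipeline: the receiver ends up holding a tree directly, rather than source text that still has to be parsed.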

Benefits of AST-based Compression
Major benefits we envision:
  • Reduced bandwidth
  • Reduced parsing delay on the client, which removes latency in the browser
  • Code is well-formed

JSZap Compression
[Diagram labels: JavaScript, JSZap, gzip, identifiers]


JSZap vs. GZIP
  • GZIP is a formidable opponent

Talk Outline
  • identifiers
  • literals
  • productions
  • evaluation on real code

Background: ASTs
Expression: a * b + c
Grammar:
    1) E → E + T
    2) E → T
    3) T → T * F
    4) T → F
    5) F → id
Tree: [figure: nodes +, *, a, b, c, annotated with production numbers]

A Simple JavaScript Example

    var y = 2;
    function foo () {
      var x = "jscrunch";
      var z = 3;
      z = y + y;
    }
    x = "jszap";

Identifier Stream: y foo x z z y y x
Literal Stream: "jscrunch" 2 3 "jszap"
Production Stream: 134...134 ...

For the small JavaScript input above, one can look at three streams: productions, identifiers, and literals. Together they hold all the information that was in the code and can be used to recreate it. We exploit this dual representation of code in our approach.

Quick observations:
  • Redundancy in streams: repeated identifiers and literals
  • Structure in productions
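To make the three streams concrete, here is a minimal Python sketch that walks a hand-built tree for the example program and splits it into production, identifier, and literal streams. The node kinds (`Program`, `Var`, `Function`, and so on) and the dictionary layout are hypothetical; the slides use numeric production IDs from a real JavaScript grammar. The sketch visits nodes in source order, so the literal stream may be ordered differently from the one drawn on the slide.

```python
def node(kind, *children, value=None):
    """Build a tree node; `kind` names are illustrative, not JSZap's."""
    return {"kind": kind, "children": list(children), "value": value}


def ident(name):
    return node("Id", value=name)


def lit(value):
    return node("Lit", value=value)


# Hand-built tree for the example program from the slide.
program = node(
    "Program",
    node("Var", ident("y"), lit(2)),
    node(
        "Function",
        ident("foo"),
        node("Var", ident("x"), lit("jscrunch")),
        node("Var", ident("z"), lit(3)),
        node("Assign", ident("z"), node("Add", ident("y"), ident("y"))),
    ),
    node("Assign", ident("x"), lit("jszap")),
)


def split_streams(tree):
    """Pre-order walk that routes each node into one of three streams."""
    productions, identifiers, literals = [], [], []

    def walk(n):
        productions.append(n["kind"])          # every node records its kind
        if n["kind"] == "Id":
            identifiers.append(n["value"])     # names go to their own stream
        elif n["kind"] == "Lit":
            literals.append(n["value"])        # constants go to their own stream
        for child in n["children"]:
            walk(child)

    walk(tree)
    return productions, identifiers, literals


productions, identifiers, literals = split_streams(program)
```

Here `identifiers` reproduces the slide's stream `y foo x z z y y x`, and the repeated entries in each list are exactly the redundancy the observations above point at.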

Benchmarking JSZap

Benchmark name   Source lines   Source bytes
gmonkey                   922         17,382
getDOMHash              1,136         25,467
bing1                   3,758         77,891
bingmap1                3,473         80,066
livemsg1                5,307         93,982
bingmap2                9,726        113,393
facebook1               5,886        141,469
livemsg2                7,139        156,282
officelive1            22,016        668,051

  • JavaScript files up to 22K LOC
  • Variety of app types
  • Both hand-generated and machine-generated
  • gzipped everything

Components of JavaScript Source
  • None of the categories can be ignored
  • Identifiers become more prominent with code growth
  • Prior work: doesn't focus on source-language compression; didn't have identifiers like JS does

Compressing the Production Stream
  • Frequency-based production renaming
  • Differential encoding: 26 and 57 => 2 and 3
  • Chain rule: eliminate predictable productions
  • Tree-based prediction-by-partial-match

PPMC
Consider compressing: if (P) then X else X
Should be very compressible:
    if (P) then ...abc... else ...abc...

[Tree figure: nodes P, X, X]
  • Tree context used to build a predictor
  • Provides the next likely child node given context C and child position p
  • Arithmetic coding: more likely = shorter IDs
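The predictor described above can be approximated with a simple frequency table. The sketch below is not PPMC: it uses a single fixed context (parent kind plus child position), has no escape mechanism, and does no actual arithmetic coding. All class, function, and node-kind names are hypothetical. It only shows how a context C and position p map to a most-likely child, and why a likelier child costs fewer bits (an ideal entropy coder spends about -log2(p) bits on a symbol of probability p).

```python
import math
from collections import Counter, defaultdict


class TreeContextPredictor:
    """Counts which child kind appears at each (parent kind, child position)."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def update(self, context, position, child):
        """Record one observed child for this context and position."""
        self.counts[(context, position)][child] += 1

    def predict(self, context, position):
        """Return the most frequently seen child, or None if unseen."""
        seen = self.counts.get((context, position))
        if not seen:
            return None
        return seen.most_common(1)[0][0]

    def probability(self, context, position, child):
        """Relative frequency of `child` in this context (0.0 if unseen)."""
        seen = self.counts.get((context, position))
        if not seen:
            return 0.0
        return seen[child] / sum(seen.values())


def ideal_bits(probability):
    """Approximate cost, in bits, of coding a symbol with this probability."""
    return -math.log2(probability)


model = TreeContextPredictor()
observations = [
    ("If", 0, "Compare"), ("If", 1, "Block"), ("If", 2, "Block"),
    ("If", 0, "Compare"), ("If", 1, "Block"), ("If", 2, "Return"),
]
for context, position, child in observations:
    model.update(context, position, child)
```

After this training, `model.predict("If", 1)` returns `"Block"`, and a child seen half the time in its context costs about one bit.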

See the paper for details on production compression with PPMC.

Compressing the Identifier Stream
Symbol tables instead of identifier stream:
  • Compress redundancy: offset into table
  • Global or local symbol tables
  • Use variable-length encoding
Other techniques:
  • Sort symbols by frequency
  • Rename local variables
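The symbol-table idea can be sketched as follows: build a table sorted by frequency, replace each identifier with its table offset, and write the offsets with a variable-length code so that frequent symbols take the fewest bytes. The specific byte format here (7 payload bits per byte with a continuation bit) is an assumption chosen for illustration; the slides do not say which variable-length scheme JSZap uses, and the function names are hypothetical.

```python
from collections import Counter


def build_symbol_table(identifiers):
    """Unique names, most frequent first; ties keep first-appearance order."""
    counts = Counter(identifiers)
    first_seen = {}
    for position, name in enumerate(identifiers):
        first_seen.setdefault(name, position)
    return sorted(counts, key=lambda name: (-counts[name], first_seen[name]))


def encode_varint(value):
    """Encode a non-negative int, 7 bits per byte, high bit = 'more follows'."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_identifiers(identifiers):
    """Return (symbol table, byte string of variable-length table offsets)."""
    table = build_symbol_table(identifiers)
    index = {name: i for i, name in enumerate(table)}
    payload = b"".join(encode_varint(index[name]) for name in identifiers)
    return table, payload


def decode_identifiers(table, payload):
    """Inverse of encode_identifiers."""
    names, value, shift = [], 0, 0
    for byte in payload:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            names.append(table[value])
            value, shift = 0, 0
    return names


stream = "y foo x z z y y x".split()
table, payload = encode_identifiers(stream)
```

For the example identifier stream, `y` is the most frequent name and gets offset 0, while `foo` appears once and gets the largest offset. With only four symbols every offset fits in one byte; the variable-length code starts to matter once a table grows past 127 entries.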

Variable-length Encoding for Identifiers

Variable-Length Identifier Encoding

Symbol Tables: Effectiveness

Compressing Literals
  • Symbol tables
  • Grouping literals by type
  • Prefixes and postfixes
These techniques result in 5-10% savings compared to gzip.
Get about 5% on literals; skipping, see paper.

Average JSZap Compression: 10%

Summary and Conclusions
  • JSZap: AST-based compression for JavaScript
  • Propose a range of techniques for compressing productions, identifiers, and literals
  • Preliminary results are encouraging: 10% savings over gzip
  • Future focus: latency measurements, browser integration

  • Well-formedness
  • Security (AdSafe)
  • AST representation
  • Unblocking HTML parser
  • Caching and incremental updates
  • Compression with JSZap

Questions?

References
[1] Exploring Last n Value Prediction. Martin Burtscher and Benjamin G. Zorn.
[2] The Gzip Algorithm. http://www.gzip.org/algorithm.txt
[3] FPC: A High-Speed Compressor for Double-Precision Floating-Point Data. Martin Burtscher and Paruj Ratanaworabhan.