JSZap: Compressing JavaScript Code Martin Burtscher, UT Austin Ben Livshits & Ben Zorn, Microsoft Research Gaurav Sinha, IIT Kanpur

JSZap : Compressing JavaScript Code



Page 1: JSZap :  Compressing  JavaScript  Code

JSZap: Compressing JavaScript Code

Martin Burtscher, UT Austin
Ben Livshits & Ben Zorn, Microsoft Research
Gaurav Sinha, IIT Kanpur

Page 2: JSZap :  Compressing  JavaScript  Code

A Web 2.0 Application Dissected

• 70,000+ lines of JavaScript code downloaded
• 2,855 functions
• 1+ MB code
• Talks to 14 backend services (traffic, images, directions, ads, …)

Page 3: JSZap :  Compressing  JavaScript  Code

Lots of JavaScript being Transmitted

[Figure: bar chart of the fraction of download that is JavaScript (0% to 100%) for www.live.com, spreadsheets.google, maps.live, chi.lexigame, hotmail, gmail, dropthings, maps.google, pageflakes, and bunny hunt.]

Up to 85% of a Web 2.0 app is JavaScript code!

Page 4: JSZap :  Compressing  JavaScript  Code

AJAX: Tension Headaches

• Execution can’t start without the code
• Move code to client for responsiveness

Page 5: JSZap :  Compressing  JavaScript  Code

JavaScript on the Wire

[Diagram: the path JavaScript takes over the wire. Labels: JavaScript, crunch, gzip, gzip -d, parser, AST, JSZap.]

Page 6: JSZap :  Compressing  JavaScript  Code


JSZap Approach

• Represent JavaScript as AST instead of source

• Serialize the compressed AST

• Decompress directly into AST on client

• Use gzip as 2nd-level (de-)compressor
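
The four steps above can be sketched as a round trip. This is a toy illustration, not JSZap's actual wire format: the tuple-based AST, the JSON pre-order serialization, and the names `serialize`, `deserialize`, `pack`, and `unpack` are all assumptions made for the example. Python's `gzip` module stands in for the second-level compressor.

```python
import gzip
import json

# Hypothetical AST shape used only for this sketch: (label, [children]).

def serialize(ast):
    """Flatten the tree in pre-order as [label, child_count] pairs."""
    out = []

    def walk(node):
        label, children = node
        out.append([label, len(children)])
        for child in children:
            walk(child)

    walk(ast)
    return json.dumps(out).encode("utf-8")


def deserialize(data):
    """Rebuild the tree directly from the pairs; no source text is parsed."""
    items = iter(json.loads(data.decode("utf-8")))

    def build():
        label, count = next(items)
        return (label, [build() for _ in range(count)])

    return build()


def pack(ast):
    # First level: AST serialization. Second level: gzip.
    return gzip.compress(serialize(ast))


def unpack(blob):
    # Mirror image on the client: gunzip, then rebuild the AST.
    return deserialize(gzip.decompress(blob))


tree = ("+", [("*", [("a", []), ("b", [])]), ("c", [])])
assert unpack(pack(tree)) == tree
```

The point of the sketch is the shape of the pipeline: the client ends up with a tree without ever running a source-level parser.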

Page 7: JSZap :  Compressing  JavaScript  Code

Benefits of AST-based Compression

Reduced Latency
• Compression: less to transmit
• ASTs are blasted directly into the browser

Reduced Network Bandwidth
• Reduces mobile charges
• Reduces operator network costs: better for servers

Correctness, Security, and other Benefits
• Ensures well-formedness of code
• Can use to check language subsets easily (AdSafe)
• Caching incremental updates
• Unblocking HTML parser

Page 8: JSZap :  Compressing  JavaScript  Code

JSZap Compression

[Diagram: JavaScript → JSZap → gzip]

Page 9: JSZap :  Compressing  JavaScript  Code

JSZap Compression

[Diagram: JSZap splits the JavaScript into three numbered streams (productions, identifiers, literals), and each stream is passed to gzip.]

Page 10: JSZap :  Compressing  JavaScript  Code

GZIP is a formidable opponent

Page 11: JSZap :  Compressing  JavaScript  Code

JSZap vs. GZIP

[Figure: stacked bar chart of compressed size in KB (axis 0 to 40) for JSZap and gzip, broken down by stream.]

Stream         JSZap   gzip
Literals         5.4    5.4
Identifiers     18.4   19.0
Productions      8.4   11.5

Page 12: JSZap :  Compressing  JavaScript  Code

Talk Outline

• Productions
• Identifiers
• Literals
• Evaluation on real code

Page 13: JSZap :  Compressing  JavaScript  Code

Background: ASTs

Expression: a * b + c

Grammar
1) E → E + T
2) E → T
3) T → T * F
4) T → F
5) F → id

Tree (each node annotated with its production number)

        + (1)
       /     \
    * (3)    c (5)
    /   \
 a (5)  b (5)
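
As a small sketch of how the tree above turns into a sequence of production numbers, the following walks the AST for a * b + c in pre-order. The tuple encoding of the tree and the function name `productions` are assumptions for illustration. The numbers come from the grammar on the slide, where only productions 1, 3, and 5 label nodes in the tree.

```python
# Production numbers from the slide's grammar.
# Productions 2 and 4 (E -> T, T -> F) do not label any node in the slide's tree.
PRODUCTION = {"+": 1, "*": 3, "id": 5}

# Hypothetical tuple encoding: ("id", name) for leaves, (op, left, right) otherwise.
AST = ("+", ("*", ("id", "a"), ("id", "b")), ("id", "c"))


def productions(node):
    """Return the production numbers of the tree in pre-order."""
    kind = node[0]
    if kind == "id":
        return [PRODUCTION["id"]]
    _, left, right = node
    return [PRODUCTION[kind]] + productions(left) + productions(right)


print(productions(AST))  # [1, 3, 5, 5, 5]
```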

Page 14: JSZap :  Compressing  JavaScript  Code

A Simple JavaScript Example

var y = 2;
function foo () {
    var x = "jscrunch";
    var z = 3;
    z = y + y;
}
x = "jszap";

Identifier Stream

y foo x z z y y x

Literal Stream

"jscrunch" 2 3 "jszap"

Production Stream

1 3 4 ... 1 3 4 ...
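
The identifier and literal streams for the example can be approximated with a few lines of Python. This is a rough sketch, not JSZap's front end: it uses a regular expression rather than a real JavaScript parser, it knows only the two keywords that appear in the snippet, and it does not produce the production stream. The names `split_streams`, `TOKEN`, and `KEYWORDS` are made up for the example. Literals are emitted here in source order.

```python
import re

SOURCE = '''
var y = 2;
function foo () {
    var x = "jscrunch";
    var z = 3;
    z = y + y;
}
x = "jszap";
'''

KEYWORDS = {"var", "function"}  # only the keywords this snippet uses
# String literal, or integer literal, or identifier-like word.
TOKEN = re.compile(r'"[^"]*"|\d+|[A-Za-z_$][A-Za-z0-9_$]*')


def split_streams(source):
    """Separate identifier tokens from literal tokens, dropping keywords."""
    identifiers, literals = [], []
    for token in TOKEN.findall(source):
        if token[0] == '"' or token[0].isdigit():
            literals.append(token)
        elif token not in KEYWORDS:
            identifiers.append(token)
    return identifiers, literals


identifiers, literals = split_streams(SOURCE)
print(identifiers)  # ['y', 'foo', 'x', 'z', 'z', 'y', 'y', 'x']
print(literals)     # ['2', '"jscrunch"', '3', '"jszap"']
```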

Page 15: JSZap :  Compressing  JavaScript  Code

Benchmarking JSZap

Benchmark name   Source lines   Source bytes
gmonkey                   922         17,382
getDOMHash              1,136         25,467
bing1                   3,758         77,891
bingmap1                3,473         80,066
livemsg1                5,307         93,982
bingmap2                9,726        113,393
facebook1               5,886        141,469
livemsg2                7,139        156,282
officelive1            22,016        668,051

• JavaScript files up to 22K LOC

• Variety of app types

• Both hand-generated, and machine-generated

• gzipped everything

Page 16: JSZap :  Compressing  JavaScript  Code

Components of JavaScript Source

[Figure: stacked bar chart showing, for each benchmark (gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1), the share of productions, identifiers, and literals, from 0% to 100%.]

• None of the categories can be ignored

• Identifiers become more prominent with code growth

Page 17: JSZap :  Compressing  JavaScript  Code


Compressing the Production Stream

• Frequency-based production renaming

• Differential encoding: 26 and 57 => 2 and 3

• Chain rule: eliminate predictable productions

• Tree-based prediction-by-partial-match
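
As an illustration of the first bullet only, frequency-based renaming can be sketched as follows: the most frequent production gets the smallest new number, so the common cases become small, repetitive values for the later stages. The function name `rename_by_frequency` and the sample stream are invented for the example; how JSZap breaks ties or stores the mapping is not stated on the slide.

```python
from collections import Counter


def rename_by_frequency(stream):
    """Renumber productions so that more frequent ones get smaller IDs."""
    counts = Counter(stream)
    # Ties are broken by the original production number (an assumption).
    ranked = sorted(counts, key=lambda p: (-counts[p], p))
    mapping = {old: new for new, old in enumerate(ranked)}
    return [mapping[p] for p in stream], mapping


# Made-up production stream, not taken from a real program.
renamed, mapping = rename_by_frequency([40, 7, 40, 40, 12, 7])
print(renamed)  # [0, 1, 0, 0, 2, 1]
```

A decoder would need the mapping (or the same counts) to undo the renaming.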

Page 18: JSZap :  Compressing  JavaScript  Code

PPMC

• Consider compressing: if (P) then X else X

• Should be very compressible: if (P) then ...abc... else ...abc...

[Diagram: tree for the conditional, with P and two identical X subtrees.]

• Tree context used to build a predictor

• Provides the next likely child node given context C and child position p

• Arithmetic coding: more likely=shorter IDs

• See paper for details
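
A heavily simplified sketch of the tree-context idea: count which production appears at child position p under a given parent, then predict the most frequent one. This is not PPMC itself. The counts are gathered in a single static pass, there is no fallback to shorter contexts, and no arithmetic coder is attached; the tree encoding and the names `observations`, `build_model`, and `predict` are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical tree shape: (production, [children]).


def observations(node, parent=None, position=0):
    """Yield ((parent production, child position), production) for every node."""
    production, children = node
    yield (parent, position), production
    for i, child in enumerate(children):
        yield from observations(child, production, i)


def build_model(tree):
    """Count how often each production occurs in each tree context."""
    model = defaultdict(Counter)
    for context, production in observations(tree):
        model[context][production] += 1
    return model


def predict(model, parent, position):
    """Most likely child production for the context, or None if unseen."""
    counts = model.get((parent, position))
    return counts.most_common(1)[0][0] if counts else None


# if (P) then ...abc... else ...abc...
body = ("block", [("a", []), ("b", []), ("c", [])])
tree = ("if", [("P", []), body, body])
model = build_model(tree)
print(predict(model, "block", 1))  # b
print(predict(model, "if", 2))     # block
```

In a real coder the predicted probabilities would drive the arithmetic coder, so that a correctly predicted child costs very few bits.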

Page 19: JSZap :  Compressing  JavaScript  Code

Production Compression with PPMC

[Figure: bar chart of production compression relative to gzip (gzip = 1) for each benchmark (gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1), axis 50% to 100%. Data label: 0.6772.]

Page 20: JSZap :  Compressing  JavaScript  Code

Compressing the Identifier Stream

• Symbol tables instead of identifier stream:
  – Compress redundancy: offset into table
  – Global or local symbol tables
  – Use variable-length encoding

• Other techniques:
  – Sort symbols by frequency
  – Rename local variables
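
A minimal sketch of the symbol-table idea, assuming a single global table sorted by frequency: each identifier occurrence is replaced by its offset into the table, so frequent names get small offsets. The function name `build_symbol_table` is hypothetical, and the split into global and local tables is left out.

```python
from collections import Counter


def build_symbol_table(identifiers):
    """Return (table sorted by descending frequency, offsets into the table)."""
    counts = Counter(identifiers)
    # sorted() is stable, so ties keep first-appearance order.
    table = sorted(counts, key=lambda name: -counts[name])
    index = {name: i for i, name in enumerate(table)}
    return table, [index[name] for name in identifiers]


# Identifier stream from the simple JavaScript example.
table, offsets = build_symbol_table("y foo x z z y y x".split())
print(table)    # ['y', 'x', 'z', 'foo']
print(offsets)  # [0, 3, 1, 2, 2, 0, 0, 1]
```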

Page 21: JSZap :  Compressing  JavaScript  Code

Variable-length Encoding for Identifiers

[Diagram: decision tree that selects a 2-bit prefix for each identifier. Tests: is global?, is renamed local?, fits in 1 byte? Prefixes: 00…, 01…, 11…, 10…]
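
One way a 2-bit prefix scheme of this kind could work is sketched below. The bit assignment here is an assumption: the diagram does not survive well enough to tell which prefix belongs to which case, and the renamed-local case is not modeled. In this sketch the first bit marks global vs. local and the second bit marks a 1-byte vs. 2-byte code, leaving 6 or 14 bits for the symbol-table index.

```python
def encode(index, is_global):
    """Encode a symbol-table index with a hypothetical 2-bit prefix."""
    scope = 0 if is_global else 1          # assumed: 0 = global, 1 = local
    if index < (1 << 6):                   # fits in 1 byte: prefix bit 0
        return bytes([(scope << 7) | index])
    if index < (1 << 14):                  # needs 2 bytes: prefix bit 1
        return bytes([(scope << 7) | (1 << 6) | (index >> 8), index & 0xFF])
    raise ValueError("index too large for this sketch")


def decode(data, pos=0):
    """Return (index, is_global, next position)."""
    first = data[pos]
    is_global = (first >> 7) == 0
    if (first >> 6) & 1 == 0:
        return first & 0x3F, is_global, pos + 1
    return ((first & 0x3F) << 8) | data[pos + 1], is_global, pos + 2


print(encode(5, True).hex())     # 05
print(encode(300, False).hex())  # c12c
print(decode(encode(300, False)))  # (300, False, 2)
```

Combined with frequency sorting, the most common identifiers land in the 1-byte range.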

Page 22: JSZap :  Compressing  JavaScript  Code

Variable-Length Identifier Encoding

[Figure: stacked bar chart showing, for each benchmark (gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1), the share of identifiers in each encoding category, from 0% to 100%. Legend as extracted: parent local 2byte local 1byte local builtin global 2byte global 1byte.]

Page 23: JSZap :  Compressing  JavaScript  Code

Symbol Tables: Effectiveness

[Figure: bar chart of identifier size relative to no symbol table (NoST = 1) for each benchmark (gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1), axis 80% to 100%. Series: Global ST, VarEnc. Data labels: 0.943, 89%.]

Page 24: JSZap :  Compressing  JavaScript  Code

Compressing Literals

• Symbol tables
• Grouping literals by type
• Pre-fixes and post-fixes
• These techniques result in 5-10% savings compared to gzip
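
Two of the bullets above can be illustrated on the literals from the simple JavaScript example. Grouping by type puts similar values next to each other, and a shared prefix is the kind of redundancy that prefix handling could exploit. This is only a sketch: `group_by_type` is a made-up name, and the slide does not say how JSZap actually encodes prefixes and postfixes.

```python
import os


def group_by_type(literals):
    """Split a literal stream into per-type streams (strings and numbers only)."""
    groups = {"string": [], "number": []}
    for value in literals:
        kind = "string" if isinstance(value, str) else "number"
        groups[kind].append(value)
    return groups


groups = group_by_type([2, "jscrunch", 3, "jszap"])
print(groups)  # {'string': ['jscrunch', 'jszap'], 'number': [2, 3]}

# os.path.commonprefix compares character by character, so it works on any strings.
print(os.path.commonprefix(groups["string"]))  # js
```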

Page 25: JSZap :  Compressing  JavaScript  Code

Average JSZap Compression: 10%

[Figure: bar chart of JSZap compression relative to gzip (gzip = 1) for each benchmark (gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1), axis 80% to 100%. Data label: 0.8792. Callout: 13% savings.]

[Figure: pie chart. Productions: 26%; Identifiers: 57%; Literals: 17%.]

Page 26: JSZap :  Compressing  JavaScript  Code

Summary and Conclusions

• JSZap: AST-based compression for JavaScript

• Propose a range of techniques for compressing
  – Productions
  – Identifiers
  – Literals

• Preliminary results are encouraging: 10% savings over gzip

• Future focus
  – Latency measurements
  – Browser integration

Page 27: JSZap :  Compressing  JavaScript  Code

AST representation

• Well-formedness
• Security (AdSafe)
• Unblocking HTML parser
• Caching and incremental updates
• Compression with JSZap

Questions?