68
CompSci 100e 9.1 Data Compression Compression is a high-profile application .zip, .mp3, .aac, .jpg, .gif, .gz, .mpg, … What property of MP3 was a significant factor in what made P2P file sharing work? Why do we care? Secondary storage capacity doubles every year Disk space fills up quickly on every computer system More data to compress than ever before Data compression facilitated by priority queue All-time best assignment in a Compsci 100(e) course? Subject to debate, of course

Data Compression

Embed Size (px)

DESCRIPTION

Data Compression. Compression is a high-profile application .zip, .mp3, .aac, .jpg, .gif, .gz, .mpg, … What property of MP3 was a significant factor in what made P2P file sharing work? Why do we care? Secondary storage capacity doubles every year - PowerPoint PPT Presentation

Citation preview

Page 1: Data Compression

CompSci 100e 9.1

Data Compression Compression is a high-profile application

.zip, .mp3, .aac, .jpg, .gif, .gz, .mpg, … What property of MP3 was a significant factor in

what made P2P file sharing work? Why do we care?

Secondary storage capacity doubles every year Disk space fills up quickly on every computer

system More data to compress than ever before

Data compression facilitated by priority queue All-time best assignment in a Compsci 100(e)

course?• Subject to debate, of course

Page 2: Data Compression

CompSci 100e 9.2

More on Compression What’s the difference between compression

techniques? .mp3 files and .zip files? .gif and .jpg? Lossless and lossy

Is it possible to compress (lossless) every file? Why?

Lossy methods Good for pictures, video, and audio (JPEG,

MPEG, etc.) Lossless methods

Run-length encoding, Huffman, LZW, …

Page 3: Data Compression

CompSci 100e 9.3

Text Compression

Input: String S Output: String S

Shorter S can be reconstructed from S

Page 4: Data Compression

CompSci 100e 9.4

Text Compression: Examples

Symbol

ASCII Fixedlength

Var.length

a 01100001

000 000

b 01100010

001 11

c 01100011

010 01

d 01100100

011 001

e 01100101

100 10

“abcde” in the different formatsASCII: 01100001011000100110001101100100…Fixed: 000001010011100

Var: 000110100110

0

00

0

0

00

1

1 1

1

a b c d e

a d

bc e0

0

0

0 1

1

1

1

EncodingsASCII: 8 bits/characterUnicode: 16 bits/character

Page 5: Data Compression

CompSci 100e 9.5

Huffman coding: go go gophers

Encoding uses tree: 0 left/1 right How many bits? 37!! Savings? Worth it?

ASCII 3 bits Huffmang 103 1100111 000 00o 111 1101111 001 01p 112 1110000 010 1100h 104 1101000 011 1101e 101 1100101 100 1110r 114 1110010 101 1111s 115 1110011 110 101sp. 32 1000000 111 101

3

s1

*2

2

p1

h1

2

e1

r1

4g

3

o

3

6

3

2

p1

h1

2

e1

r1

4

s1

*2

7

g3

o3

6

13

Page 6: Data Compression

CompSci 100e 9.6

Huffman Coding D.A Huffman in early 1950’s Before compressing data, analyze the input stream Represent data using variable length codes Variable length codes though Prefix codes

Each letter is assigned a codeword Codeword is for a given letter is produced by

traversing the Huffman tree Property: No codeword produced is the prefix of

another Letters appearing frequently have short

codewords, while those that appear rarely have longer ones

Huffman coding is optimal per-character coding method

Page 7: Data Compression

CompSci 100e 9.7

Building a Huffman tree Begin with a forest of single-node trees (leaves)

Each node/tree/leaf is weighted with character count

Node stores two values: character and count There are n nodes in forest, n is size of alphabet?

Repeat until there is only one node left: root of tree Remove two minimally weighted trees from forest Create new tree with minimal trees as children,

• New tree root's weight: sum of children (character ignored)

Does this process terminate? How do we get minimal trees? Remove minimal trees, hummm……

Page 8: Data Compression

CompSci 100e 9.8

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

C

1

F

1

P

2

U

2

R

2

L

2

D

2

G

3

T

3

O

3

B

3

A

4

M

4

S

Page 9: Data Compression

CompSci 100e 9.9

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

C

1

F

1

P

2

U

2

R

2

L

2

D

2

G

3

T

3

O

3

B

3

A

4

M

4

S

2

2

Page 10: Data Compression

CompSci 100e 9.10

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

T

3

O

3

B

3

A

4

M

4

S

2

2

3

3

Page 11: Data Compression

CompSci 100e 9.11

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

T

3

O

3

B

3

A

4

M

4

S

2

2

3

3

4

4

Page 12: Data Compression

CompSci 100e 9.12

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

T

3

O

3

B

3

A

4

M

4

S

2

2

3

3

4

4

4

4

Page 13: Data Compression

CompSci 100e 9.13

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S

2

3

3

4

4

4

4

5

5

Page 14: Data Compression

CompSci 100e 9.14

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S

2 3

4

4

4

4

5

5

6

6

Page 15: Data Compression

CompSci 100e 9.15

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S

2 3

4

4

4

4

5

5

6

6

6

6

Page 16: Data Compression

CompSci 100e 9.16

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

4

4

4

4

5

5

6

6

6

8

8

6

Page 17: Data Compression

CompSci 100e 9.17

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

E

5

N

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

44

5

5

6

6

6

8

8

6

8

8

Page 18: Data Compression

CompSci 100e 9.18

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

44

5

5

6

6

6

8

8

6

8

8

10

10

Page 19: Data Compression

CompSci 100e 9.19

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11 6

I

5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

6

6

8

8

6

8

8

10

10

11

11

Page 20: Data Compression

CompSci 100e 9.20

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

8

6

8

8

10

10

11

11

12

12

Page 21: Data Compression

CompSci 100e 9.21

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

10

11

11

12

1216

Page 22: Data Compression

CompSci 100e 9.22

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

121621

Page 23: Data Compression

CompSci 100e 9.23

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

23

162123

Page 24: Data Compression

CompSci 100e 9.24

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

2337

Page 25: Data Compression

CompSci 100e 9.25

Building a tree

“A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS”

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

60

Page 26: Data Compression

CompSci 100e 9.26

Encoding1. Count occurrence of all occurring character

O( )2. Build priority queue O(

)

3. Build Huffman tree O( )

4. Create Table of codes from tree O( )

5. Write Huffman tree and coded data to file O( )

Page 27: Data Compression

CompSci 100e 9.27

Properties of Huffman coding Want to minimize weighted path length L(T)of

tree T

wi is the weight or count of each codeword i

di is the leaf corresponding to codeword i How do we calculate character (codeword)

frequencies? Huffman coding creates pretty full bushy trees?

When would it produce a “bad” tree? How do we produce coded compressed data

from input efficiently?

∑∈

=)(

)(TLeafi

iiwdTL

Page 28: Data Compression

CompSci 100e 9.28

Writing code out to file How do we go from characters to encodings?

Build Huffman tree Root-to-leaf path generates encoding

Need way of writing bits out to file Platform dependent? Complicated to write bits and read in same

ordering

See BitInputStream and BitOutputStream classes Depend on each other, bit ordering preserved

How do we know bits come from compressed file? Store a magic number

Page 29: Data Compression

CompSci 100e 9.29

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

01100000100001001101

Page 30: Data Compression

CompSci 100e 9.30

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

1100000100001001101

Page 31: Data Compression

CompSci 100e 9.31

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

100000100001001101

Page 32: Data Compression

CompSci 100e 9.32

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

00000100001001101

Page 33: Data Compression

CompSci 100e 9.33

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

0000100001001101

G

Page 34: Data Compression

CompSci 100e 9.34

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

000100001001101

G

Page 35: Data Compression

CompSci 100e 9.35

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

00100001001101

G

Page 36: Data Compression

CompSci 100e 9.36

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

0100001001101

G

Page 37: Data Compression

CompSci 100e 9.37

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

100001001101

G

Page 38: Data Compression

CompSci 100e 9.38

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

00001001101

GO

Page 39: Data Compression

CompSci 100e 9.39

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

0001001101

GO

Page 40: Data Compression

CompSci 100e 9.40

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

001001101

GO

Page 41: Data Compression

CompSci 100e 9.41

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

01001101

GO

Page 42: Data Compression

CompSci 100e 9.42

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

1001101

GO

Page 43: Data Compression

CompSci 100e 9.43

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

001101

GOO

Page 44: Data Compression

CompSci 100e 9.44

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

01101

GOO

Page 45: Data Compression

CompSci 100e 9.45

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

1101

GOO

Page 46: Data Compression

CompSci 100e 9.46

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

101

GOO

Page 47: Data Compression

CompSci 100e 9.47

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

01

GOO

Page 48: Data Compression

CompSci 100e 9.48

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

1

GOOD

Page 49: Data Compression

CompSci 100e 9.49

Decoding a message

11

6

I5

N

5

E

1

F

1

C

1

P

2

U

2

R

2

L

2

D

2

G

3

O

3

T

3

B

3

A

4

M

4

S2 3

445

68

6

8

16

10

21

11

12

2337

60

01100000100001001101

GOOD

Page 50: Data Compression

CompSci 100e 9.50

Decoding

1. Read in tree data O( )

2. Decode bit string with tree O( )

Page 51: Data Compression

CompSci 100e 9.51

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 52: Data Compression

CompSci 100e 9.52

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 53: Data Compression

CompSci 100e 9.53

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 54: Data Compression

CompSci 100e 9.54

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 55: Data Compression

CompSci 100e 9.55

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 56: Data Compression

CompSci 100e 9.56

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 57: Data Compression

CompSci 100e 9.57

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 58: Data Compression

CompSci 100e 9.58

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 59: Data Compression

CompSci 100e 9.59

Huffman Tree 2 “A SIMPLE STRING TO BE ENCODED USING A

MINIMAL NUMBER OF BITS” E.g. “ A SIMPLE”

“10101101001000101001110011100000”

Page 60: Data Compression

CompSci 100e 9.60

Building a Huffman tree Begin with a forest of single-node trees/tries (leaves)

Each node/tree/leaf is weighted with character count

Node stores two values: character and count

Repeat until there is only one node left: root of tree Remove two minimally weighted trees from forest Create new tree/internal node with minimal trees

as children,• Weight is sum of children’s weight (no char)

How does process terminate? Finding minimum? Remove minimal trees, hummm……

Page 61: Data Compression

CompSci 100e 9.61

How do we create Huffman Tree/Trie? Insert weighted values into priority queue

What are initial weights? Why?

Remove minimal nodes, weight by sums, re-insert Total number of nodes?

PriorityQueue<TreeNode> pq = new PriorityQueue<TreeNode>(); for(int k=0; k < freq.length; k++){ pq.add(new TreeNode(k,freq[k],null,null)); } while (pq.size() > 1){ TreeNode left = pq.remove(); TreeNode right = pq.remove(); pq.add(new TreeNode(0,left.weight+right.weight, left,right)); } TreeNode root = pq.remove();

Page 62: Data Compression

CompSci 100e 9.62

Creating compressed file Once we have new encodings, read every character

Write encoding, not the character, to compressed file

Why does this save bits? What other information needed in compressed

file?

How do we uncompress? How do we know foo.hf represents compressed

file? Is suffix sufficient? Alternatives?

Why is Huffman coding a two-pass method? Alternatives?

Page 63: Data Compression

CompSci 100e 9.63

Uncompression with Huffman We need the trie to uncompress

000100100010011001101111 As we read a bit, what do we do?

Go left on 0, go right on 1 When do we stop? What to do?

How do we get the trie? How did we get it originally? Store 256

int/counts• How do we read counts?

How do we store a trie? 20 Questions relevance?• Reading a trie? Leaf indicator? Node values?

Page 64: Data Compression

CompSci 100e 9.64

Other Huffman Issues What do we need to decode?

How did we encode? How will we decode? What information needed for decoding?

Reading and writing bits: chunks and stopping Can you write 3 bits? Why not? Why? PSEUDO_EOF BitInputStream and BitOutputStream: API

What should happen when the file won’t compress? Silently compress bigger? Warn user?

Alternatives?

Page 65: Data Compression

CompSci 100e 9.65

Huffman Complexities How do we measure? Size of input file, size of

alphabet Which is typically bigger?

Accumulating character counts: ______ How can we do this in O(1) time, though not

really Building the heap/priority queue from counts ____

Initializing heap guaranteed Building Huffman tree ____

Why? Create table of encodings from tree ____

Why? Write tree and compressed file _____

Page 66: Data Compression

CompSci 100e 9.66

Good Compsci 100(e) Assignment? Array of character/chunk counts, or is this a map?

Map character/chunk to count, why array? Priority Queue for generating tree/trie

Do we need a heap implementation? Why? Tree traversals for code generation, uncompression

One recursive, one not, why and which? Deal with bits and chunks rather than ints and chars

The good, the bad, the ugly Create a working compression program

How would we deploy it? Make it better? Benchmark for analysis

What’s a corpus?

Page 67: Data Compression

CompSci 100e 9.67

Other methods Adaptive Huffman coding Lempel-Ziv algorithms

Build the coding table on the fly while reading document

Coding table changes dynamically Protocol between encoder and decoder so

that everyone is always using the right coding scheme

Works well in practice (compress, gzip, etc.) More complicated methods

Burrows-Wheeler (bunzip2) PPM statistical methods

Page 68: Data Compression

CompSci 100e 9.68

Data CompressionYear Scheme Bit/

Char

1967

ASCII 7.00

1950

Huffman 4.70

1977

Lempel-Ziv (LZ77) 3.94

1984

Lempel-Ziv-Welch (LZW) – Unix compress

3.32

1987

(LZH) used by zip and unzip 3.30

1987

Move-to-front 3.24

1987

gzip 2.71

1995

Burrows-Wheeler 2.29

1997

BOA (statistical data compression)

1.99

Why is data compression important?

How well can you compress files losslessly? Is there a limit? How to

compare? How do you

measure how much information?