112
61A Extra Lecture 4 Thursday, February 19

61A Extra Lecture 4

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

61A Extra Lecture 4

Thursday, February 19

0100010101101110011000110110111101100100011010010110111001100111

0100010101101110011000110110111101100100011010010110111001100111(Encoding)

What’s the point?

3http://pixshark.com/confused-face-clip-art.htm

What’s the point?

• Why do we encode things?

3http://pixshark.com/confused-face-clip-art.htm

What’s the point?

• Why do we encode things?

• You don’t speak binary

3http://pixshark.com/confused-face-clip-art.htm

What’s the point?

• Why do we encode things?

• You don’t speak binary

• Computers don’t speak English

3http://pixshark.com/confused-face-clip-art.htm

What’s the point?

• Why do we encode things?

• You don’t speak binary

• Computers don’t speak English

3http://pixshark.com/confused-face-clip-art.htm

A First Attempt

4

A First Attempt

• Let’s use an encoding

4

A First Attempt

• Let’s use an encoding

4

Letter Binary Letter Binarya 0 n 1b 1 o 0c 0 p 1d 1 q 1e 1 r 0f 0 s 1g 0 t 0h 1 u 0i 1 v 1j 1 w 1k 0 x 1l 1 y 0

m 1 z 0

Analysis

5

Analysis

Pros

5

Analysis

Pros

• Encoding was easy

5

Analysis

Pros

• Encoding was easy

• Took a very small amount of space

5

Analysis

Pros

• Encoding was easy

• Took a very small amount of space

Cons

5

Analysis

Pros

• Encoding was easy

• Took a very small amount of space

Cons

• Decoding it was impossible

5

Decoding

6

Decoding

• Encoding by itself is useless

6

Decoding

• Encoding by itself is useless

• Decoding is also necessary

6

Decoding

• Encoding by itself is useless

• Decoding is also necessary

• So… we need more bits

6

Decoding

• Encoding by itself is useless

• Decoding is also necessary

• So… we need more bits

• How many bits do we need?

6

Decoding

• Encoding by itself is useless

• Decoding is also necessary

• So… we need more bits

• How many bits do we need?

• lowercase alphabet

6

Decoding

• Encoding by itself is useless

• Decoding is also necessary

• So… we need more bits

• How many bits do we need?

• lowercase alphabet

• 5 bits

6

A Second Attempt

7

A Second Attempt

• Let’s try another encoding

7

A Second Attempt

• Let’s try another encoding

7

Letter Binary Letter Binarya 00000 n 01101b 00001 o 01110c 00010 p 01111d 00011 q 10000e 00100 r 10001f 00101 s 10010g 00110 t 10011h 00111 u 10100i 01000 v 10101j 01001 w 10110k 01010 x 10111l 01011 y 11000

m 01100 z 11001

Analysis

8

Analysis

Pros

8

Analysis

Pros

• Encoding was easy

8

Analysis

Pros

• Encoding was easy

• Decoding was possible

8

Analysis

Pros

• Encoding was easy

• Decoding was possible

Cons

8

Analysis

Pros

• Encoding was easy

• Decoding was possible

Cons

• Takes more space…

8

Analysis

Pros

• Encoding was easy

• Decoding was possible

Cons

• Takes more space…

• What restriction did we place that’s unnecessary?

8

Analysis

Pros

• Encoding was easy

• Decoding was possible

Cons

• Takes more space…

• What restriction did we place that’s unnecessary?

• Fixed length

8

Variable Length Encoding

9

Variable Length Encoding

• Problems?

9

Variable Length Encoding

• Problems?

• When do we start and stop?

9

Variable Length Encoding

• Problems?

• When do we start and stop?

• String of As and Bs: ABA

9

Variable Length Encoding

• Problems?

• When do we start and stop?

• String of As and Bs: ABA

• A - 00, B - 0

9

Variable Length Encoding

• Problems?

• When do we start and stop?

• String of As and Bs: ABA

• A - 00, B - 0

• Encode ABA: 00000

9

Variable Length Encoding

• Problems?

• When do we start and stop?

• String of As and Bs: ABA

• A - 00, B - 0

• Encode ABA: 00000

• Decode 00000:

9

Variable Length Encoding

• Problems?

• When do we start and stop?

• String of As and Bs: ABA

• A - 00, B - 0

• Encode ABA: 00000

• Decode 00000:

• ABA, AAB, BAA?

9

Variable Length Encoding

• Problems?

• When do we start and stop?

• String of As and Bs: ABA

• A - 00, B - 0

• Encode ABA: 00000

• Decode 00000:

• ABA, AAB, BAA?

• What lengths do we use?

9

A Second Look at Fixed Length

10

A Second Look at Fixed Length

10

Letter Binary Letter Binarya 00000 n 01101b 00001 o 01110c 00010 p 01111d 00011 q 10000e 00100 r 10001f 00101 s 10010g 00110 t 10011h 00111 u 10100i 01000 v 10101j 01001 w 10110k 01010 x 10111l 01011 y 11000

m 01100 z 11001

11

Trees!

11

Trees!

Letter Binary

A 00

B 01

C 1

11

Trees!

0 1

Letter Binary

A 00

B 01

C 1

11

Trees!

0 1

C

Letter Binary

A 00

B 01

C 1

11

Trees!

0 1

0 1C

Letter Binary

A 00

B 01

C 1

11

Trees!

0 1

B

0 1C

Letter Binary

A 00

B 01

C 1

11

Trees!

0 1

A B

0 1C

Letter Binary

A 00

B 01

C 1

12

What happens when…?

Letter Binary Letter Binarya 0 n 1b 1 o 0c 0 p 1d 1 q 1e 1 r 0f 0 s 1g 0 t 0h 1 u 0i 1 v 1j 1 w 1k 0 x 1l 1 y 0

m 1 z 0

12

What happens when…?

Letter Binary Letter Binarya 0 n 1b 1 o 0c 0 p 1d 1 q 1e 1 r 0f 0 s 1g 0 t 0h 1 u 0i 1 v 1j 1 w 1k 0 x 1l 1 y 0

m 1 z 0

• Rule 1: Each leaf only has 1 label

What happens when…?

13

Letter Binary

A 00

B 0

What happens when…?

13

• Rule 2: Only leaves get labels

Letter Binary

A 00

B 0

An Optimal Encoding

14

An Optimal Encoding

14

• Start with a tree

0 1

A B

0 1C

An Optimal Encoding

14

• Start with a tree

• What kinds of things do we want to encode with this?

0 1

A B

0 1C

An Optimal Encoding

14

• Start with a tree

• What kinds of things do we want to encode with this?

• What letter do we want to appear the most?

0 1

A B

0 1C

An Optimal Encoding

14

• Start with a tree

• What kinds of things do we want to encode with this?

• What letter do we want to appear the most?

• How about the least?

0 1

A B

0 1C

An Optimal Encoding

14

• Start with a tree

• What kinds of things do we want to encode with this?

• What letter do we want to appear the most?

• How about the least?

• This is called a Huffman Encoding

0 1

A B

0 1C

Huffman Encoding

15

Huffman Encoding

15

• Let’s pretend we want to come up with the optimal encoding:

Huffman Encoding

15

• Let’s pretend we want to come up with the optimal encoding:

• AAAAAAAAAABBBBBCCCCCCCDDDDDDDDD

Huffman Encoding

15

• Let’s pretend we want to come up with the optimal encoding:

• AAAAAAAAAABBBBBCCCCCCCDDDDDDDDD

• A appears 10 times

Huffman Encoding

15

• Let’s pretend we want to come up with the optimal encoding:

• AAAAAAAAAABBBBBCCCCCCCDDDDDDDDD

• A appears 10 times

• B appears 5 times

Huffman Encoding

15

• Let’s pretend we want to come up with the optimal encoding:

• AAAAAAAAAABBBBBCCCCCCCDDDDDDDDD

• A appears 10 times

• B appears 5 times

• C appears 7 times

Huffman Encoding

15

• Let’s pretend we want to come up with the optimal encoding:

• AAAAAAAAAABBBBBCCCCCCCDDDDDDDDD

• A appears 10 times

• B appears 5 times

• C appears 7 times

• D appears 9 times

Huffman Encoding

16

Huffman Encoding

16

• Start with the two smallest frequencies

Huffman Encoding

16

• Start with the two smallest frequencies

• A appears 10 times, B appears 5 times, C appears 7 times, D appears 9 times

Huffman Encoding

16

• Start with the two smallest frequencies

• A appears 10 times, B appears 5 times, C appears 7 times, D appears 9 times

A

B

C

D

Huffman Encoding

16

• Start with the two smallest frequencies

• A appears 10 times, B appears 5 times, C appears 7 times, D appears 9 times

A

B

C

D

Huffman Encoding

16

• Start with the two smallest frequencies

• A appears 10 times, B appears 5 times, C appears 7 times, D appears 9 times

A

B

C

D

0 1

A

D

B C

Huffman Encoding

17

Huffman Encoding

17

• Continue…

Huffman Encoding

17

• Continue…

• A appears 10 times, B & C appear a combined 12 times, D appears 9 times

Huffman Encoding

17

• Continue…

• A appears 10 times, B & C appear a combined 12 times, D appears 9 times

0 1

A

D

B C

Huffman Encoding

17

• Continue…

• A appears 10 times, B & C appear a combined 12 times, D appears 9 times

0 1

A

D

B C

Huffman Encoding

17

• Continue…

• A appears 10 times, B & C appear a combined 12 times, D appears 9 times

0 1

A

D

B C

0 1

B C

0 1

A D

Huffman Encoding

18

Huffman Encoding

18

• And finally…

Huffman Encoding

18

• And finally…

0 1

B C

0 1

A D

Huffman Encoding

18

• And finally…

0 1

B C

0 1

A D

Huffman Encoding

18

• And finally…

0 1

B C

0 1

A D

B D

0 1

C

0 1

A

0 1

Huffman Encoding

19

Huffman Encoding

19

• Another example…

Huffman Encoding

19

• Another example…

• AAAAAAAAAABCCD

Huffman Encoding

19

• Another example…

• AAAAAAAAAABCCD

• A appears 10 times

Huffman Encoding

19

• Another example…

• AAAAAAAAAABCCD

• A appears 10 times

• B appears 1 time

Huffman Encoding

19

• Another example…

• AAAAAAAAAABCCD

• A appears 10 times

• B appears 1 time

• C appears 2 times

Huffman Encoding

19

• Another example…

• AAAAAAAAAABCCD

• A appears 10 times

• B appears 1 time

• C appears 2 times

• D appears 1 time

Huffman Encoding

20

Huffman Encoding

20

• Start with the two smallest frequencies

Huffman Encoding

20

• Start with the two smallest frequencies

• A appears 10 times, B appears 1 time, C appears 2 times, D appears 1 time

Huffman Encoding

20

• Start with the two smallest frequencies

• A appears 10 times, B appears 1 time, C appears 2 times, D appears 1 time

A

B

C

D

Huffman Encoding

20

• Start with the two smallest frequencies

• A appears 10 times, B appears 1 time, C appears 2 times, D appears 1 time

A

B

C

D

Huffman Encoding

20

• Start with the two smallest frequencies

• A appears 10 times, B appears 1 time, C appears 2 times, D appears 1 time

A

B

C

D

0 1

A

C

B D

Huffman Encoding

21

Huffman Encoding

21

• Start with the two smallest frequencies

Huffman Encoding

21

• Start with the two smallest frequencies

• A appears 10 times, B & D appear a combined 2 times, C appears 2 times

Huffman Encoding

21

• Start with the two smallest frequencies

• A appears 10 times, B & D appear a combined 2 times, C appears 2 times

0 1

A

C

B D

Huffman Encoding

21

• Start with the two smallest frequencies

• A appears 10 times, B & D appear a combined 2 times, C appears 2 times

0 1

A

C

B D

Huffman Encoding

21

• Start with the two smallest frequencies

• A appears 10 times, B & D appear a combined 2 times, C appears 2 times

0 1

A

C

B D

0 1

C 0 1

B D

A

Huffman Encoding

22

Huffman Encoding

22

• And finally…

Huffman Encoding

22

• And finally…

0 1

C 0 1

B D

A

Huffman Encoding

22

• And finally…

0 1

C 0 1

B D

A

Huffman Encoding

22

• And finally…

0 1

C 0 1

B D

0 1

A

0 1

C 0 1

B D

A