DATA COMPRESSION Simple Dictionary Compression Manish T I


Page 1: Simple Dictionary Compression

DATA COMPRESSION

Simple Dictionary Compression

Manish T I

Page 2: Simple Dictionary Compression

• It is a two-pass algorithm: the first pass analyzes the data in the source file, and the second pass compresses the data to a file.

First Pass:-

• The distinct bytes in the source file are identified.

• The number of times each byte occurs in the source file is counted.

• The bytes are sorted into a new list in descending order of frequency, so that the bytes (alphabets) with the highest counts appear at the top. This list is known as the dictionary.
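
The first pass can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the slides: the function name `build_dictionary` is hypothetical, and the rule for ordering bytes with equal counts (order of first appearance) is an assumption, since the slides do not specify one.

```python
from collections import Counter

def build_dictionary(data: bytes) -> list[int]:
    """First pass: return the distinct byte values, most frequent first."""
    counts = Counter(data)  # byte value -> number of occurrences
    # sorted() is stable, so bytes with equal counts keep the order in which
    # they first appear in the data (an assumed tie-break rule).
    return sorted(counts, key=lambda b: -counts[b])

dictionary = build_dictionary(b"TTVVVEGTVEN")
print(bytes(dictionary).decode("ascii"))  # VTEGN
```

For the sample data used later in the slides, V occurs most often, so it lands at the top of the dictionary.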

Page 3: Simple Dictionary Compression

Second Pass:-

• The source file is read again, byte by byte.

• Each byte is located in the dictionary by a direct search and its index is noted.

• The index can take 256 values, ranging from 0 to 255.

• The index is written to the compressed file, preceded by a 3-bit code denoting the index's length in bits.

Page 4: Simple Dictionary Compression

• Index Table

3-bit code (binary)   Code value   Index length (bits)
000                   0            1
001                   1            2
010                   2            3
011                   3            4
100                   4            5
101                   5            6
110                   6            7
111                   7            8

Page 5: Simple Dictionary Compression

Input File sample data : - TTVVVEGTVEN

Dictionary File : -

Page 6: Simple Dictionary Compression

Compressed File (4 – 11 bits)

Byte   Bits written   No. of bits used
T      00110          5
V      0001           4
V      0001           4
V      0001           4
E      00111          5
G      010100         6
T      00110          5
V      0001           4
E      00111          5
N      010101         6
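
The code words in the table above can be checked with a small decoder sketch. Two things here are assumptions rather than content from the slides: the dictionary order `VTEGN` (inferred from the byte frequencies in the sample data, since the dictionary file contents are not shown) and the `first_position=1` default, chosen because the bit patterns in the table are consistent with dictionary positions counted from 1 (V = 1, T = 2, E = 3, G = 4, N = 5). The function name `decode` is hypothetical.

```python
def decode(bits: str, dictionary: str, first_position: int = 1) -> str:
    """Decode a string of '0'/'1' characters back into dictionary symbols."""
    out = []
    pos = 0
    while pos < len(bits):
        length = int(bits[pos:pos + 3], 2) + 1   # 3-bit code 000..111 -> 1..8
        pos += 3
        index = int(bits[pos:pos + length], 2)   # the index itself
        pos += length
        out.append(dictionary[index - first_position])
    return "".join(out)

# The ten code words from the table, concatenated in order.
table_bits = ("00110" "0001" "0001" "0001" "00111"
              "010100" "00110" "0001" "00111" "010101")
print(decode(table_bits, "VTEGN"))  # TVVVEGTVEN
```

Decoding needs no search: each index is read and used directly as a position in the dictionary.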

Page 7: Simple Dictionary Compression

• Compression is achieved because the dictionary is sorted by the frequency of the bytes, so the most frequent bytes get the shortest indexes. Each byte is replaced by between 4 and 11 bits: a 3-bit length code followed by an index of 1 to 8 bits.

• Dictionary is not sorted by byte values.

• Disadvantage: compression is slow, although decompression is not.
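
The 4 to 11 bit range can be worked out with a short calculation. This is an illustrative sketch: the function name `cost_in_bits` is hypothetical, the index values are read off the code words in the example, and the space needed to store the dictionary itself is not counted.

```python
def cost_in_bits(index: int) -> int:
    """Bits written for one byte: 3-bit length code plus the index itself."""
    return 3 + max(1, index.bit_length())

print(cost_in_bits(0), cost_in_bits(255))  # 4 11

# Index values as they appear in the example's code words
# (T=2, V=1, E=3, G=4, N=5), in the order T V V V E G T V E N.
example_indexes = [2, 1, 1, 1, 3, 4, 2, 1, 3, 5]
compressed = sum(cost_in_bits(i) for i in example_indexes)
print(compressed, 8 * len(example_indexes))  # 48 80
```

The ten bytes of the example take 48 bits instead of 80. Any index of 32 or more costs more than the original 8 bits, so the method only pays off when a few byte values dominate the file.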

Page 8: Simple Dictionary Compression

Reference:-

Data Compression : The Complete Reference, David Salomon, Springer Science & Business Media, 2004

For any queries contact: Web: www.iprg.co.in, E-mail: [email protected], @ImageProcessingResearchGroup