Lossy Compression
15-211 Fundamental Data Structures and Algorithms
Peter Lee
February 18, 2003
Announcements
• Homework #4 is available. Due on Monday, March 17, 11:59pm. Get started now!
• Quiz #2: available on Tuesday, Feb. 25. Some questions will be easier if you have some parts of HW4 working.
• Read Chapter 8
HW4 is out!
Before we begin…
Eliza
• Eliza was one of the first AI programs (J. Weizenbaum, 1966).
• At the time, it impressed people who used it.
• Eliza has been implemented many, many times. GNU Emacs has one: try “M-x doctor”.
Eliza’s impact
• There are many stories of Eliza’s impact. Some people became so dependent that Weizenbaum eventually had to withdraw its use. Some psychiatrists saw Eliza as a way for the profession to handle many more patients.
• Eliza might be used for most patients, and the human doctor reserved for only the most serious cases.
Eliza’s rules
• Eliza is a remarkably simple program
• Some sample rules (pattern → response):
  X me Y → X you Y
  I remember X → Why do you remember X just now?
  My <family-member> is X → Who else in your family is X?
  X <family-member> Y → Tell me more about your family
  X → That is very interesting
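The sample rules can be tried out with a few lines of pattern matching. This is only a minimal sketch, not Weizenbaum's program: the names FAMILY, RULES, and respond are hypothetical, the list of family words is an assumption, and rules are simply tried in the order the slide lists them.

```python
import re

# Assumption: a small illustrative set standing in for <family-member>.
FAMILY = r"(mother|father|sister|brother)"

# (pattern, response template) pairs, tried in order; X and Y become groups.
RULES = [
    (r"(.*) me (.*)", r"\1 you \2"),
    (r"I remember (.*)", r"Why do you remember \1 just now?"),
    (rf"My {FAMILY} is (.*)", r"Who else in your family is \2?"),
    (rf".*\b{FAMILY}\b.*", "Tell me more about your family"),
    (r".*", "That is very interesting"),
]


def respond(line):
    """Return the response of the first rule whose pattern matches the whole line."""
    for pattern, template in RULES:
        match = re.fullmatch(pattern, line, flags=re.IGNORECASE)
        if match:
            return match.expand(template)
    return ""


if __name__ == "__main__":
    for sentence in ["I remember the beach", "My mother is kind", "The weather is nice"]:
        print(sentence, "->", respond(sentence))
```

The last rule matches anything, so every input gets some reply, which is a large part of why such a simple program could seem conversational.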
Why “Eliza”?
• The name was chosen for its ability to converse increasingly well
• The Greek legend of Pygmalion: the misogynist King of Cyprus fell in love with an ivory statue, Galatea. Taking pity, Aphrodite made Galatea come alive. Pygmalion then married Galatea.
Why “Eliza”? – cont’d
• George Bernard Shaw wrote a play, Pygmalion, based on the legend. Professor Higgins creates a “lady” from a low-class cockney flower vendor, Eliza Doolittle. It was first filmed in 1938.
• Later, it was adapted into the “politically correct” My Fair Lady.
Wrap-Up on LZW Compression
Byte method LZW
• We start with a trie that contains a root and n children: one child for each possible character, with the children labeled 0…n-1.
• Then we compress as before, by walking down the trie. But after emitting a code and growing the trie, we must start from the root’s child labeled c, where c is the character that caused us to grow the trie.
LZW: Byte method example
• Suppose our entire character set consists only of the four letters: {a, b, c, d}
• Let’s consider the compression of the string baddad
Byte LZW: Compress example
Input: baddad
Initial dictionary: a = 0, b = 1, c = 2, d = 3
Step  Output so far  Entry added to dictionary
0     (none)         (none)
1     1              4 = ba
2     10             5 = ad
3     103            6 = dd
4     1033           7 = da
5     10335          (none, end of input)
Byte LZW output
• So, the input baddad
• compresses to 10335
• which again can be given in bit form, just like in the binary method…
• …or compressed again using Huffman
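The compression walk-through above can be sketched in a few lines. This is an illustrative sketch only: it uses a Python dict of strings in place of the trie from the slides, and the name lzw_compress and its alphabet parameter are hypothetical. It assumes every input character appears in the alphabet.

```python
def lzw_compress(text, alphabet):
    """Byte-method LZW: return the list of codes emitted for text."""
    # Initial dictionary: one code per character, numbered from 0.
    codes = {ch: i for i, ch in enumerate(alphabet)}
    output = []
    current = ""  # longest dictionary string matched so far
    for ch in text:
        if current + ch in codes:
            current += ch                       # keep walking down the "trie"
        else:
            output.append(codes[current])       # emit code for the match
            codes[current + ch] = len(codes)    # grow the dictionary
            current = ch                        # restart from the root's child ch
    if current:
        output.append(codes[current])           # flush the final match
    return output


if __name__ == "__main__":
    print(lzw_compress("baddad", "abcd"))
```

On the input baddad with the alphabet abcd, this reproduces the codes 1, 0, 3, 3, 5 from the slides.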
Byte LZW: Uncompress example
• The uncompress step for byte LZW is the most complicated part of the entire process, but is largely similar to the binary method
Input: 10335
Initial dictionary: a = 0, b = 1, c = 2, d = 3
Step  Code read  Output so far  Entry added to dictionary
0     (none)     (none)         (none)
1     1          b              (none)
2     0          ba             4 = ba
3     3          bad            5 = ad
4     3          badd           6 = dd
5     5          baddad         7 = da
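The uncompress steps can be sketched the same way. Again this is a hedged sketch rather than the course's reference code: lzw_decompress is a hypothetical name, and a dict from codes to strings stands in for the trie. The else branch handles the one tricky case, a code that refers to the entry currently being built.

```python
def lzw_decompress(codes, alphabet):
    """Byte-method LZW: rebuild the text from a list of codes."""
    # Initial dictionary: code -> single character, numbered from 0.
    strings = {i: ch for i, ch in enumerate(alphabet)}
    if not codes:
        return ""
    previous = strings[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in strings:
            entry = strings[code]
        else:
            # The code names the entry being defined right now, which must be
            # the previous string plus its own first character.
            entry = previous + previous[0]
        pieces.append(entry)
        # Same growth rule as the compressor: previous match + next character.
        strings[len(strings)] = previous + entry[0]
        previous = entry
    return "".join(pieces)


if __name__ == "__main__":
    print(lzw_decompress([1, 0, 3, 3, 5], "abcd"))
```

Note that the dictionary is rebuilt one step behind the compressor, which is why entry 4 only appears after the second code has been read.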
LZW applications
• LZW is an extremely useful lossless method for compressing data
• LZW is used in the GIF and compressed TIFF standards for image data
• Unisys holds the patent on LZW, but allows free noncommercial use
Quiz Break
LZW performance
• Suppose we have a file of N a’s: aaaa…a
1. What would the output look like after LZW compression?
2. What, roughly, is the size of the output (in big-Oh terms)?
3. How big would the output be if we used Huffman instead?
Lossy Compression
Lossy compression
• Often, we can tolerate some loss of data through the compress/decompress cycle.
• Images, and especially video/audio, can be huge. HDTV bit rate is >1Gbps! This is a big problem for storage and networks.
Techniques
• Lossy compression is based on mathematical transformations
• Discrete Cosine Transform (DCT): used in the JPEG algorithm
• Wavelet-based image compression: used in MPEG-4
• Many, many, others…
Image files
One-dimensional array of 3 × width × height bytes
Three two-dimensional arrays, one for each color component
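The two views of an image file can be related with a short conversion. This is a sketch under assumptions the slide does not state: pixels are stored row by row, with the red, green, and blue bytes of each pixel interleaved. The name split_planes is hypothetical.

```python
def split_planes(pixels, width, height):
    """Split a flat array of 3 * width * height bytes into [red, green, blue] planes.

    Assumes row-major order with interleaved R, G, B bytes per pixel.
    """
    if len(pixels) != 3 * width * height:
        raise ValueError("expected 3 * width * height bytes")
    planes = [[[0] * width for _ in range(height)] for _ in range(3)]
    for row in range(height):
        for col in range(width):
            base = 3 * (row * width + col)  # index of this pixel's red byte
            for component in range(3):
                planes[component][row][col] = pixels[base + component]
    return planes


if __name__ == "__main__":
    red, green, blue = split_planes(bytes([10, 20, 30, 40, 50, 60]), width=2, height=1)
    print(red, green, blue)
```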
Consider this color image
This is part of a famous image (Do you know who? Hint: Splay)
The image is a 16x16 bitmap image, enlarged.
[Figure: the 16x16 color image]
Here is the Red part of the image
[Figure: red component of the image]
Green Part
[Figure: green component of the image]
Blue Part
[Figure: blue component of the image]
The red image, again
173 165 165 165 148 132 123 132 140 156 173 181 181 181 189 173
198 189 189 189 181 165 148 165 165 173 181 198 206 198 181 165
206 206 206 206 198 189 181 181 198 206 206 222 231 214 181 165
231 222 206 198 189 181 181 181 206 222 222 222 231 222 198 181
231 214 189 173 165 165 173 181 181 189 198 222 239 231 206 214
206 189 173 148 148 148 148 165 156 148 165 198 222 231 214 239
181 165 140 123 123 115 115 123 140 148 140 148 165 206 239 247
165 82 66 82 90 82 90 107 123 123 115 132 140 165 198 231
123 198 74 49 57 82 82 99 107 115 115 123 132 132 148 214
239 239 107 82 82 74 90 107 123 115 115 123 115 115 123 198
255 90 74 74 99 74 115 123 132 123 123 115 115 140 165 189
247 99 99 82 90 107 123 123 123 123 123 132 140 156 181 198
247 239 165 132 107 148 140 132 132 123 132 148 140 140 156 214
198 231 165 156 132 156 156 140 140 140 148 148 132 140 156 222
247 239 222 181 181 140 156 140 148 148 148 140 132 156 206 222
214 198 181 181 181 181 173 148 156 148 140 140 165 198 222 239
Byte values (0…255) indicate intensity of the color at each pixel
JPEG
JPEG
• Joint Photographic Experts Group. Voted as an international standard in 1992. Works well for both color and grayscale images.
• Many steps in the algorithm, some requiring sophistication in mathematics.
• We’ll skip many parts and focus on just the main elements of JPEG.
JPEG in a nutshell
R, G, B → Y, I, Q: RGB to YIQ (optional)
for each plane (scan)
  for each 8x8 block: DCT → Quant → Zig-zag → DPCM, RLE → Huffman → 11010001…
Linear transform coding
• For video, audio, or images, one key first step of the compression will be to encode values over regions of time or space
• The basic strategy is to select a set of linear basis functions φ_i that span the space (sin, cos, wavelets, …), defined at discrete points.
Linear transform coding
• Coefficients:
• In matrix notation:
Where A is an nxn matrix, and each row defines a basis function
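The two equations on this slide did not survive extraction. One standard way to write them is sketched below; the symbols x_j for the input samples, \Theta_i for the coefficients, and a_{ij} for the matrix entries are assumptions, chosen only to agree with the statement that each row of A defines a basis function.

```latex
% Coefficients: project the input samples x_1..x_n onto basis function phi_i
\Theta_i = \sum_{j=1}^{n} x_j \,\phi_i(j) = \sum_{j=1}^{n} a_{ij}\, x_j ,
\qquad a_{ij} = \phi_i(j)

% In matrix notation, with row i of A holding phi_i sampled at the n points:
\Theta = A\,x
```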
Cosine transform
Discrete Cosine Transform
• DCT separates the image into spectral sub-bands of differing importance
• With input image A, the output coefficients B are given by the following equation:
N1 and N2 give the image’s height and width
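The equation itself is missing from this slide. As a stand-in, the sketch below implements one common unnormalized form of the two-dimensional DCT, B(k1,k2) = Σ_i Σ_j 4·A(i,j)·cos[π·k1·(2i+1)/(2·N1)]·cos[π·k2·(2j+1)/(2·N2)]. The scale factor of 4 is an assumption, since conventions differ, and the name dct2 is hypothetical. It is a direct, slow evaluation meant only to show what each coefficient measures.

```python
import math


def dct2(a):
    """Direct (O(N^4)) 2-D DCT of a rectangular list-of-lists a.

    Assumed form: B[k1][k2] = sum_i sum_j 4 * A[i][j]
        * cos(pi * k1 * (2i + 1) / (2 * N1)) * cos(pi * k2 * (2j + 1) / (2 * N2))
    """
    n1, n2 = len(a), len(a[0])  # image height and width
    b = [[0.0] * n2 for _ in range(n1)]
    for k1 in range(n1):
        for k2 in range(n2):
            total = 0.0
            for i in range(n1):
                for j in range(n2):
                    total += (4 * a[i][j]
                              * math.cos(math.pi * k1 * (2 * i + 1) / (2 * n1))
                              * math.cos(math.pi * k2 * (2 * j + 1) / (2 * n2)))
            b[k1][k2] = total
    return b


if __name__ == "__main__":
    flat = [[100] * 8 for _ in range(8)]  # a perfectly uniform 8x8 block
    coeffs = dct2(flat)
    print(coeffs[0][0], max(abs(c) for row in coeffs for c in row[1:]))
```

For a uniform block, all of the energy lands in the upper-left (lowest-frequency) coefficient, which is the property the later quantization and zig-zag steps exploit.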
Basis functions
Quantization
• The purpose of quantization is to encode an entire region of values into a single value.
• For example, we can simply delete low-order bits: 101101 could be encoded as 1011 or 101.
• When dividing by a power of two, this amounts to deleting whole bits.
• Other division constants give finer control over bit loss.
• JPEG uses a standard quantization table
JPEG quantization table
[Figure: the 8x8 quantization table q]
Each B(k1,k2) is divided by q(k1,k2). The eye is most sensitive to low frequencies (upper-left).
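The division step can be sketched as follows. The table on the slide did not survive, so this example uses the widely reproduced example luminance table from Annex K of the JPEG standard; whether the slide showed exactly this table is an assumption. The names JPEG_LUMA_Q, quantize, and dequantize are hypothetical.

```python
# Example luminance quantization table (JPEG standard, Annex K).
# Assumption: the slide's table q may have been this one or a scaled variant.
JPEG_LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(block, q=JPEG_LUMA_Q):
    """Divide each coefficient B(k1,k2) by q(k1,k2) and round to an integer."""
    return [[round(b / d) for b, d in zip(b_row, q_row)]
            for b_row, q_row in zip(block, q)]


def dequantize(block, q=JPEG_LUMA_Q):
    """Approximate inverse: multiply back. The rounding loss is not recovered."""
    return [[b * d for b, d in zip(b_row, q_row)]
            for b_row, q_row in zip(block, q)]


if __name__ == "__main__":
    # Dividing by a power of two deletes whole low-order bits.
    print(bin(0b101101 // 4), bin(0b101101 // 8))
    block = [[2 * d + 1 for d in row] for row in JPEG_LUMA_Q]
    print(quantize(block)[0])
```

The divisors grow toward the lower-right of the table, so high-frequency coefficients lose more precision than the low-frequency ones the eye is most sensitive to.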
Zig-zag scan
• Purpose is to convert 8x8 block into a 1x64 vector, with low-frequency coefficients at the front
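The scan order can be generated by walking the anti-diagonals of the block and alternating direction on each one. This is a small sketch; the name zigzag_order is hypothetical, and positions are given as (row, column) pairs with (0, 0) the upper-left, lowest-frequency coefficient.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):               # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)            # even diagonals run bottom-left to top-right
        for row in rows:
            order.append((row, s - row))
    return order


def zigzag_scan(block):
    """Flatten a square block into a 1 x n*n vector, low frequencies first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


if __name__ == "__main__":
    print(zigzag_order()[:10])
```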
Final stages
• The DPCM (differential pulse code modulation) and RLE (run length encoding) steps take advantage of a common characteristic of many images:
• An 8x8 block is often not too different from the previous one.
• Within a block, there are often long sequences of zeros.
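Both ideas can be illustrated with two small functions. This is a simplified sketch and not the actual JPEG symbol format: dpcm stores each value as a difference from the previous one (in JPEG this is applied to the first coefficient of successive blocks), and run_length_zeros replaces runs of zeros with counts. Both function names and the "EOB" end marker are hypothetical.

```python
def dpcm(values):
    """Differential coding: replace each value by its difference from the previous one."""
    differences = []
    previous = 0
    for value in values:
        differences.append(value - previous)
        previous = value
    return differences


def run_length_zeros(coefficients):
    """Encode as (number of preceding zeros, non-zero value) pairs.

    A trailing run of zeros is replaced by a single "EOB" (end of block) marker.
    """
    pairs = []
    run = 0
    for coefficient in coefficients:
        if coefficient == 0:
            run += 1
        else:
            pairs.append((run, coefficient))
            run = 0
    if run:
        pairs.append("EOB")
    return pairs


if __name__ == "__main__":
    print(dpcm([52, 55, 54, 54]))
    print(run_length_zeros([5, 0, 0, -2, 1, 0, 0, 0]))
```

Small differences and short pair lists are exactly the kind of skewed data that the final Huffman stage compresses well.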
Example: GIF
472KB
Example: JPEG at max quality
378KB
Example: JPEG at 50%
62KB
Example: JPEG at 25%
47KB
Example: JPEG at min quality
28KB
SVD
Matrix decomposition
• Suppose A is an m × n matrix, e.g.:
A =
120 100 120 100
 10  10  10  10
 60  60  70  80
150 120 150 150
• We can decompose A into three matrices, U, S, and V, such that
A = U S V^T
Decomposition example
A =
120 100 120 100
 10  10  10  10
 60  60  70  80
150 120 150 150
U =
0.5709 -0.6772 -0.4532  0.1009
0.0516 -0.0005 -0.1539 -0.9867
0.3500  0.7121 -0.5984  0.1113
0.7409  0.1854  0.6425 -0.0615
S =
386.154  0       0      0
0       20.6541  0      0
0        0       7.5842 0
0        0       0      0.9919
V =
0.5209 -0.5194  0.6004 -0.3137
0.4338 -0.1330 -0.7461 -0.4873
0.5300 -0.1746 -0.1886  0.8081
0.5095  0.8259  0.2176 -0.1049
U is orthonormal: U U^T = I
V is orthonormal: V V^T = I
S is diagonal, with decreasing singular values
Singular value decomposition
• Such a factoring of a matrix, or decomposition, is called an SVD.
• Exactly how to find U, V, and S is beyond the scope of this course. But you’ll find out in your matrix/linear algebra course…
• Note: it is also very important for graphics/animation algorithms.
So what about compression?
• Let: s_i be the ith singular value in S, U_i be the ith column in U, and V_i be the ith column in V.
• Then, another formula for matrix A is
A = s_1 U_1 V_1^T + s_2 U_2 V_2^T + … + s_K U_K V_K^T
SVD example
A1 = s_1 U_1 V_1^T =
115  96 117 112
 10   9  11  10
 70  59  72  69
149 124 152 146
This is called the “rank-1 approximation”.
Let’s form a rank-1 sum
A1 = s_1 U_1 V_1^T
• A1 =
115  96 117 112
 10   9  11  10
 70  59  72  69
149 124 152 146
• Error matrix |A - A1| is
 5  4  3 12
 0  1  1  0
10  1  2 11
 1  4  2  4
The error is relatively small for a rank-1 approximation.
What do we learn here?
• To compute A1 we only need: just one column from U, just one column from V, and just one singular value.
• And we get: a pretty good approximation to the original matrix, using 9 bytes (4 + 4 + 1) instead of 16.
• A big savings in storage!
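The rank-1 numbers above can be checked directly from the nine stored values. The sketch below uses plain Python lists rather than a linear algebra library; the names rank_one and mean_abs_error are hypothetical, and the constants are copied from the first singular value and the first columns of U and V in the decomposition example.

```python
# Values copied from the decomposition example: s1, column 1 of U, column 1 of V.
S1 = 386.154
U1 = [0.5709, 0.0516, 0.3500, 0.7409]
V1 = [0.5209, 0.4338, 0.5300, 0.5095]

A = [
    [120, 100, 120, 100],
    [10, 10, 10, 10],
    [60, 60, 70, 80],
    [150, 120, 150, 150],
]


def rank_one(s, u, v):
    """Outer product s * u * v^T as a list of rows."""
    return [[s * ui * vj for vj in v] for ui in u]


def mean_abs_error(a, b):
    """Average of |a - b| over all entries."""
    diffs = [abs(x - y) for a_row, b_row in zip(a, b) for x, y in zip(a_row, b_row)]
    return sum(diffs) / len(diffs)


# Round to whole numbers, as the slides do when displaying A1.
A1 = [[round(x) for x in row] for row in rank_one(S1, U1, V1)]

if __name__ == "__main__":
    for row in A1:
        print(row)
    print("mean error:", mean_abs_error(A, A1))
```

Adding further terms s_i U_i V_i^T to the same sum gives the higher-rank approximations.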
How about a rank-2 approximation?
A2 = s_1 U_1 V_1^T + s_2 U_2 V_2^T
• We get A2 =
122  98 119 100
 10   9  11  10
 62  57  69  81
147 123 151 149
• Error matrix |A - A2| is
2 2 1 0
0 1 1 0
2 3 1 1
3 3 1 1
Analysis
To get an idea of how close the approximation is to the original matrix, we can calculate:
• Mean of rank-1 error matrix = 3.8125
• Mean of rank-2 error matrix = 1.3750
Here the mean is the average of all the entries.
• We really don’t gain much by calculating the rank-2 approximation (why?)
SVD example
The singular values in S are 386.154, 20.6541, 7.5842, and 0.9919.
The first singular value is significantly larger than the rest.
Observation
• The contribution from the first term, s_1 U_1 V_1^T, is very significant compared to the sum of all the other terms.
• So even if you leave out all the other terms, you still get a pretty good approximation with just two vectors.
Some samples (128x128)
Original image: 49K
Rank 1 approx: 311 bytes
Samples cont’d…
Rank 16 approx: 13K
Rank 8 approx: 7K
Some size observations
• Note that theoretically the sizes of the compressed images should be:
  Rank 1 = 54 + (128 + 128 + 1)*3
  Rank 8 = 54 + (128 + 128 + 1)*3*8 = 6K
  Rank 16 = 54 + (128 + 128 + 1)*3*16 = 12K
  Rank 32 = 54 + (128 + 128 + 1)*3*32 = 24K
  Rank 64 = 48K (pretty close to the original)
• In each formula, 54 is the BMP header, 128 + 128 + 1 counts the entries of U1, V1, and S1, and 3 is the number of bytes per pixel.
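The size formula can be wrapped in a small helper to tabulate the numbers. This is just the slide's arithmetic restated; the name svd_image_size and its parameter names are hypothetical, and it follows the slide in counting each stored entry as one byte per color component.

```python
def svd_image_size(rank, height=128, width=128, bytes_per_pixel=3, header=54):
    """Theoretical size in bytes of a rank-k SVD approximation of an image.

    Each rank contributes one column of U (height entries), one column of V
    (width entries), and one singular value, for each color component.
    """
    return header + (height + width + 1) * bytes_per_pixel * rank


if __name__ == "__main__":
    for k in (1, 8, 16, 32, 64):
        print(f"rank {k:2d}: {svd_image_size(k)} bytes")
```

By this count, rank 64 already costs about as much as the 49K original, so the approach only pays off at ranks well below half the image dimension.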
Matlab Code for SVD
• Matlab is a numerical computing environment (www.mathworks.com)
• Here is Matlab code that can perform SVD on an image.
A = imread('c:\temp\rhino64', 'bmp');
N = size(A, 1);
R = A(:,:,1); % extract red matrix
G = A(:,:,2); % extract green matrix
B = A(:,:,3); % extract blue matrix
• Apply SVD to each of the matrices:
[ur,sr,vr] = svd(double(R));
[ug,sg,vg] = svd(double(G));
[ub,sb,vb] = svd(double(B));
Complete Matlab Code for SVD ctd..
A = imread('c:\temp\rosemary', 'bmp');
s = size(A)
% imagesc(A)
R = A(:,:,1); G = A(:,:,2); B = A(:,:,3);
[ur,sr,vr] = svd(double(R));
[ug,sg,vg] = svd(double(G));
[ub,sb,vb] = svd(double(B));
% initialize matrices to zero matrices
Rk = zeros(s(1),s(2));
Gk = zeros(s(1),s(2));
Bk = zeros(s(1),s(2));
k = 8; % k is the desired rank
% form the rank sums
for i=1:k, Rk = Rk + sr(i,i)*ur(:,i)*transpose(vr(:,i)); end
for i=1:k, Gk = Gk + sg(i,i)*ug(:,i)*transpose(vg(:,i)); end
for i=1:k, Bk = Bk + sb(i,i)*ub(:,i)*transpose(vb(:,i)); end
% now form the rank-k approximation of A
Ak = A;
Ak(:,:,1) = Rk; Ak(:,:,2) = Gk; Ak(:,:,3) = Bk;
% now plot the rank-k approximation of the image
imagesc(Ak)
Matlab outputs
[Figures: two sample images, each shown as the original and as rank-1, rank-4, and rank-8 approximations]
Original images were approximately 128x128
Adaptive rank methods
• All popular image compression programs apply the compression algorithm to sub-blocks of the image.
• This exploits the uneven characteristics of the original image.
• If parts of the image are less complex than the others, then fewer singular values are needed to obtain a "close" approximation.
Adaptive Rank Methods ctd..
• So instead of picking the same rank for each sub-block, we decide how many singular values to pick from each sub-block by looking at the following:
• Percent of r values = (s_1 + s_2 + … + s_r) / (s_1 + s_2 + … + s_k)
where k is the maximum number of non-zero singular values of A.
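One way to turn this ratio into a rule is to pick, for each sub-block, the smallest r whose ratio reaches a chosen threshold. The sketch below shows that selection for a single list of singular values; the name choose_rank and the idea of a fixed threshold per block are assumptions, not necessarily how the results that follow were produced.

```python
def choose_rank(singular_values, fraction):
    """Smallest r with (s_1 + ... + s_r) / (s_1 + ... + s_k) >= fraction.

    singular_values must be in decreasing order, as in the matrix S.
    """
    total = sum(singular_values)
    running = 0.0
    for r, s in enumerate(singular_values, start=1):
        running += s
        if running >= fraction * total:
            return r
    return len(singular_values)


if __name__ == "__main__":
    s = [386.154, 20.6541, 7.5842, 0.9919]  # singular values from the 4x4 example
    for fraction in (0.90, 0.95, 0.99):
        print(fraction, "->", choose_rank(s, fraction))
```

For the 4x4 example, the first singular value alone already accounts for about 93% of the sum, so a 90% threshold selects rank 1.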
Results of Adaptive Ranking Method
• We applied the adaptive ranking method to an image of Danny Sleator. Here are the results:
  Original: 49K
  80% of singular values: 26K
  50% of singular values: 15K
  10% of singular values: 14K