
CSE 126 Multimedia Systems P. Venkat Rangan Spring 2003 Lecture Note 5 (April 15)

JPEG Encoding

There are four main steps in the JPEG encoding scheme.

1. Picture Preparation

   o Separate the Y, U, and V components (planes)

   o Subsample both U and V by 4x4 pixel regions (i.e. each 4x4 region becomes 1 mega-pixel)

   o Split each plane into 8x8 blocks (64 pixels for Y, 64 mega-pixels for U/V)

2. DCT (Discrete Cosine Transform)

   o Transform encoding to reduce the number of bits required to represent each 8x8 block

   o 64 coefficients (1 DC coefficient, 63 AC coefficients) produced

3. Quantization

   o Non-uniform quantization applied to the DCT coefficients (higher resolution given to DC and low frequency coefficients)

   o Usually results in most of the higher frequency coefficients quantizing to a value of 0.

4. Entropy Encoding

   o Used to further reduce the amount of space required to store the JPEG image

   o Run length coding can be used on the long sequences of zeroes left in the quantized DCT coefficients
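The picture preparation step can be sketched in a few lines of Python. This is only an illustration: the function names `subsample` and `split_into_blocks` are hypothetical, averaging is an assumed way of turning a 4x4 region into one value (the notes do not say how the region is combined), and the plane dimensions are assumed to be exact multiples of the region and block sizes.

```python
def subsample(plane, factor=4):
    """Collapse each factor x factor region into one value.

    Averaging is an assumption; the notes only say the region becomes one value.
    """
    rows, cols = len(plane), len(plane[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            region = [plane[rr][cc]
                      for rr in range(r, r + factor)
                      for cc in range(c, c + factor)]
            out_row.append(sum(region) / len(region))
        out.append(out_row)
    return out


def split_into_blocks(plane, size=8):
    """Return the size x size blocks of a plane, left to right, top to bottom."""
    rows, cols = len(plane), len(plane[0])
    blocks = []
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            blocks.append([row[c:c + size] for row in plane[r:r + size]])
    return blocks


# Usage with made-up 32x32 planes: Y keeps full resolution, U is subsampled.
y_plane = [[(r * c) % 256 for c in range(32)] for r in range(32)]
u_plane = [[(r + c) % 256 for c in range(32)] for r in range(32)]

y_blocks = split_into_blocks(y_plane)   # sixteen 8x8 blocks
u_small = subsample(u_plane)            # 32x32 shrinks to 8x8
u_blocks = split_into_blocks(u_small)   # a single 8x8 block
```

Each entry of `y_blocks` and `u_blocks` is one 8x8 block ready to be handed to the DCT step.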

DCT

There are many transforms, most of which are very slow. This is important to consider since video demands real-time encoding and decoding. The JPEG committee took suggestions and empirically studied the use of several different transforms. Of the transforms studied, DCT (Discrete Cosine Transform) proved superior.

In JPEG, DCT operates on one block at a time. Because there are 64 elements in an 8x8 block, this is called the 64-element or 64-coefficient DCT. The DCT operates on this block in a left-to-right, top-to-bottom manner.


[Figure: an 8x8 pixel block is passed through the DCT to produce an 8x8 block of DCT coefficients.]

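Formula for FDCT:

$$S_{i,j} = \frac{1}{4}\, C_i\, C_j \sum_{x=0}^{7} \sum_{y=0}^{7} P_{x,y}\, \cos\frac{(2y+1)\, i\, \pi}{16}\, \cos\frac{(2x+1)\, j\, \pi}{16}$$

where

$C_i = \frac{1}{\sqrt{2}}$ when i = 0

$C_j = \frac{1}{\sqrt{2}}$ when j = 0

$C_i, C_j = 1$ otherwise

$P_{x,y}$ = pixel (or mega-pixel) value at location x, y in the 8x8 block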
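The formula can be evaluated directly with four nested loops. The sketch below is a minimal, unoptimized Python version meant only to make the indices concrete. The function name `fdct_8x8` and the `P[x][y]` indexing convention are assumptions made for illustration; practical encoders use much faster factored algorithms.

```python
import math


def fdct_8x8(P):
    """Evaluate the FDCT formula directly; P[x][y] is the value at location (x, y)."""
    def c(k):
        # Normalization factor: 1/sqrt(2) for index 0, otherwise 1.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    S = [[0.0] * 8 for _ in range(8)]
    for i in range(8):
        for j in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (P[x][y]
                              * math.cos((2 * y + 1) * i * math.pi / 16)
                              * math.cos((2 * x + 1) * j * math.pi / 16))
            S[i][j] = 0.25 * c(i) * c(j) * total
    return S


# Usage: a block in which every value equals p.
p = 100
S = fdct_8x8([[p] * 8 for _ in range(8)])
print(round(S[0][0], 6))  # DC coefficient: one-eighth of the sum of the 64 values
```

For this constant block, every AC coefficient comes out as zero up to floating-point rounding, so all of the information sits in `S[0][0]`.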

Notes about DCT:

The result of a 64-element DCT is 1 DC coefficient and 63 AC coefficients. The DC coefficient represents the average color of the 8x8 region. The 63 AC coefficients represent color change across the block. Low-numbered coefficients represent low-frequency color change, or gradual color change across the region. High-numbered coefficients represent high-frequency color change, or color which changes rapidly from one pixel to another within the block. These 64 results are written in a zig-zag order as follows, with the DC coefficient followed by AC coefficients of increasing frequency.

Zig-Zag sequencing:


Note that each diagonal line in this zig-zag sequence contains coefficients whose indices sum to a constant. For example, the coefficients with indices {30, 21, 12, 03} each have indices that add to 3.
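The constant index sum gives a simple way to generate the zig-zag order: walk the diagonals in order of increasing i + j and reverse direction on every other one. The sketch below assumes the conventional JPEG traversal, in which the first step from the DC coefficient goes to (0, 1) with i as the row index; the function name `zigzag_order` is hypothetical.

```python
def zigzag_order(n=8):
    """Return the (i, j) index pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = i + j is constant along each diagonal
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left to top-right.
            diagonal.reverse()
        order.extend(diagonal)
    return order


# Usage: list the first few positions visited.
order = zigzag_order()
print(order[:6])
```

Given a coefficient block `S`, the sequence `[S[i][j] for i, j in zigzag_order()]` then starts with the DC coefficient and continues with AC coefficients of increasing frequency.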

Why is this ordering important? Well, if you think of a block of 8x8 pixels out of a coherent image, the pixels are likely to be very similar. If you run DCT on 64 pixels which are very similar, you will get a DC coefficient and some values for the low-frequency AC coefficients; the remaining coefficients will likely be at or near zero. Try to imagine creating an image out of pixels that vary wildly from their neighbors. The resulting image will more than likely not make much sense; it will just be a mess of dots. To give you an idea of how small an 8x8 region is, consider the following example:

The 8x8 region of pixels highlighted above looks like this (magnified 1600 times)


As you can see, this region does not deviate much from its average color. In addition, the change is slow and gradual across the block rather than sharp and abrupt from pixel to pixel.

This observation about images allows us to place much greater importance on the DC and first few AC coefficients (beginning of the zig-zag sequence), and it also allows us to assume that the high-frequency AC coefficients (remainder of the sequence) will be at or near zero.

Logically, if these values are of little importance we should be able to assign fewer bits to them in order to achieve greater compression. This naturally leads us to the stages of quantization and entropy encoding, which we will cover next time.

Two examples for DCT.

Example 1: We have an 8x8 block in which every pixel is the same red color.

Using the formula for DCT we get

S0,0 = 1/8 * (sum of the 64 pixel values) = 8p (p is the constant pixel value)

For all other i, j the cosine values cancel each other, so Si,j is zero.
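The result for Example 1 can be checked by substituting a constant value into the FDCT formula. The short derivation below assumes every pixel of the plane being transformed has the same value p.

$$S_{0,0} = \frac{1}{4} \cdot \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} \sum_{x=0}^{7} \sum_{y=0}^{7} p = \frac{1}{8} \cdot 64\,p = 8p$$

For any other pair (i, j), the constant p factors out and the double sum splits into a product of two single sums:

$$S_{i,j} = \frac{1}{4}\, C_i\, C_j\, p \left( \sum_{y=0}^{7} \cos\frac{(2y+1)\, i\, \pi}{16} \right) \left( \sum_{x=0}^{7} \cos\frac{(2x+1)\, j\, \pi}{16} \right)$$

Using the identity $\sum_{y=0}^{N-1} \cos\big((2y+1)\theta\big) = \frac{\sin(2N\theta)}{2\sin\theta}$ with N = 8 and $\theta = k\pi/16$:

$$\sum_{y=0}^{7} \cos\frac{(2y+1)\, k\, \pi}{16} = \frac{\sin(k\pi)}{2 \sin(k\pi/16)} = 0 \quad \text{for } k = 1, \dots, 7$$

So whenever i or j is nonzero, at least one of the two sums vanishes and the coefficient is zero.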

Example 2: We have an 8x8 block in which the left half is red and the right half is blue.