Lec09, Image III (Compression, JPEG), v1.06.ppt

ce.sharif.edu/courses/91-92/2/ce342-1/resources/root/Lectures/Lec0…

Multimedia Systems

Image III

(Image Compression, JPEG)

Course Presentation

Mahdi Amiri

April 2013

Sharif University of Technology

Image Compression: Basics

Large amount of data in digital images

File size for a 14 Megapixel color image

42 MB in uncompressed RGB 24bit/pixel format

~ 24 images in a 1GB memory card

~1.5 MB in JPEG (90% quality) format
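The figures above follow from simple arithmetic. A minimal sketch, assuming decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes); the helper name `raw_size_bytes` is hypothetical:

```python
def raw_size_bytes(pixel_count, bits_per_pixel=24):
    """Uncompressed size in bytes for a given pixel count and bit depth."""
    return pixel_count * bits_per_pixel // 8

MB = 10**6  # assumption: decimal megabytes
GB = 10**9  # assumption: decimal gigabytes

raw = raw_size_bytes(14 * 10**6)   # 14 Megapixel image, RGB 24 bit/pixel
images_per_card = GB / raw         # how many raw images fit in 1 GB
jpeg_ratio = raw / (1.5 * MB)      # raw size versus the ~1.5 MB JPEG

print(raw / MB, round(images_per_card, 1), round(jpeg_ratio, 1))
```

Under these assumptions the raw image is 42 MB, about 23.8 of them fit on the card (the slide rounds this to ~24), and the 90% quality JPEG is roughly 28 times smaller.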




Compression crucial

A number of different techniques are available

RLE, LZ, ADPCM, DCT

Choice depends on

Type of image (B/W, Grayscale, Color, Content)

Application (Entertainment, Medical, Real-time)

Image Compression: DPCM for Images


Image Compression: JPEG

Most commonly used still image compression method

Image files, cameras, and WWW

Lossy Compression

(inc. a lossless coding mode too)

Adjustable degree of compression

Tradeoff between storage size and image quality

Typ. Compression ratio: 10:1

(with little perceptible loss in image quality)

Supports a max. image size of 65535x65535

[Figure: the same image at four settings. Original: 178 KB; Q: 50: 37 KB; Q: 5: 16 KB; Q: 1: 13 KB]
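The file sizes quoted for the sample image translate into compression ratios as follows. A small sketch; the helper name `compression_ratio` is hypothetical and the sizes are the ones printed on the slide:

```python
def compression_ratio(original_kb, compressed_kb):
    """Ratio of original size to compressed size (larger means more compression)."""
    return original_kb / compressed_kb

ORIGINAL_KB = 178
compressed_kb = {50: 37, 5: 16, 1: 13}  # quality setting -> file size in KB

ratios = {q: round(compression_ratio(ORIGINAL_KB, kb), 1)
          for q, kb in compressed_kb.items()}
print(ratios)
```

The ratio grows only slowly at the lowest settings, which suggests that part of the file is overhead that no quality setting removes.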

Image Compression: Rate-Distortion Curve

R, Rate: Number of bits per symbol (pixel)

D, Distortion: Difference between input and output

Ex. 1: Mean Squared Error (MSE) of the difference between input and output signal

Ex. 2: Peak Signal-to-Noise Ratio (PSNR)

Input: Original image

Output: Reconstructed image

Rate-distortion theory was created by Claude Shannon in his foundational work on information theory.

We will talk more about PSNR at:

Topic: Video III, SubTopic: Video Quality Evaluation
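The two distortion measures named above can be sketched in a few lines. This assumes 8-bit samples (peak value 255) and images given as equally sized lists of rows; the function names are illustrative:

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equally sized 2-D pixel arrays."""
    total, count = 0.0, 0
    for row_a, row_b in zip(original, reconstructed):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(original, reconstructed, peak=255):
    """PSNR in dB; infinite when the two images are identical."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)

original = [[52, 55], [61, 59]]
reconstructed = [[50, 55], [61, 61]]
print(mse(original, reconstructed), round(psnr(original, reconstructed), 2))
```

A lower MSE gives a higher PSNR, so the two measures rank reconstructions of the same image identically.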

A problem to think about:

Given a random variable (here all images of the world) and a distortion measure, what is the minimum expected distortion achievable at a particular rate?

Equivalently, what is the minimum rate required to achieve a given distortion?

…

An optimization problem; one solution: Lloyd Algorithm.
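As an illustration of the Lloyd algorithm mentioned above, here is a minimal sketch for a scalar quantizer with a fixed number of levels (at least two). It alternates a nearest-neighbour step and a centroid step; it reaches a local optimum for the given samples rather than the rate-distortion bound itself. The function name and the evenly spaced starting codebook are assumptions made for this example:

```python
def lloyd_quantizer(samples, levels, iterations=50):
    """Design a scalar quantizer for `samples`; returns (codebook, mse)."""
    lo, hi = min(samples), max(samples)
    # Assumption: start from evenly spaced reconstruction points.
    codebook = [lo + (hi - lo) * k / (levels - 1) for k in range(levels)]

    def nearest(x):
        return min(range(levels), key=lambda k: abs(x - codebook[k]))

    for _ in range(iterations):
        # Nearest-neighbour step: assign each sample to the closest codeword.
        cells = [[] for _ in range(levels)]
        for x in samples:
            cells[nearest(x)].append(x)
        # Centroid step: move each codeword to the mean of its cell.
        for k, cell in enumerate(cells):
            if cell:
                codebook[k] = sum(cell) / len(cell)

    error = sum((x - codebook[nearest(x)]) ** 2 for x in samples) / len(samples)
    return codebook, error

codebook, error = lloyd_quantizer([0, 2, 10, 12], levels=2)
print(codebook, error)
```

With two levels the four samples split into the pairs {0, 2} and {10, 12}, and each codeword settles on the mean of its pair.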


Acronym for the “Joint Photographic Experts Group”

A subgroup of ISO/IEC

http://www.jpeg.org/

The group was organized in 1986

First public release date

JPEG part 1 standard, 1992

ISO: International Organization for Standardization, www.iso.org, NGO, since 1947.

IEC: International Electrotechnical Commission, www.iec.ch, NPO/NGO, since 1906.


Pro:

Works well on photographs and paintings of realistic scenes with smooth variations of tone and color.

Con:

Lossy compression (the typical use) is not suitable for certain applications such as medical imaging.

Not suitable for line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels can cause noticeable artifacts.

[Figures: House Test Image; Grass Test Image]

Image Compression: JPEG Encoder Steps

Color space transformation: RGB to YCbCr

The representation of the colors in the image is converted from RGB to Y′CbCr, consisting of one luma component (Y′), representing brightness, and two chroma components (Cb and Cr), representing color. This step is sometimes skipped.

Chroma subsampling

The resolution of the chroma data is reduced, usually by a factor of 2. This reflects the fact that the eye is less sensitive to fine color details than to fine brightness details.

Block splitting and DCT

The image is split into blocks of 8×8 pixels. For each block, each of the Y, Cb, and Cr data undergoes a discrete cosine transform (DCT). A DCT is similar to a Fourier transform in the sense that it produces a kind of spatial frequency spectrum.

Quantization

The amplitudes of the frequency components are quantized. Human vision is much more sensitive to small variations in color or brightness over large areas than to the strength of high-frequency brightness variations. Therefore, the magnitudes of the high-frequency components are stored with a lower accuracy than the low-frequency components. The quality setting of the encoder (for example 50 or 95 on a scale of 0–100 in the Independent JPEG Group's library) affects to what extent the resolution of each frequency component is reduced. If an excessively low quality setting is used, the high-frequency components are discarded altogether.

Entropy Coding

The resulting data for all 8×8 blocks is further compressed with a lossless algorithm, a variant of Huffman encoding.

JPEG: Codec Diagram, Scheme 1

[Figure: Encoder and Decoder block diagrams]

JPEG: Encoder Diagram, Scheme 2

[Figure: JPEG encoder diagram for a single block of 8 by 8 pixels]

JPEG: Encoder Diagram, Scheme 3

[Figure: Baseline JPEG encoder block diagram]


JPEG: Color Space Transformation

RGB to YCbCr conversion concept:

The human eye is less sensitive to fine color (chrominance) details than to fine brightness (luminance) details.

[Figure labels: Analog TV; Digital TV]

Cb = B - Y

Cr = R - Y
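The color-difference idea (Cb built from B - Y, Cr built from R - Y) can be made concrete. The sketch below assumes the full-range conversion commonly used with JPEG files (BT.601 luma weights, chroma centered on 128); the scale factors are chosen so that Cb and Cr span roughly the same range as Y. Rounding and clamping to 0..255 are left out, and the function name is illustrative:

```python
KR, KG, KB = 0.299, 0.587, 0.114  # assumption: BT.601 luma weights

def rgb_to_ycbcr(r, g, b):
    """Convert one full-range 8-bit RGB pixel to (Y, Cb, Cr) as floats."""
    y = KR * r + KG * g + KB * b
    cb = 128 + 0.5 * (b - y) / (1 - KB)  # scaled blue difference
    cr = 128 + 0.5 * (r - y) / (1 - KR)  # scaled red difference
    return y, cb, cr

print(rgb_to_ycbcr(128, 128, 128))  # a gray pixel carries no chroma
print(rgb_to_ycbcr(255, 0, 0))      # pure red pushes Cr to its maximum
```

For any gray pixel both differences vanish, so Cb and Cr sit at the neutral value 128 and all the information is in Y.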

JPEG, Chroma Subsampling: Subsampling in YCbCr
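One simple way to reduce chroma resolution by a factor of 2 in each direction is to average every 2×2 neighbourhood of a chroma plane. This is only a sketch of that one scheme (often labeled 4:2:0); it assumes even plane dimensions, and the function name is hypothetical:

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 neighbourhoods."""
    height, width = len(plane), len(plane[0])
    out = []
    for r in range(0, height - 1, 2):
        row = []
        for c in range(0, width - 1, 2):
            total = (plane[r][c] + plane[r][c + 1]
                     + plane[r + 1][c] + plane[r + 1][c + 1])
            row.append(total / 4)
        out.append(row)
    return out

cb_plane = [[100, 102, 50, 50],
            [104, 106, 50, 50]]
print(subsample_420(cb_plane))
```

Applied to both Cb and Cr while Y is kept at full resolution, this halves the total number of samples before any transform coding takes place.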


JPEG: Block Splitting and DCT

Block splitting

The image is split into blocks of 8×8 pixels.

Later we discuss why this is done.

Discrete Cosine Transform (DCT)

Each 8×8 block of each component (Y, Cb, Cr) is converted to a frequency-domain representation, using a normalized, two-dimensional type-II discrete cosine transform (DCT).
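A direct, unoptimized sketch of the normalized 2-D type-II DCT for one 8×8 block is given below. It evaluates the defining double sum literally, which is far slower than the fast algorithms real codecs use. The `level_shift` helper subtracts 128 so that 8-bit samples are centered around zero before the transform; both function names are illustrative:

```python
import math

N = 8

def level_shift(block, offset=128):
    """Center 8-bit samples around zero before the transform."""
    return [[value - offset for value in row] for row in block]

def dct_2d(block):
    """Orthonormal 2-D type-II DCT; coeffs[v][u] has vertical index v, horizontal u."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[v][u] = alpha(u) * alpha(v) * total
    return coeffs

flat = [[138] * N for _ in range(N)]   # a perfectly uniform block
coeffs = dct_2d(level_shift(flat))
print(round(coeffs[0][0], 6))
```

For a uniform block all the energy lands in the top-left coefficient and every other coefficient is zero up to rounding error.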

JPEG, DCT: Center Around Zero

[Figure: The 8×8 sub-image shown in 8-bit grayscale]

JPEG, DCT: Fourier Coefficients

[Figure: square wave synthesized using Fourier cosine coefficients and sine coefficients]

DCT: Basis Functions

The DCT transforms an 8×8 block of input values to a linear combination of these 64 patterns. The patterns are referred to as the two-dimensional DCT basis functions, and the output values are referred to as transform coefficients. The horizontal index is u and the vertical index is v.
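The "linear combination of 64 patterns" can be written out explicitly. Assuming the usual normalization of the 8×8 type-II DCT, with $g_{x,y}$ the (level-shifted) pixel at position $(x, y)$ and $G_{u,v}$ the transform coefficient, the block is the weighted sum of the basis functions:

```latex
g_{x,y} = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7}
          \alpha(u)\,\alpha(v)\, G_{u,v}\,
          \cos\!\left[\frac{(2x+1)\,u\,\pi}{16}\right]
          \cos\!\left[\frac{(2y+1)\,v\,\pi}{16}\right],
\qquad
\alpha(k) =
\begin{cases}
  \dfrac{1}{\sqrt{2}} & k = 0 \\[4pt]
  1 & k > 0
\end{cases}
```

Each product of the two cosines is one of the 64 patterns, and $G_{u,v}$ is the weight with which that pattern contributes to the block.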

[Figure: The 8×8 sub-image]

JPEG, DCT: Illustration of DCT


JPEG, DCT: DCT Coefficients

DC coefficient ( Top-left corner, has large magnitude )

AC coefficients ( Other 63 coefficients )

DCT aggregates most of the signal in one corner

Larger values in the top-left corner


[Figure: DCT coefficients for our sample block (rounded to two digits beyond the decimal point)]

JPEG: DCT Coefficients, Example

[Figure: The result of taking the DCT. The numbers in red are the coefficients that fall below the specified threshold of 10.]

JPEG, DCT: Histograms of DCT Coefficients

[Figure: Histograms of DCT coefficients of image 'lena' using blocks of 8×8 pixels]


JPEG, Quantization: Concept

The human eye is good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high frequency brightness variation.

Small quantization step for low frequency components (top-left corner in DCT coefficients matrix)

Big quantization step for high frequency components (bottom-right corner in DCT coefficients matrix)

[Figures: DCT coefficient; Sample Images]

JPEG, Quantization: Quantization Matrix

A typical quantization matrix, as specified in the original JPEG Standard

G is the unquantized DCT coefficients

Q is the quantization matrix

B is the quantized DCT coefficients
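With G, Q and B defined as above, quantization divides each coefficient by its step size and rounds to the nearest integer, i.e. B[j][k] = round(G[j][k] / Q[j][k]). The matrix shown on the slide did not survive extraction, so this sketch assumes the example luminance table from Annex K of the JPEG standard; the function names are illustrative:

```python
# Assumption: example luminance quantization table (JPEG standard, Annex K).
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(G, Q=Q_LUMA):
    """B = round(G / Q), element by element (Python rounds exact ties to even)."""
    return [[round(g / q) for g, q in zip(g_row, q_row)]
            for g_row, q_row in zip(G, Q)]

def dequantize(B, Q=Q_LUMA):
    """Decoder side: multiply back by the step size; the rounding loss remains."""
    return [[b * q for b, q in zip(b_row, q_row)]
            for b_row, q_row in zip(B, Q)]

G = [[0.0] * 8 for _ in range(8)]
G[0][0], G[0][1], G[7][7] = -415.38, -30.19, 1.68   # made-up sample coefficients
B = quantize(G)
print(B[0][0], B[0][1], B[7][7])
```

The small high-frequency value at the bottom-right is divided by a large step and rounds to zero, which is where the loss and most of the compression come from.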

JPEG, Quantization: Sample Output

Many of the higher frequency components are rounded to zero

[Figure: Quantized DCT coefficients for our sample block]

JPEG, Quantization


JPEG, Entropy Coding: Zigzag Ordering

DC Coefficient: DPCM

AC Coefficients

Run-length encoding (RLE)

Then Huffman coding on the whole sequence of numbers
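The steps that precede Huffman coding can be sketched as follows: zigzag scanning of a quantized block, DPCM of the DC coefficients across blocks, and run-length coding of the AC coefficients as (zero run, value) pairs. This is a simplification: real JPEG also limits runs to 15 zeros and codes value sizes separately, which is omitted here. All function names are hypothetical:

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col, one diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()                      # even diagonals run upward
        order.extend(diagonal)
    return order

def dc_dpcm(dc_values):
    """Code each DC coefficient as the difference from the previous block's DC."""
    previous, diffs = 0, []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def run_length_encode(ac_values):
    """(run of zeros, nonzero value) pairs, closed by an end-of-block marker (0, 0)."""
    pairs, run = [], 0
    for value in ac_values:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append((0, 0))   # trailing zeros are implied by the end-of-block marker
    return pairs

print(zigzag_order()[:6])
print(dc_dpcm([-26, -24, -30]))
print(run_length_encode([5, 0, 0, -3, 0, 0, 0]))
```

Zigzag ordering places the low-frequency coefficients first, so the zeros produced by quantization cluster at the end of the sequence, where a single end-of-block marker replaces them.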

JPEG: Encoder Example


JPEG: Decoder Example


JPEG: Compression Ratio

[Figure panels: Original; JPEG Compressed (quality setting of 50); Difference (darker means a larger difference)]

JPEG: MWIPC

[Figure: MWIPC, Testing DPCM and DCT based image compression]

JPEG: Blocking Artifact

[Figure panels: Original; JPEG Compressed (quality setting of 5)]

JPEG, Block Splitting: Blocks of 8 by 8 Pixels

Why Blocking?

Neighboring pixels are more correlated

Lower computational complexity

The computational complexity for 2D DCT of an N by N image is $O(N^2 \log_2 N)$, while the complexity of 2D DCT of all N/8 by N/8 blocks of image is $\frac{N^2}{8^2}\, O(8^2 \log_2 8) = O(N^2)$.

What about blocks of 16×16 pixels?

Padding

If the data for a channel does not represent an integer number of blocks then the encoder must fill the remaining area of the incomplete blocks with some form of dummy data.
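The padding requirement on this slide leaves the choice of dummy data open. One common choice, assumed here, is to repeat the last column and the last row until both dimensions are multiples of the block size, which avoids introducing an artificial edge. The function name is hypothetical:

```python
def pad_to_blocks(plane, block=8):
    """Extend a 2-D plane to whole blocks by replicating its edge pixels."""
    height, width = len(plane), len(plane[0])
    padded_h = -(-height // block) * block   # round up to a multiple of block
    padded_w = -(-width // block) * block
    rows = [row + [row[-1]] * (padded_w - width) for row in plane]
    rows += [list(rows[-1]) for _ in range(padded_h - height)]
    return rows

plane = [[1, 2, 3, 4, 5],
         [6, 7, 8, 9, 10],
         [11, 12, 13, 14, 15]]
padded = pad_to_blocks(plane, block=4)   # small block size just for display
for row in padded:
    print(row)
```

The decoder simply crops the result back to the original dimensions, so the dummy data never reaches the viewer.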

JPEG, Block Splitting: Larger Blocks

Pro: Less blocking artifact

Con:

Less correlated data inside the block

Higher computational complexity

[Figure: Efficiency as a function of block size N×N, measured for 8 bit quantization in the original domain and equivalent quantization in the transform domain.]

Block size 8×8 is a good compromise between coding efficiency and complexity

JPEG, Quantization Matrix: Quality Factor

The quality setting of the encoder (for example 50 or 95 on a scale of 0–100 in the Independent JPEG Group's library) affects to what extent the resolution of each frequency component is reduced.

For a quality of 100%, the quantization tables should be set up such that all entries are one. For a quality factor of 50%, the ITU/ISO recommended tables are suggested, but any other choice is also valid. For a quality between 50% and 100%, one may interpolate between the table entries given for 50% and those for 100% (i.e. 1.0).
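One concrete interpolation rule is the scaling used by the Independent JPEG Group's library, sketched here from its commonly documented behavior: the quality setting is mapped to a percentage scale factor, each base entry is scaled and rounded, and the result is clamped to 1..255. The function name `scale_table` and the use of a single table row as input are choices made for this example:

```python
def scale_table(base_entries, quality):
    """Scale quality-50 table entries to another quality setting (1..100)."""
    quality = max(1, min(100, quality))
    # Percentage scale: 100 at quality 50, 0 at quality 100, large below 50.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    scaled = []
    for entry in base_entries:
        value = (entry * scale + 50) // 100       # scale with rounding
        scaled.append(max(1, min(255, value)))    # keep within 8-bit range
    return scaled

base_row = [16, 11, 10, 16, 24, 40, 51, 61]   # first row of the example luminance table
for q in (50, 75, 100):
    print(q, scale_table(base_row, q))
```

At quality 50 the base entries come back unchanged and at quality 100 every entry becomes 1, matching the two end points described above; values in between fall on a straight line between them.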

Thank You

FIND OUT MORE AT...

1. http://ce.sharif.edu/~m_amiri/

2. http://www.dml.ir/

Next Session: Video I
