Multimedia Systems
Image III
(Image Compression, JPEG)
Course Presentation
Mahdi Amiri
April 2013
Sharif University of Technology
Image Compression: Basics
Large amount of data in digital images
File size for a 14 Megapixel color image
42 MB in uncompressed RGB 24bit/pixel format
~ 24 images in a 1GB memory card
~1.5 MB in JPEG (90% quality) format
~ 667 images in a 1GB memory card
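The figures above follow from simple arithmetic. A minimal sketch, assuming decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) and the ~1.5 MB JPEG size quoted on this slide:

```python
# Storage arithmetic for the 14-megapixel example (decimal units assumed).
PIXELS = 14_000_000          # 14 megapixels
BYTES_PER_PIXEL = 3          # RGB, 24 bits per pixel
CARD_BYTES = 1_000_000_000   # 1 GB memory card
JPEG_BYTES = 1_500_000       # ~1.5 MB, the JPEG size quoted on the slide

raw_bytes = PIXELS * BYTES_PER_PIXEL                   # uncompressed size
raw_images_per_card = round(CARD_BYTES / raw_bytes)    # uncompressed images per card
jpeg_images_per_card = round(CARD_BYTES / JPEG_BYTES)  # JPEG images per card
compression_ratio = raw_bytes / JPEG_BYTES             # uncompressed : JPEG

print(raw_bytes, raw_images_per_card, jpeg_images_per_card, compression_ratio)
```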
Multimedia Systems, Mahdi Amiri, Image IIIPage 1
Compression is crucial
A number of different techniques are available
RLE, LZ, ADPCM, DCT
Choice depends on
Type of image (B/W, Grayscale, Color, Content)
Application (Entertainment, Medical, Real-time)
Image Compression: DPCM for Images
Image Compression: JPEG
Most commonly used still image compression method
Image files, cameras, and WWW
Lossy compression (the standard includes a lossless coding mode too)
[Figure: original image, 178 KB; JPEG at Q = 50, 37 KB]
Adjustable degree of compression
Tradeoff between storage size and image quality
Typical compression ratio: 10:1
(with little perceptible loss in image quality)
Supports a maximum image size of 65535×65535 pixels
[Figure: JPEG at Q = 5, 16 KB; JPEG at Q = 1, 13 KB]
Image Compression: Rate-Distortion Curve
R, Rate: Number of bits per symbol (pixel)
D, Distortion: Difference between input and output
Ex. 1: Mean squared error (MSE) of the difference between the input and output signals
Ex. 2: Peak signal-to-noise ratio (PSNR)
Rate–distortion theory was created by
Claude Shannon in his foundational
work on information theory.
Input: Original image
Output: Reconstructed image
We will talk more about PSNR at:
Topic: Video III, SubTopic: Video Quality Evaluation
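The two distortion measures named above can be sketched as follows. This is a minimal illustration for 8-bit grayscale images stored as lists of rows; the function names `mse` and `psnr` and the tiny 2×2 sample data are assumptions of this sketch, not part of the slides.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equally sized images (lists of rows)."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, reconstructed):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(original, reconstructed, peak=255):
    """PSNR in dB; peak is the largest possible sample value (255 for 8-bit)."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf          # identical images: no distortion
    return 10 * math.log10(peak ** 2 / error)

# Input: original image, output: reconstructed image (made-up 2x2 data).
original = [[52, 55], [61, 59]]
reconstructed = [[54, 55], [61, 57]]
print(mse(original, reconstructed), psnr(original, reconstructed))
```

A lower MSE corresponds to a higher PSNR, so the two measures rank reconstructions of the same image in the same order.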
A problem to think about:
Given a random variable (here all images of the world) and a distortion measure, what
is the minimum expected distortion achievable at a particular rate?
Equivalently, what is the minimum rate required to achieve a given distortion?
…
This is an optimization problem; one approach is the Lloyd algorithm.
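As an illustration of the Lloyd algorithm, here is a minimal sketch for the simplest case: a scalar quantizer with a fixed number of reconstruction levels (which fixes the rate) and squared error as the distortion measure. The function name `lloyd`, the starting levels, and the sample data are assumptions of this sketch. The iteration converges to a local optimum, which is not necessarily the global one.

```python
def lloyd(samples, levels, iterations=50):
    """Design a scalar quantizer by alternating two steps until nothing changes.

    samples: training data (list of numbers)
    levels:  initial reconstruction levels; their count fixes the rate
    """
    levels = sorted(levels)
    for _ in range(iterations):
        # Step 1: assign every sample to its nearest reconstruction level.
        cells = [[] for _ in levels]
        for x in samples:
            nearest = min(range(len(levels)), key=lambda i: (x - levels[i]) ** 2)
            cells[nearest].append(x)
        # Step 2: move every level to the mean (centroid) of its cell.
        new_levels = [sum(c) / len(c) if c else lv for c, lv in zip(cells, levels)]
        if new_levels == levels:
            break                # converged
        levels = new_levels
    return levels

samples = [1, 2, 3, 11, 12, 13]
levels = lloyd(samples, [0.0, 5.0])
print(levels)
```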
Image Compression: JPEG
Acronym for the
“Joint Photographic Experts Group”
A subgroup of ISO/IEC
http://www.jpeg.org/
The group was organized in 1986
First public release date
JPEG part 1 standard, 1992
ISO: International Organization for Standardization, www.iso.org, NGO, since 1947.
IEC: International Electrotechnical Commission, www.iec.ch, NPO/NGO, since 1906.
Image Compression: JPEG
Pro:
Works well on photographs and paintings of
realistic scenes with smooth variations of tone
and color.
Con:
Lossy compression in typical use, so it is not suitable for certain applications such as medical imaging.
Not suited to line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels can cause noticeable artifacts.
[Figure: House test image and Grass test image]
Image Compression: JPEG Encoder Steps
Color space transformation: RGB to YCbCr
The representation of the colors in the image is converted from RGB to Y′CbCr, consisting of one luma component (Y′), representing brightness, and two chroma components (Cb and Cr), representing color. This step is sometimes skipped.
Chroma subsampling
The resolution of the chroma data is reduced, usually by a factor of 2. This reflects the fact that the eye is less sensitive to fine color details than to fine brightness details.
Block splitting and DCT
The image is split into blocks of 8×8 pixels. For each block, each of the Y, Cb, and Cr data undergoes a discrete cosine transform (DCT). A DCT is similar to a Fourier transform in the sense that it produces a kind of spatial frequency spectrum.
Quantization
The amplitudes of the frequency components are quantized. Human vision is much more sensitive to small variations in color or brightness over large areas than to the strength of high-frequency brightness variations. Therefore, the magnitudes of the high-frequency components are stored with a lower accuracy than the low-frequency components. The quality setting of the encoder (for example 50 or 95 on a scale of 0–100 in the Independent JPEG Group's library) affects to what extent the resolution of each frequency component is reduced. If an excessively low quality setting is used, the high-frequency components are discarded altogether.
Entropy coding
The resulting data for all 8×8 blocks is further compressed with a lossless algorithm, a variant of Huffman encoding.
JPEG: Codec Diagram, Scheme 1
[Figure: encoder and decoder block diagrams]
JPEG: Encoder Diagram, Scheme 2
[Figure: JPEG encoder diagram for a single block of 8 by 8 pixels]
JPEG: Encoder Diagram, Scheme 3
[Figure: baseline JPEG encoder block diagram]
JPEG: Color Space Transformation
RGB to YCbCr conversion concept:
The human eye is less sensitive to fine color (chrominance)
details than to fine brightness (luminance) details.
[Figure panels: Analog TV, Digital TV]
Cb = B − Y
Cr = R − Y
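In practice the color differences B − Y and R − Y are scaled and offset so that they fit the same 0 to 255 range as the luma. A minimal sketch, assuming the full-range conversion used by JFIF files (BT.601 luma weights); the function name `rgb_to_ycbcr` is hypothetical, and rounding and clamping to 8 bits are left out for clarity:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF convention assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted sum of R, G, B
    cb = 128 + (b - y) / 1.772              # blue difference, scaled and centered at 128
    cr = 128 + (r - y) / 1.402              # red difference, scaled and centered at 128
    return y, cb, cr

print(rgb_to_ycbcr(128, 128, 128))  # a gray pixel: both chroma values sit at the midpoint
print(rgb_to_ycbcr(255, 0, 0))      # pure red: Cr is pushed to the top of its range
```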
JPEG, Chroma Subsampling: Subsampling in YCbCr
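Reducing the chroma resolution by a factor of 2 in both directions (commonly called 4:2:0) can be sketched as below. Averaging each 2×2 group is only one possible downsampling filter, and the function name `subsample_420` and the sample plane are assumptions of this sketch. The luma plane is left at full resolution.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 group.

    plane: list of rows; even width and height are assumed in this sketch.
    """
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out

cb = [[100, 102, 50, 50],
      [104, 102, 50, 54]]
small = subsample_420(cb)
print(small)
```

Applied to both Cb and Cr, this keeps one chroma sample pair per four luma samples, which halves the total amount of sample data before any transform coding takes place.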
JPEG: Block Splitting and DCT
Block splitting
The image is split into blocks of 8×8 pixels.
Later we discuss why this is done.
Discrete Cosine Transform (DCT)
Each 8×8 block of each component (Y, Cb, Cr) is
converted to a frequency-domain representation, using
a normalized, two-dimensional type-II discrete cosine
transform (DCT).
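A direct (unoptimized) sketch of the normalized two-dimensional type-II DCT of one 8×8 block follows. It also shows the level shift that centers 8-bit samples around zero before the transform. The function name `dct_2d` and the flat test block are assumptions of this sketch; real encoders use fast factorizations rather than this four-level loop.

```python
import math

N = 8

def dct_2d(block):
    """Normalized 2D type-II DCT of an 8x8 block (list of 8 rows of 8 numbers)."""
    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):              # vertical frequency index
        for u in range(N):          # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[v][u] = alpha(u) * alpha(v) * total
    return coeffs

# Level shift: 8-bit samples (0..255) are centered around zero by subtracting 128.
flat = [[130] * N for _ in range(N)]
shifted = [[p - 128 for p in row] for row in flat]
coeffs = dct_2d(shifted)
print(round(coeffs[0][0], 6))
```

For a perfectly flat block all of the energy ends up in the top-left coefficient and every other coefficient is zero up to rounding error.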
JPEG, DCT: Center Around Zero
[Figure: the 8×8 sub-image shown in 8-bit grayscale]
JPEG, DCT: Fourier Coefficients
[Figure: square wave synthesized using Fourier cosine coefficients and sine coefficients]
DCT: Basis Functions
The DCT transforms an 8×8 block of input values to a linear combination of these 64 patterns. The patterns are referred to as the two-dimensional DCT basis functions, and the output values are referred to as transform coefficients. The horizontal index is u and the vertical index is v.
[Figure: the 64 two-dimensional DCT basis functions and the 8×8 sub-image]
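Written out, each transform coefficient is the inner product of the block with one basis pattern. The notation below is an assumption of this note: g_{x,y} is the level-shifted sample at column x and row y, and G_{u,v} is the coefficient at horizontal index u and vertical index v.

```latex
G_{u,v} = \frac{1}{4}\,\alpha(u)\,\alpha(v)
          \sum_{x=0}^{7}\sum_{y=0}^{7} g_{x,y}
          \cos\!\left[\frac{(2x+1)\,u\,\pi}{16}\right]
          \cos\!\left[\frac{(2y+1)\,v\,\pi}{16}\right],
\qquad
\alpha(k) =
\begin{cases}
  \dfrac{1}{\sqrt{2}}, & k = 0 \\[4pt]
  1, & k > 0
\end{cases}
```

The product of the two cosines for a fixed pair (u, v) is one of the 64 basis patterns; u = v = 0 gives the constant pattern.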
JPEG, DCT: Illustration of DCT
JPEG, DCT: DCT Coefficients
DC coefficient (top-left corner, has large magnitude)
AC coefficients (other 63 coefficients)
DCT aggregates most of the signal in one corner
Larger values in the top-left corner
[Figure: DCT coefficients for our sample block (rounded to two decimal places)]
JPEG: DCT Coefficients, Example
[Figure: the result of taking the DCT. The numbers in red are the coefficients that fall below the specified threshold of 10.]
JPEG, DCT: Histograms of DCT Coefficients
[Figure: histograms of DCT coefficients of the image ‘lena’ using blocks of 8×8 pixels]
JPEG, Quantization: Concept
The human eye is good at seeing small
differences in brightness over a relatively large
area, but not so good at distinguishing the exact
strength of a high frequency brightness variation.
Small quantization step for low-frequency components (top-left corner of the DCT coefficient matrix)
Big quantization step for high-frequency components (bottom-right corner of the DCT coefficient matrix)
[Figure: DCT coefficients and sample images]
JPEG, Quantization: Quantization Matrix
A typical quantization matrix, as specified in the original
JPEG Standard
G is the matrix of unquantized DCT coefficients
Q is the quantization matrix
B is the matrix of quantized DCT coefficients
JPEG, Quantization: Sample Output
Many of the higher-frequency components are rounded to zero
[Figure: quantized DCT coefficients for our sample block]
JPEG, Quantization
JPEG, Entropy Coding: Zigzag Ordering
DC Coefficient: DPCM
AC Coefficients
Run-length encoding (RLE)
Then Huffman coding on the whole sequence of numbers
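The steps before Huffman coding can be sketched as follows: zigzag scanning of a quantized block, DPCM of the DC values across blocks, and run-length coding of the AC values. All function names and the sample values are assumptions of this sketch, and the run-length format is simplified (the real format also limits zero runs to 15 and packs each value's size together with the run). The Huffman stage itself is not shown.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):                   # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                # even diagonals run from bottom-left up
        order.extend((r, s - r) for r in rows)
    return order

def dpcm_dc(dc_values):
    """Replace each block's DC value by its difference from the previous block's DC."""
    previous, diffs = 0, []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def run_length_ac(ac):
    """Encode AC values as (zero_run, value) pairs, closed by an end-of-block marker."""
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")                          # all remaining coefficients are zero
    return pairs

# A made-up quantized block with three nonzero values near the top-left corner.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = -26, -3, -3

scan = [block[r][c] for r, c in zigzag_order()]
pairs = run_length_ac(scan[1:])                  # AC coefficients only
diffs = dpcm_dc([-26, -24, -29])                 # DC values of three consecutive blocks
print(pairs, diffs)
```

Because the zigzag scan visits low frequencies first, the zeros produced by quantization cluster at the end of the scan and collapse into a single end-of-block marker.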
JPEG: Encoder Example
JPEG: Decoder Example
JPEG: Compression Ratio
[Figure: original, JPEG compressed at a quality setting of 50, and their difference (darker means a larger difference)]
JPEG: MWIPC
[Figure: MWIPC, testing DPCM-based and DCT-based image compression]
JPEG: Blocking Artifact
[Figure: original and JPEG compressed at a quality setting of 5]
JPEG, Block Splitting: Blocks of 8 by 8 Pixels
Why Blocking?
Neighboring pixels are more correlated
Lower computational complexity
The computational complexity of the 2D DCT of an N by N image is
$O(N^2 \log_2 N)$,
while the complexity of the 2D DCT of all N/8 by N/8 blocks of the image is
$O\left(\frac{N^2}{8^2} \cdot 8^2 \log_2 8\right) = O(N^2)$.
Padding
If the data for a channel does not represent an integer number of blocks, then the encoder must fill the remaining area of the incomplete blocks with some form of dummy data.
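One common choice of dummy data for padding is to repeat the last column and the last row, which avoids creating an artificial edge inside the incomplete blocks. A minimal sketch under that assumption; the function name `pad_to_blocks` and the sample plane are hypothetical:

```python
def pad_to_blocks(plane, block=8):
    """Pad a channel to a multiple of the block size by repeating edge pixels."""
    height, width = len(plane), len(plane[0])
    padded_w = -(-width // block) * block      # round width up to a multiple of block
    padded_h = -(-height // block) * block     # round height up to a multiple of block
    rows = [row + [row[-1]] * (padded_w - width) for row in plane]
    rows += [list(rows[-1]) for _ in range(padded_h - height)]
    return rows

# A 10-wide, 3-high channel becomes 16 wide and 8 high (two 8x8 blocks).
plane = [[x + 10 * y for x in range(10)] for y in range(3)]
padded = pad_to_blocks(plane)
print(len(padded), len(padded[0]))
```

The decoder knows the true image size from the file header and simply crops the padded area away.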
What about blocks of 16×16 pixels?
JPEG, Block Splitting: Larger Blocks
Pro: Fewer blocking artifacts
Con:
Less correlated data inside the block
Higher computational complexity
[Figure: efficiency as a function of block size N×N, measured for 8-bit quantization in the original domain and equivalent quantization in the transform domain]
Block size 8×8 is a good compromise between coding efficiency and complexity
JPEG, Quantization Matrix: Quality Factor
The quality setting of the encoder (for example 50 or 95
on a scale of 0–100 in the Independent JPEG Group's
library) affects to what extent the resolution of each
frequency component is reduced.
For a quality of 100%, the quantization tables should be set up such that all entries are one. For a quality factor of 50%, the ITU/ISO recommended tables are used, but any other choice is also valid. For a quality between 50% and 100%, one may interpolate between the table given for 50% and that for 100% (i.e. all entries equal to 1.0).
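One concrete way to realize this is the scaling rule of the Independent JPEG Group's library, reproduced here from memory as a sketch rather than as the library's actual code. Between 50 and 100 it shrinks the quality-50 table linearly toward all ones, and below 50 it enlarges the step sizes. The function name `scale_table` and the three-entry sample table are assumptions of this sketch.

```python
def scale_table(base, quality):
    """Scale a quality-50 quantization table to another quality setting (1..100)."""
    quality = max(1, min(100, quality))
    if quality >= 50:
        scale = 200 - 2 * quality        # percent: 100 at quality 50, 0 at quality 100
    else:
        scale = 5000 // quality          # percent: grows quickly as quality drops
    # Apply the percentage with rounding, then keep each step within 1..255.
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in base]

base = [[16, 11, 99]]                    # a few entries of a quality-50 table
for quality in (100, 75, 50, 10):
    print(quality, scale_table(base, quality))
```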
Thank You
Multimedia Systems
Image III (Compression, JPEG)
Find out more at:
1. http://ce.sharif.edu/~m_amiri/
2. http://www.dml.ir/
Next Session: Video I