JPEG
Introduction
Video Codec
[Figure: block diagram. Input sample arrays -> video encoder -> bitstream -> video decoder -> reconstructed sample arrays]
Video Encoder
  - Maps raw sample arrays into a bitstream
  - Determines coding efficiency for given bitstream syntax and decoding process
Video Decoder
  - Reconstructs sample arrays from the received bitstream
Typical use case
  - Encoder and decoder know bitstream syntax and decoding process
T. Wiegand (TU Berlin) — Image and Video Coding: JPEG 2 / 49
Lossy Compression
                        | HD movie on Blu-ray disc               | UHD broadcast over DVB-S2               | Video chat over the Internet
raw video data format   | 1920×1080, 24 fps, Y’CbCr 4:2:0, 8 bit | 3840×2160, 60 fps, Y’CbCr 4:2:0, 10 bit | 1280×720, 50 fps, Y’CbCr 4:2:0, 8 bit
raw data rate           | ca. 600 Mbit/s                         | ca. 7.5 Gbit/s                          | ca. 550 Mbit/s
channel bit rate        | 36 Mbit/s (read speed)                 | 58 Mbit/s (8PSK 2/3)                    | depends on connection
typical video bit rate  | ca. 20 Mbit/s                          | ca. 25 Mbit/s                           | ca. 1 Mbit/s
required compression    | ca. 30 : 1                             | ca. 300 : 1                             | ca. 500 : 1

  - Required video bit rate is typically much smaller than raw data rate
  - Required bit rate can only be achieved with lossy compression
  - Decoded signal represents approximation of the input signal
  - Goal: Represent signal with small impact on perceived quality
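The raw data rates in the table can be reproduced with a few lines of arithmetic. The sketch below assumes that 4:2:0 sampling contributes 1.5 samples per pixel (one luma sample plus two quarter-resolution chroma samples); the function name `raw_bit_rate` is only illustrative.

```python
def raw_bit_rate(width, height, fps, bit_depth, samples_per_pixel=1.5):
    """Raw bit rate in bit/s; samples_per_pixel=1.5 models 4:2:0 sampling."""
    return width * height * samples_per_pixel * bit_depth * fps

# (width, height, fps, bit depth, typical video bit rate in bit/s) from the table
cases = {
    "HD movie":      (1920, 1080, 24, 8, 20e6),
    "UHD broadcast": (3840, 2160, 60, 10, 25e6),
    "Video chat":    (1280, 720, 50, 8, 1e6),
}

for name, (w, h, fps, depth, video_rate) in cases.items():
    raw = raw_bit_rate(w, h, fps, depth)
    print(f"{name}: raw {raw / 1e6:.0f} Mbit/s, compression {raw / video_rate:.0f} : 1")
```

The computed ratios are close to the rounded "ca." values in the table.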
Coding Efficiency
Coding efficiency
  - Trade-off between bit rate and reconstruction quality
  - Measured for sets of representative test sequences
Bit rate
  - Use average bit rate in our comparisons (count bits in bitstream)
  - In practice: Bit distribution among frames is also important
Reconstruction quality
  - Human perception of visual quality: Difficult to evaluate
  - Most reliable method: Subjective viewing tests (expensive, time consuming)
  - Use mean squared error (MSE) and peak signal-to-noise ratio (PSNR)

$$ \mathrm{MSE} = \frac{1}{W \cdot H} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} \left( s'[x,y] - s[x,y] \right)^2 $$

$$ \mathrm{PSNR} = 10 \cdot \log_{10} \frac{(2^B - 1)^2}{\mathrm{MSE}} \qquad (B\text{: bit depth}) $$
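A minimal sketch of the two measures for a single sample array, assuming the pictures are given as nested lists indexed `[y][x]`. The function names are illustrative, and how PSNR is averaged over pictures and color components is not covered here.

```python
import math

def mse(orig, recon):
    """Mean squared error between two equally sized 2D sample arrays."""
    height, width = len(orig), len(orig[0])
    total = sum((recon[y][x] - orig[y][x]) ** 2
                for y in range(height) for x in range(width))
    return total / (width * height)

def psnr(orig, recon, bit_depth=8):
    """Peak signal-to-noise ratio in dB for samples of the given bit depth."""
    err = mse(orig, recon)
    if err == 0:
        return math.inf                      # identical arrays
    peak = (1 << bit_depth) - 1              # 2^B - 1
    return 10.0 * math.log10(peak * peak / err)

s = [[100, 100], [100, 100]]
s_rec = [[101, 99], [100, 100]]
print(mse(s, s_rec), psnr(s, s_rec))
```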
Rate-Distortion Curves & Bit-Rate Savings
[Figure, left: rate-distortion curves, PSNR (YCbCr) [dB] over bit rate [kbit/s], for codec A and codec B, with the bit rates R_A and R_B marked at the example target quality of 39 dB. Right: bit-rate saving [%] of codec B vs. codec A over PSNR (YCbCr) [dB], B_BA = (R_A - R_B) / R_A, with the example target quality of 39 dB marked.]
Rate-distortion curve
  - Measure average bit rate and average PSNR for multiple operation points
Bit-rate savings
  - Determine bit-rate savings using interpolated rate-distortion curves

$$ B_{BA} = \frac{R_A - R_B}{R_A} \qquad (\text{codec B relative to codec A}) $$

  - Calculate average bit-rate savings by integrating curve
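The bit-rate saving at one target quality could be computed as sketched below. The slides do not say which interpolation is used, so linear interpolation between neighboring operation points is an assumption, and the operation points are made-up numbers for illustration only.

```python
def rate_at_quality(points, target_psnr):
    """Linearly interpolate the bit rate at target_psnr.

    points: list of (bit_rate, psnr) operation points sorted by increasing PSNR.
    """
    for (r0, q0), (r1, q1) in zip(points, points[1:]):
        if q0 <= target_psnr <= q1:
            weight = (target_psnr - q0) / (q1 - q0)
            return r0 + weight * (r1 - r0)
    raise ValueError("target quality outside the measured range")

def bit_rate_saving(points_a, points_b, target_psnr):
    """B_BA = (R_A - R_B) / R_A at the given target quality."""
    r_a = rate_at_quality(points_a, target_psnr)
    r_b = rate_at_quality(points_b, target_psnr)
    return (r_a - r_b) / r_a

# Hypothetical operation points (bit rate in kbit/s, PSNR in dB)
codec_a = [(1000, 36.0), (2000, 38.0), (4000, 40.0)]
codec_b = [(600, 36.0), (1200, 38.0), (2400, 40.0)]
saving = bit_rate_saving(codec_a, codec_b, 39.0)
print(f"bit-rate saving at 39 dB: {100 * saving:.1f} %")
```

Averaging such savings over the overlapping PSNR range approximates the integration mentioned above.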
Intra-Picture Coding
Still Image Coding / Intra-Picture Coding
Still Image Coding
  - Exchange, transmission and storage of images
  - Used in virtually all digital cameras and picture editing applications
  - Most widely used image coding standard: JPEG
  - Further standards: JPEG-2000, JPEG-XR, HEVC-Intra
Simple Approach for Video Coding
  - Separate coding of pictures of a video sequence
Hybrid Video Coding: Two types of block coding modes
  - Intra-picture coding modes
      - Represent blocks of samples without referring to other pictures
      - Utilize only dependencies inside pictures
  - Inter-picture coding modes
      - Utilize dependencies between pictures (motion-compensated prediction)
Intra-Picture Coding in Hybrid Video Codecs
Two different settings:
Intra Pictures
  - All blocks of a picture are coded using intra-picture coding modes
  - First picture of a video sequence
  - Required for “clean” random access / bitstream splicing
  - Can be advantageous in error-prone environments
Individual Intra Blocks
  - Some blocks of an inter picture are coded in intra-picture coding modes
  - Increases error robustness
  - Older standards: Stops accumulation of transform mismatches
  - Main reason: Coding efficiency (remember: Non-matched prediction can decrease coding efficiency)
Intra Blocks in Inter Pictures — Coding Efficiency
[Figure, left: PSNR (Y) [dB] over bit rate [Mbit/s] for Basketball Drive, IPPP, with all coding modes enabled and with intra-picture coding modes disabled in inter pictures. Right: bit-rate increase [%] over PSNR (Y) [dB] due to disabling intra-picture coding modes in inter pictures (on average, 10.9%).]
Example: IPPP coding with H.265 | MPEG-H HEVC
  - Disabling intra blocks in inter pictures: ca. 11% bit-rate increase
  - Certain regions of a picture cannot be well predicted using MCP
      - Uncovered background
      - Regions with non-translational motion
      - ...
JPEG Overview
The JPEG Standard
Joint Photographic Experts Group (JPEG)
  - Standard is named after the group which created it
  - Joint committee of ITU-T and ISO/IEC JTC 1
  - ITU-T Study Group 16, Question 6 (Visual Coding Experts Group: VCEG)
  - ISO/IEC Joint Technical Committee 1, Subcommittee 29, Working Group 1 (ISO/IEC JTC 1/SC 29/WG 1)
Standard: Digital Compression and Coding of Continuous-Tone Still Images
  - Officially ITU-T Rec. T.81 and ISO/IEC 10918-1
  - Commonly referred to as JPEG
  - Specifies compression for gray-level and color images
  - Work commenced in 1986
  - Standard published in 1992
  - Still the most widely used image compression standard
JPEG: Color Components and Partitioning
  - Color components (e.g., Y, Cb, Cr) are coded independently of each other
  - Color components are partitioned into 8×8 blocks (padding at borders)
  - The 8×8 blocks are coded using transform coding
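The partitioning of one color component could look like the sketch below. The slides only mention "padding at borders"; replicating the last row and column is an assumption made here, and `partition_into_blocks` is an illustrative name.

```python
def partition_into_blocks(component, size=8):
    """Split a 2D sample array into size x size blocks in raster order.

    Border blocks are padded by repeating the last row/column (assumed padding).
    """
    height, width = len(component), len(component[0])
    padded_h = -(-height // size) * size     # round up to a multiple of size
    padded_w = -(-width // size) * size
    blocks = []
    for by in range(0, padded_h, size):
        for bx in range(0, padded_w, size):
            blocks.append([[component[min(y, height - 1)][min(x, width - 1)]
                            for x in range(bx, bx + size)]
                           for y in range(by, by + size)])
    return blocks

# 10 rows x 12 columns of arbitrary sample values
component = [[10 * y + x for x in range(12)] for y in range(10)]
blocks = partition_into_blocks(component)
print(len(blocks))
```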
JPEG: Block Transform Coding (8× 8 Blocks)
[Figure: encoder: block of samples -> 2D transform -> scalar quantization -> entropy coding -> sequence of bits; decoder: sequence of bits -> entropy decoding -> decoder mapping -> 2D inverse transform -> reconstructed block]
2D Block Transform
  - Energy compaction (reduce dependencies between coefficients)
Scalar Quantization
  - Approximate signal in a suitable way
Entropy Coding
  - Represent quantization indexes with a small number of bits
2D Block Transform / Orthogonal Block Transform
Orthogonal Block Transform
Linear Transform
  - General case: Samples of a block are arranged in a vector $s_{vec}$
  - Forward and inverse transforms are given by

$$ t_{vec} = A \cdot s_{vec} \qquad \text{and} \qquad s'_{vec} = B \cdot t'_{vec} $$

  - $s'_{vec}$ is given by weighted sum of transform basis functions (columns of $B$)

$$ s'_{vec} = t'_0 \cdot b_0 + t'_1 \cdot b_1 + t'_2 \cdot b_2 + \cdots $$

Perfect Reconstruction
  - Perfect reconstruction in the absence of quantization

$$ \Longrightarrow \quad A = B^{-1} $$

Orthogonal Transforms
  - Transform basis functions $b_k$ are orthogonal to each other
  - Transform basis functions $b_k$ have unit norms ($b_k^T b_k = 1$)

$$ \Longrightarrow \quad A = B^{-1} = B^T $$
Orthogonal Block Transform
Orthogonal Block Transforms
  - Rotation (and reflection) in signal space (axes remain orthogonal)
  - Forward and inverse transforms are given by

$$ t_{vec} = B^T \cdot s_{vec} \qquad \text{and} \qquad s'_{vec} = B \cdot t'_{vec} $$

Main Advantage
  - SSD distortion in signal space = SSD distortion in transform domain

$$ D = \sum_k (s_k - s'_k)^2 = (s - s')^T (s - s') = (t - t')^T B^T B \, (t - t') = (t - t')^T (t - t') = \sum_k (t_k - t'_k)^2 $$

  - SSD distortion can be minimized with independent scalar quantizers
  - Lagrangian costs $D + \lambda \cdot R$ can be minimized using simple algorithms
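The equality of the two SSD values can be checked numerically. The sketch below uses a 2×2 rotation matrix as a stand-in for an orthogonal transform and a crude rounding to multiples of 2 as a stand-in for quantization; both choices are assumptions made only for illustration.

```python
import math

def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def ssd(a, b):
    """Sum of squared differences between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

theta = math.pi / 6
B = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]     # columns are orthonormal basis vectors

s = [10.0, 4.0]
t = matvec(transpose(B), s)                   # forward transform t = B^T s
t_rec = [2.0 * round(c / 2.0) for c in t]     # coarse approximation of the coefficients
s_rec = matvec(B, t_rec)                      # inverse transform s' = B t'

print(ssd(s, s_rec), ssd(t, t_rec))           # equal up to floating-point error
```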
2D Block Transform / Separable 2D Block Transforms
Separable 2D Block Transforms
Separable Transforms
  - N×M blocks of samples and transform coefficients
  - Forward and inverse transforms are given by

$$ t = B_V^T \cdot s \cdot B_H \qquad \text{and} \qquad s' = B_V \cdot t' \cdot B_H^T $$

Interpretation (forward transform)
  1. Transform all columns of the block using the vertical transform
  2. Transform all rows of the intermediate block using the horizontal transform
or
  1. Transform all rows of the block using the horizontal transform
  2. Transform all columns of the intermediate block using the vertical transform
Advantage
  - Significantly reduced complexity
  - Potential loss in coding efficiency is very small (due to 2D character of data)
2D Block Transform / JPEG: 2D DCT-II
JPEG: Discrete Cosine Transform (DCT) of Type II
Discrete Cosine Transform of type II (DCT-II)
  - Horizontal and vertical transforms $B_H$ and $B_V$ are DCTs of type II
  - N×N inverse transform matrix $B_{DCT} = \{b_{ik}\}$ is given by coefficients

$$ b_{ik} = \frac{a_k}{\sqrt{N}} \cos\left( \frac{\pi}{N} \, k \left( i + \frac{1}{2} \right) \right) \qquad \text{with} \qquad a_k = \begin{cases} 1 & : k = 0 \\ \sqrt{2} & : k > 0 \end{cases} $$

  - Each basis vector $b_k = \{b_{ik}\}$ represents a sampled cosine function, where the frequency $\omega_k = k \frac{\pi}{N}$ increases with increasing $k$
Why DCT-II?
  1. Fourier transform of double size applied to signal with mirror extension
       - Overcomes discontinuity issues of Fourier transform
  2. Optimal transform for Gauss-Markov sources with ρ → 1
       - Independent transform coefficients: KLT for Gauss-Markov with ρ → 1
  3. Fast algorithms for computing forward and inverse transform
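The matrix definition above can be turned into a direct (non-fast) implementation of the separable 2D transform. This is only a sketch with plain matrix products; practical codecs use the fast algorithms mentioned in item 3, and the helper names are illustrative.

```python
import math

def dct_matrix(n):
    """Inverse DCT-II matrix B = {b_ik}; column k holds basis vector b_k."""
    return [[(1.0 if k == 0 else math.sqrt(2.0)) / math.sqrt(n)
             * math.cos(math.pi / n * k * (i + 0.5))
             for k in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_2d(block, b):
    """t = B^T s B: vertical transform of the columns, horizontal transform of the rows."""
    return matmul(matmul(transpose(b), block), b)

def inverse_2d(coeffs, b):
    """s' = B t' B^T."""
    return matmul(matmul(b, coeffs), transpose(b))

B = dct_matrix(8)
block = [[(3 * y + 5 * x) % 17 for x in range(8)] for y in range(8)]  # arbitrary samples
coeffs = forward_2d(block, B)
recon = inverse_2d(coeffs, B)
print(coeffs[0][0], sum(map(sum, block)) / 8.0)   # DC coefficient = sum of samples / 8
```

Without quantization the inverse transform returns the original block up to floating-point error, which is the perfect-reconstruction property of an orthogonal transform.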
Basis Functions of DCT-II (size 8)
[Figure: plots of the eight DCT-II basis functions b0, b1, ..., b7]
JPEG: Separable 2D DCT-II for 8× 8 Blocks
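[Figure: original block -> horizontal DCT -> block after horizontal DCT -> vertical DCT -> block after 2D DCT; alternatively: original block -> vertical DCT -> block after vertical DCT -> horizontal DCT -> block after 2D DCT]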
Basis Images of Separable 8×8 DCT-II (used in JPEG)
2D Block Transform / Integer Approximation
Integer Approximations of DCT
Disadvantage of DCT
  - Most matrix coefficients are irrational numbers
  - Have to be approximated by binary numbers with finite precision
  - Mismatches if encoder and decoder use different approximations
  - Mismatches can accumulate if prediction between blocks/pictures is used
Transforms in newer Video Coding Standards (H.264 | AVC, H.265 | HEVC)
  - New video coding standards specify integer approximation of DCT
  - Same approximation is used in all implementations
  - No encoder/decoder mismatches due to different implementations
2D Block Transform / Coding Efficiency
Transform Gain
Coding efficiency of a transform
  - Difficult to evaluate
      - All components of a transform codec influence each other
  - For Gaussian sources, high rates, and entropy-constrained scalar quantizers
      - Transform gain (ratio of arithmetic and geometric mean of variances)

$$ G_T = 10 \cdot \log_{10} \left( \frac{\frac{1}{N} \sum_k \sigma_k^2}{\left( \prod_k \sigma_k^2 \right)^{1/N}} \right) $$

  - Transform gain $G_T$ represents a measure for the decorrelation property / energy compaction property of a transform
Karhunen-Loève transform (KLT)
  - Orthogonal transform that maximizes transform gain $G_T$
      - Optimal transform for Gaussian signals
  - KLT is signal dependent and, for 2D signals, it is a non-separable transform
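The transform gain formula can be evaluated as sketched below, assuming the variances of the transform coefficients have already been measured and are all positive. The function name and the example variances are made up.

```python
import math

def transform_gain_db(variances):
    """G_T in dB: arithmetic mean over geometric mean of coefficient variances."""
    n = len(variances)
    arithmetic = sum(variances) / n
    # geometric mean computed in the log domain to avoid overflow of the product
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return 10.0 * math.log10(arithmetic / geometric)

print(transform_gain_db([1.0, 1.0, 1.0, 1.0]))   # equal variances: no gain
print(transform_gain_db([4.0, 1.0]))             # unequal variances: positive gain
```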
Transform Gain of KLT, DCT-II, HEVC Integer Transform
[Figure: two bar charts of the loss relative to the non-separable KLT [dB] for the test sequences Bas, BQT, Cac, Kim, Par, each with bars for separable KLT, DCT (type II), and HEVC transform. The DCT transform gain (in dB) is shown above the bars. Original pictures: 17.05, 16.08, 18.99, 23.13, 16.56. Residual pictures: 2.87, 1.03, 2.08, 7.38, 2.69.]
Experimental investigation of transform gain
  - Blocks of 8×8 samples (original and residual) for 5 test sequences
  - Restriction to separable transforms has rather small impact
  - DCT and integer approximation slightly decrease transform gain
Note: Transform gain does not reflect all effects (for non-Gaussian sources)
Coding Optimal Transform (COT)
Coding Optimal Transform
  - No straightforward design criterion
  - Design transform, quantizer, entropy coding using iterative algorithm
  - Given: Lagrange multiplier λ and sufficiently large training set {s_k}
Example algorithm for designing a coding optimal transform
  1. Choose initial transform (e.g., KLT); given by inverse transform matrix B
  2. Generate transform coefficient vectors {t_k} by transforming all sample vectors {s_k} of the training set using the forward transform B^T
  3. Develop an ECSQ (using the given λ) for each transform coefficient
  4. Generate set of reconstructed transform coefficients {t'_k} using the quantizers
  5. Choose the inverse orthogonal transform matrix B that minimizes the MSE distortion D between {s_k} and {B t'_k}
  6. Repeat the previous four steps until convergence
Coding Efficiency of KLT, DCT-II, COT
[Figure: loss relative to the non-separable KLT [dB] over bit rate (first-order entropy) [Mbit/s] for Cactus; left: original pictures, right: residual pictures; curves for separable KLT, DCT (type II), and COT.]
Coding experiment for 8×8 blocks of original and residual pictures
  - Entropy-constrained scalar quantizers & optimal independent entropy coding
  - Compare non-separable KLT, separable KLT, DCT-II, and COT
2D DCT-II represents a reasonable choice for transform coding
  - Signal independent, low complexity, no side information
  - Rather small losses in coding efficiency compared to KLT & COT
Scalar Quantization
Scalar Quantization
Consider
  - Scalar quantization of transform coefficients
  - Separate quantization of transform coefficients
  - Assume independent, but optimal entropy coding
Optimal scalar quantizers
  - Entropy-constrained scalar quantizers (ECSQ)
  - ECSQs depend on distribution of transform coefficients
  - Require transmission of reconstruction levels
  - Not used in practical video codecs
Scalar quantizers used in practice
  - Uniform reconstruction quantizers (URQs)
  - URQs with extra-wide dead zone (older video coding standards)
  - Can be characterized by single parameter (step size)
Scalar Quantization / Distribution of Transform Coefficients
Distribution of Transform Coefficients
[Figure: histograms of transform coefficient values (probability density over transform coefficient value) with approximations by a Laplacian pdf and a Gaussian pdf. Left: transform coefficient t_{1,1} (residual), the Laplacian pdf gives the better fit. Right: transform coefficient t_{2,4} (residual), the Gaussian pdf gives the better fit.]
Experimental investigation for 8×8 DCT
  - Many coefficients can be well modeled by a Laplacian pdf
  - For other coefficients, Gaussian model provides better fit
  - Good model: Generalized Gaussian distribution (typically between Laplacian and Gaussian)
Scalar Quantization / Uniform Reconstruction Quantizers
Uniform Reconstruction Quantizers (URQs)
[Figure: URQ on the s axis with reconstruction levels s'_{-4}, ..., s'_4 located at -4·Δ, ..., 0, ..., 4·Δ, decision thresholds u_{-3}, ..., u_4, and quantization offsets z_0, ..., z_3 arranged symmetrically around zero]
Uniform reconstruction quantizers
  - Equally spaced reconstruction levels (indicated by step size Δ)
  - Simple decoder mapping

$$ t' = \Delta \cdot q $$

  - Encoder has freedom to adapt decision thresholds to source
  - Decision thresholds can be specified by quantization offsets z_k (see figure)
  - Iterative design algorithm similar to that for ECSQs
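A minimal sketch of a URQ with a single, sign-symmetric rounding offset. The parameter `offset` is a hypothetical parameterization chosen for illustration (0.5 gives plain rounding, smaller values widen the dead zone around zero); the exact definition of the offsets z_k from the figure is not given in the text, so no mapping to them is claimed.

```python
import math

def urq_quantize(t, step, offset=0.5):
    """Encoder side: map coefficient t to a quantization index q.

    offset (assumed parameter) shifts all decision thresholds; 0.5 is rounding.
    """
    sign = -1 if t < 0 else 1
    return sign * int(math.floor(abs(t) / step + offset))

def urq_reconstruct(q, step):
    """Decoder mapping t' = step * q."""
    return step * q

step = 2.0
for t in (7.4, -7.4, 1.2, 0.9):
    q_round = urq_quantize(t, step)               # plain rounding
    q_dead = urq_quantize(t, step, offset=1 / 3)  # wider dead zone
    print(t, q_round, urq_reconstruct(q_round, step), q_dead)
```

The decoder mapping is the same in both cases; only the encoder decision changes.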
Coding Efficiency Comparison: URQs vs Optimal ECSQs
[Figure: two plots of the SNR loss relative to ECSQ [dB] over rate (entropy) [bit/sample]. One panel shows URQ (opt.), URQ2T, and URQ1T; the other shows URQ (opt.), URQ3T, URQ2T, and URQ1T, where URQ1T has its maximum at ≈ 0.13327 dB.]
Experimental investigation for Laplacian and Gaussian sources
  - URQ (opt.): URQ with optimally selected decision thresholds
  - URQ1T: URQ with single quantization offset (z_0 = z_1 = z_2 = ···)
  - URQ2T: URQ with two quantization offsets (z_0 and z_1 = z_2 = ···)
Restriction to URQs has (typically) very small impact on efficiency
Scalar Quantization / Bit Allocation
Bit Allocation among Transform Coefficients
Optimal bit allocation
  - All scalar quantizers are designed using the same Lagrange multiplier λ
High-rate approximation for URQs
  - Operational distortion-rate function $D_k(R_k)$ for component quantizers

$$ D_k(R_k) = \varepsilon_k^2 \cdot \sigma_k^2 \cdot 2^{-2 R_k} $$

  - Optimal bit allocation

$$ \lambda = -\frac{dD_k}{dR_k} = 2 \ln 2 \cdot \varepsilon_k^2 \cdot \sigma_k^2 \cdot 2^{-2 R_k} = 2 \ln 2 \cdot D_k = \text{const} $$

  - High-rate approximation for distortion

$$ D_k = \frac{1}{12} \cdot \Delta_k^2 $$

  - High-rate bit allocation rule

$$ \Delta_k = \text{const} $$

Same quantization step size $\Delta_k$ for all transform coefficients
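The chain of high-rate formulas can be followed numerically. The sketch below assumes the same factor ε² for all coefficients and variances large enough for the high-rate approximation to give positive rates; the variances and λ are made-up values, and `allocate` is an illustrative name.

```python
import math

def allocate(variances, lam, eps2=1.0):
    """High-rate bit allocation for a common Lagrange multiplier lam.

    Returns one (rate, distortion, step size) tuple per transform coefficient.
    """
    d = lam / (2.0 * math.log(2.0))              # from lambda = 2 ln 2 * D_k
    step = math.sqrt(12.0 * d)                   # from D_k = step^2 / 12
    result = []
    for var in variances:
        rate = 0.5 * math.log2(eps2 * var / d)   # from D_k = eps^2 * var * 2^(-2 R_k)
        result.append((rate, d, step))
    return result

variances = [100.0, 25.0, 4.0]
for var, (rate, dist, step) in zip(variances, allocate(variances, lam=0.5)):
    print(f"variance {var}: rate {rate:.2f} bit, distortion {dist:.3f}, step size {step:.3f}")
```

Coefficients with larger variance receive more bits, while distortion and step size come out identical for all of them.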
Coding Experiment: Bit Allocation
[Figure: loss relative to optimal ECSQ [dB] over bit rate (first-order entropy) [Mbit/s] for residual pictures of Cactus (left) and Kimono (right); curves for URQ (same λ), URQ (same Δ), URQ2T (same Δ), and URQ1T (same Δ).]
Experimental investigation for 8×8 residual blocks
  - Reference: Optimal ECSQs with optimal bit allocation (same λ)
  - URQ (same λ): Optimal URQs, all designed for the same λ
  - URQ (same Δ): Optimal URQs with the same quantization step size Δ
  - URQXT: Restricted URQ design with X different quantization offsets
Simple and efficient: URQs with the same quantization step size
Scalar Quantization / JPEG
Quantization in JPEG: Uniform Reconstruction Quantizers
Inverse quantization (scaling) in decoder

$$ t'_{ik} = \Delta_{ik} \cdot q_{ik} $$

with
  - $q_{ik}$: Quantization index for coefficient at location (i, k) inside block
  - $\Delta_{ik}$: Quantization step size for coefficient at location (i, k)
  - $t'_{ik}$: Reconstructed transform coefficient at location (i, k)
Quantization (in encoder)
  - No normative encoding procedure, but informative quantization rule

$$ q_{ik} = \mathrm{round}\left( \frac{t_{ik}}{\Delta_{ik}} \right) \qquad (t_{ik}\text{: original transform coefficient}) $$

  - Can be improved (discussed later: rate-distortion optimized quantization)
Quantization weighting matrices
  - Separate quantization step sizes $\Delta_{ik}$ for each coefficient location (i, k)
  - Additional freedom for encoder optimization (perceptual optimization)
Quantization Weighting Matrices in JPEG
Quantization Weighting Matrices / Quantization Tables
  - Determine rate and distortion (among other parameters)
  - Need to be transmitted (no default tables in JPEG)
  - Example tables for YCbCr format are specified in Annex K of standard (empirically derived based on psychovisual threshold experiments)

luma blocks
  16  11  10  16  24  40  51  61
  12  12  14  19  26  58  60  55
  14  13  16  24  40  57  69  56
  14  17  22  29  51  87  80  62
  18  22  37  56  68 109 103  77
  24  35  55  64  81 104 113  92
  49  64  78  87 103 121 120 101
  72  92  95  98 112 100 103  99

chroma blocks
  17  18  24  47  99  99  99  99
  18  21  26  66  99  99  99  99
  24  26  56  99  99  99  99  99
  47  66  99  99  99  99  99  99
  99  99  99  99  99  99  99  99
  99  99  99  99  99  99  99  99
  99  99  99  99  99  99  99  99
  99  99  99  99  99  99  99  99

Optimal PSNR: Use same quantization step size for all coefficients
Entropy Coding / Statistical Dependencies
Entropy Coding of Transform Coefficient Levels: Scanning
For investigation of transforms and quantizers
  - Ignored potential dependencies between transform coefficient levels
  - Used sum of first-order entropies for approximating rate
Statistical dependencies & Scanning patterns
  - Transform coefficient levels are not independent of each other (see later)
  - High-frequency transform coefficient levels are more likely to be equal to zero
  - Scanning: Traverse coefficients from low to high frequency positions

Probabilities P(q_k ≠ 0) per coefficient position:
  0.242  0.108  0.053  0.009
  0.105  0.053  0.022  0.002
  0.046  0.017  0.006  0.001
  0.009  0.002  0.001  0.000

[Figure: scanning patterns, zig-zag scan (JPEG) and diagonal scan (HEVC)]
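The zig-zag scan can be generated by walking the anti-diagonals of the block in alternating directions. The sketch below is one possible way to produce the order for an n×n block; positions are given as (row, column), and the function names are illustrative.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for d in range(2 * n - 1):                     # d = row + col selects an anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:
            rows = reversed(rows)                  # even diagonals run from bottom-left to top-right
        order.extend((r, d - r) for r in rows)
    return order

def scan(block):
    """Convert a square block of levels into a 1D sequence in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

print(zigzag_order(8)[:10])
```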
Statistical Dependencies: Transform Coefficient Levels
Are there statistical dependencies?
  - Can compare marginal and conditional pmfs
      - Infeasible due to very large signal space
  - Instead: Evaluate coding methods that utilize potential dependencies
      - Motivated by approaches found in actual video coding standards
      - If levels are independent, no gain will be observed
Investigate coding concepts that exploit potential dependencies
  - Coded block flag (CBF): Signals whether all levels in block are equal to zero
  - End-of-block flag (EOB): Signals whether all following levels are equal to zero (transmitted at beginning and after each non-zero level)
  - LastPos: Transmit position of last non-zero level in advance
  - CtxNumSig: Conditional codes depending on number of already coded non-zero levels (in forward and backward scanning order)
Experiment: Statistical Dependencies
[Figure: bit-rate increase [%] over bit rate (first-order entropy) [Mbit/s] for residual pictures of Cactus (left) and Kimono (right); curves for EOB, LastPos, CtxNumSig (forward), CtxNumSig (backward), and CBF.]
Experimental investigation for 8×8 residual blocks (DCT + optimal URQs)
  - Investigate coding techniques: CBF, EOB, LastPos, CtxNumSig
  - No actual coding: Calculate entropy limits for the considered techniques
  - Compare limits with sum of marginal entropies (limit for independent coding)
There are statistical dependencies between the levels in a block
  - Can be utilized for an efficient entropy coding
Entropy Coding / Run-Level Coding
Entropy Coding Example: Run-Level Coding
Run-Level Coding (e.g., H.262 | MPEG-2 Video)
  - Scan block of transform coefficient levels (e.g., using zig-zag scan)
  - Map scanned sequence of transform coefficients to (run,level) pairs
      - run: Number of transform coefficient levels equal to zero that precede the next non-zero transform coefficient level
      - level: Value of the next non-zero transform coefficient level
  - Codewords are assigned to (run,level) pairs
  - Code includes an additional end-of-block symbol (eob)
      - Signals that all following transform coefficient levels are equal to zero
Example:
  - Scanned sequence of 20 transform coefficient levels
      5 −3 0 0 0 1 0 −1 0 0 −1 0 0 0 0 0 0 0 0 0
  - A conversion into run-level pairs (run,level) yields
      (0,5) (0,−3) (3,1) (1,−1) (2,−1) (eob)
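The conversion in the example can be sketched as follows. For simplicity this sketch always appends the end-of-block symbol, even if the last scanned level is non-zero; whether a real codec does so depends on the standard and is not modeled here.

```python
def run_level_pairs(levels):
    """Convert a scanned sequence of levels into (run, level) pairs plus "eob"."""
    pairs = []
    run = 0
    for level in levels:
        if level == 0:
            run += 1                     # count zeros preceding the next non-zero level
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("eob")                  # all remaining levels (if any) are zero
    return pairs

scanned = [5, -3, 0, 0, 0, 1, 0, -1, 0, 0, -1] + [0] * 9
print(run_level_pairs(scanned))
```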
Entropy Coding / JPEG
Entropy Coding in JPEG
Different concepts for DC and AC levels
  - DC levels: Differential coding using codeword tables
  - AC levels: Run-level coding of scanned coefficients
Coding of DC transform coefficient levels
  - DC level is predicted by previous DC level
  - Difference to predictor is coded using VLC and FLC
  - Category C is coded using VLC
      - Specifies range of values
      - Specifies number of following bits (for FLC)
  - FLC specifies actual value of DIFF inside category
      - If DIFF > 0, low-order bits of DIFF
      - If DIFF < 0, low-order bits of DIFF − 1
  - VLC table for category needs to be transmitted
      - Increases side information (no default table)
      - Allows adaptation to actual statistics
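The category and the fixed-length bits of a DC difference can be derived as sketched below. The category equals the number of bits needed to represent |DIFF|; the helper names are illustrative, and the codeword "100" for category 3 is taken from the recommended luma DC table of Annex K.

```python
def category(value):
    """Category C: number of bits needed for |value| (0 for value 0)."""
    return abs(value).bit_length()

def flc_bits(value):
    """Fixed-length bits that identify value inside its category."""
    c = category(value)
    if c == 0:
        return ""                                  # category 0 carries no extra bits
    low = value if value > 0 else value - 1        # DIFF or DIFF - 1
    return format(low & ((1 << c) - 1), f"0{c}b")  # low-order C bits

diff = 185 - 178                                   # current DC level minus previous DC level
codeword = {3: "100"}[category(diff)]              # only the table entry needed here
print(category(diff), codeword + flc_bits(diff))
```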
Example VLC Table for Coding DC Difference Category
Example VLC table for coding category C
  - Range of DIFF values is specified in standard
  - Codeword assignment has to be transmitted
  - Example shows recommended table for luma DC (Annex K of JPEG)

Category C | Range of DIFF values       | Example codeword
0          | 0                          | 00
1          | -1, 1                      | 010
2          | -3, -2, 2, 3               | 011
3          | -7..-4, 4..7               | 100
4          | -15..-8, 8..15             | 101
5          | -31..-16, 16..31           | 110
6          | -63..-32, 32..63           | 1110
7          | -127..-64, 64..127         | 11110
8          | -255..-128, 128..255       | 111110
9          | -511..-256, 256..511       | 1111110
10         | -1023..-512, 512..1023     | 11111110
11         | -2047..-1024, 1024..2047   | 111111110
Entropy Coding of AC Transform Coefficient Levels
Representation of AC levels
  - Convert into sequence using a zig-zag scan
  - AC coefficients are likely to be quantized to zero (in particular those at high-frequency locations)
  - Successive runs of zeros are represented using a run (number of consecutive levels equal to zero)
  - Non-zero levels are represented by a category and a value inside the category (same as for DC levels)
Coding of AC levels: Combination of VLC and FLC
  - Variable-length code table is used for coding events {run,category}
  - VLC table includes a special symbol (EOB) for signaling the end-of-block (all remaining AC levels are equal to zero)
  - Fixed-length code is used for coding the exact value inside a category (number of bits is given by category), same as for DC difference levels
  - VLC table has to be transmitted (no default table)
Example VLC Table for Run-Category Coding of AC Levels
Example: First entries of recommended table (Annex K of JPEG)

run/category | codeword          || run/category | codeword
EOB          | 1010              || 2/1          | 11100
0/1          | 00                || 2/2          | 11111001
0/2          | 01                || 2/3          | 1111110111
0/3          | 100               || 2/4          | 111111110100
0/4          | 1011              || 2/5          | 1111111110001001
0/5          | 11010             || 2/6          | 1111111110001010
0/6          | 1111000           || 2/7          | 1111111110001011
0/7          | 11111000          || 2/8          | 1111111110001100
0/8          | 1111110110        || 2/9          | 1111111110001101
0/9          | 1111111110000010  || 2/10         | 1111111110001110
0/10         | 1111111110000011  || 3/1          | 111010
1/1          | 1100              || 3/2          | 111110111
1/2          | 11011             || 3/3          | 111111110101
1/3          | 1111001           || 3/4          | 1111111110001111
1/4          | 111110110         || 3/5          | 1111111110010000
1/5          | 11111110110       || 3/6          | 1111111110010001
1/6          | 1111111110000100  || 3/7          | 1111111110010010
1/7          | 1111111110000101  || 3/8          | 1111111110010011
1/8          | 1111111110000110  || 3/9          | 1111111110010100
1/9          | 1111111110000111  || 3/10         | 1111111110010101
1/10         | 1111111110001000  || ···          | ···
Example: JPEG Transform Coefficient Level Coding
Coding of DC transform coefficient level
  - Last DC level: DC(N − 1) = 178
  - Prediction difference: DIFF = 185 − 178 = 7
  - Category C = 3: Codeword “100”
  - Fixed-length code (lowest 3 bits of “7”): “111”
  - Final bit representation (6 bit): “100111”
Coding of AC transform coefficient levels
  - Zig-zag scanning and conversion into (run,level) pairs yields
      (0, 3) (0, 1) (2, 1) (1,−1) (6,−3) (0,−1) (0,−2) (0,−1) (EOB)
  - Representation as (run,category) [FLC bits] sequence
      (0, 2)[11] (0, 1)[1] (2, 1)[1] (1, 1)[0] (6, 2)[00] (0, 1)[0] (0, 2)[01] (0, 1)[0] (EOB)
  - Bit sequence: VLC bits [FLC bits] (in total: 46 bits for 63 AC levels)
      01 [11] 00 [1] 11100 [1] 1100 [0] 111111110110 [00] 00 [0] 01 [01] 00 [0] 1010
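The AC part of the example can be reproduced with the sketch below. The table `AC_VLC` is deliberately partial: it holds only the codewords that occur in this example, copied from the bit sequence above. Zero runs longer than 15, which need special handling in JPEG, are not covered.

```python
# Partial (run, category) -> codeword table, restricted to the entries used in the example
AC_VLC = {
    (0, 1): "00", (0, 2): "01", (1, 1): "1100",
    (2, 1): "11100", (6, 2): "111111110110", "EOB": "1010",
}

def category(level):
    """Number of bits needed for |level|."""
    return abs(level).bit_length()

def flc_bits(level):
    """Low-order bits of level (positive) or level - 1 (negative)."""
    c = category(level)
    low = level if level > 0 else level - 1
    return format(low & ((1 << c) - 1), f"0{c}b")

def encode_ac(pairs):
    """Concatenate VLC and FLC bits for all (run, level) pairs, then append EOB."""
    bits = [AC_VLC[(run, category(level))] + flc_bits(level) for run, level in pairs]
    bits.append(AC_VLC["EOB"])
    return "".join(bits)

pairs = [(0, 3), (0, 1), (2, 1), (1, -1), (6, -3), (0, -1), (0, -2), (0, -1)]
bitstring = encode_ac(pairs)
print(bitstring, len(bitstring))
```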
JPEG Examples
JPEG Compression Example – Original (YCbCr 4:2:0, 12 bpp)
JPEG Compression Example – 1:10 Compression (1.2 bpp)
JPEG Compression Example – 1:25 Compression (0.48 bpp)
JPEG Compression Example – 1:50 Compression (0.24 bpp)
JPEG Compression Example – 1:100 Compression (0.12 bpp)
JPEG Compression Example – 1:200 Compression (0.06 bpp)
JPEG Features
JPEG Baseline
JPEG Baseline Features
  - Sequential processing of 8×8 blocks (all color components)
  - Transform: Separable 8×8 discrete cosine transform (DCT) of type II
  - Quantizer: Uniform reconstruction quantizer (using quantization matrix)
  - DC coding: Prediction and combination of VLC and FLC
  - AC coding: Zig-zag scan and run-level coding (combination of VLC & FLC)
Encoding options
  - Choose quantization and VLC tables
  - Determine transform coefficient levels (rounding / advanced quantization)
  - One-pass encoding:
      - Predefine tables and choose transform coefficient levels during encoding
  - Multi-pass encoding: One or more iterations between
      - Use given tables and choose transform coefficient levels during encoding
      - Optimize VLC tables based on given statistics of transform coefficient levels
JPEG – Beyond Baseline
Extended bit depth
  - Images with 1, 2, 3 or 4 color components and 8 or 12 bit per sample
Entropy coding
  - Up to 4 AC and DC entropy coding tables can be specified
  - Adaptive binary arithmetic coding can be optionally used as a replacement for variable-length coding (rarely supported)
Supported modes of operation
  - Sequential DCT-based coding (as in Baseline)
  - Progressive DCT-based coding
  - Lossless coding
  - Hierarchical coding
Extended file formats (not part of JPEG standard)
  - Additional information can be embedded in files (e.g., EXIF): Date/time, camera & lens model, aperture, shutter speed, focal length, etc.
Summary
Part Summary
JPEG Baseline: Transform coding of 8×8 blocks of samples
  - Transform coding using
      - Transform: Decorrelation / Energy Compaction
      - Quantization: Approximation of signal in suitable way
      - Entropy Coding: Represent quantization indexes with small number of bits
Transform and Quantization
  - Separable DCT-II
  - Uniform reconstruction quantizers (URQs)
  - Support of quantization weighting matrices (side information)
Entropy Coding
  - Different approaches for DC and AC coefficients
  - DC: Prediction + VLC for Category + Fixed-length coding
  - AC: Run-Category coding using VLC + Fixed-length coding
  - Transmission of entropy coding tables as side information