Discrete Class 1


  • 7/28/2019 Discrete Class 1


    DISCRETE COSINE TRANSFORMS

    ~ Jennie G. Abraham

    Fall 2009, EE5355

    Reference Book: THE TRANSFORM AND DATA COMPRESSION HANDBOOK,

    edited by K.R. Rao and P.C. Yip

    4.0 Transform Introduction

    In general, several characteristics are desirable in a transform used for data compression.
    Transforms are useful because they provide some or all of these characteristics:

    Data decorrelation: The ideal transform completely decorrelates the data in a sequence/block;
    i.e., it packs the most energy into the fewest coefficients. In this way, many

    coefficients can be discarded after quantization and prior to encoding. It is important to note

    that the transform operation itself does not achieve any compression. It aims at decorrelating

    the original data and compacting a large fraction of the signal energy into relatively few

    transform coefficients.

    Data-independent basis functions: Owing to the large statistical variations among data, the

    optimum transform usually depends on the data, and finding the basis functions of such

    transform is a computationally intensive task. This is particularly a problem if the data blocks

    are highly nonstationary, which necessitates the use of more than one set of basis functions to

    achieve high decorrelation. Therefore, it is desirable to trade optimum performance for a

    transform whose basis functions are data-independent.

    Fast implementation: The number of operations required for an n-point transform is generally
    of the order O(n^2). Some transforms have fast implementations, which reduce the number of
    operations to O(n log n). For a separable n × n 2-D transform, performing the row and column
    1-D transforms successively reduces the number of operations from O(n^4) to O(2n^2 log n).

    4.1 DCT Introduction


    The discrete cosine transform (DCT) and discrete sine transform (DST) are members of a family
    of sinusoidal unitary transforms. They are real, orthogonal, and separable, with fast algorithms for
    their computation. They are highly relevant to data compression.

    Sinusoidal unitary transform: an invertible linear transform whose kernel describes a
    complete set of orthogonal discrete cosine and/or sine basis functions.

    E.g.: KLT, generalized DFT, generalized discrete Hartley transform, and various types of

    the DCT and DST are members of this class of unitary transforms.

    The family of discrete trigonometric transforms consists of 8 versions of the DCT and 8 corresponding versions of the DST.

    Each transform is identified as EVEN or ODD and of type I, II, III, and IV.

    All present digital signal and image processing applications (mainly transform coding and

    digital filtering of signals) involve only even types of the DCT and DST.

    Therefore, we consider these four even types of DCT.

    Type      Authors                       Remarks
    DCT-I     Wang and Hunt                 defined for the order N+1
    DCT-II    Ahmed, Natarajan, and Rao     excellent energy compaction property; best approximation for the optimal KLT
    DCT-III   Ahmed, Natarajan, and Rao     inverse of DCT-II
    DCT-IV    Jain                          fast implementation of the lapped orthogonal transform for efficient transform/subband coding


    4.1.2 Definitions of DCTs

    Note:

    For the normalized even types of DCT in matrix form, evaluate the right-hand side for each n and
    k to obtain the matrix element at position (n, k).

    N is assumed to be an integer power of 2, i.e., N = 2^m.

    The subscript of a matrix denotes its order.

    The superscript denotes the version number.
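    For reference, a commonly used normalized matrix form of the four even DCT types is sketched below. The scaling factor $\epsilon_p$ and the index roles (n as the data index, k as the frequency index) are assumptions based on the usual convention, with the superscript giving the version and the subscript the order.

```latex
\begin{aligned}
\left[C^{I}_{N+1}\right]_{nk} &= \sqrt{\tfrac{2}{N}}\;\epsilon_n \epsilon_k \cos\frac{\pi n k}{N},
  & n, k &= 0, 1, \dots, N \\
\left[C^{II}_{N}\right]_{nk} &= \sqrt{\tfrac{2}{N}}\;\epsilon_k \cos\frac{\pi (2n+1) k}{2N},
  & n, k &= 0, 1, \dots, N-1 \\
\left[C^{III}_{N}\right]_{nk} &= \sqrt{\tfrac{2}{N}}\;\epsilon_n \cos\frac{\pi (2k+1) n}{2N},
  & n, k &= 0, 1, \dots, N-1 \\
\left[C^{IV}_{N}\right]_{nk} &= \sqrt{\tfrac{2}{N}}\;\cos\frac{\pi (2n+1)(2k+1)}{4N},
  & n, k &= 0, 1, \dots, N-1
\end{aligned}
\qquad
\epsilon_p =
\begin{cases}
  1/\sqrt{2}, & p = 0 \text{ or } p = N \\
  1, & \text{otherwise}
\end{cases}
```

    In this form the DCT-III matrix is the transpose of the DCT-II matrix, which matches its role as the inverse of DCT-II.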

    4.1.3 Mathematical Properties

    DCT Matrices are real and orthogonal

    Unitary Property


    Linearity Property

    $M(\alpha f + \beta g) = \alpha M f + \beta M g$ for a matrix M, constants $\alpha$ and $\beta$, and vectors
    g and f; all DCTs are linear transforms.

    The Convolution-Multiplication Property

    Convolution in the spatial domain is equivalent to taking an inverse transform of the

    product of forward transforms of two data sequences.

    The convolution multiplication property is a powerful tool for performing

    digital filtering in the transform domain.

    All DCTs are separable transforms: a multidimensional transform can be decomposed
    into successive application of one-dimensional (1-D) transforms in the appropriate
    directions.
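    The separability property can be sketched in a few lines of Python. The example assumes the orthonormal DCT-II as the 1-D transform; the function names `dct_matrix`, `dct_1d`, and `dct_2d` are hypothetical and chosen only for illustration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed convention): row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * m + 1) * k * math.pi / (2 * n))
                     for m in range(n)])
    return rows

def dct_1d(vec, c):
    """Apply the transform matrix c to one vector."""
    return [sum(c[k][m] * vec[m] for m in range(len(vec))) for k in range(len(c))]

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    n = len(block)
    c = dct_matrix(n)
    rows_done = [dct_1d(row, c) for row in block]
    cols = [dct_1d([rows_done[r][col] for r in range(n)], c) for col in range(n)]
    # cols[col][k] holds coefficient (k, col); transpose back to row-major order.
    return [[cols[col][k] for col in range(n)] for k in range(n)]

block = [[10.0] * 4 for _ in range(4)]   # hypothetical constant 4 x 4 block
coeffs = dct_2d(block)                   # all energy lands in coeffs[0][0]
```

    For a constant block, only the top-left (DC) coefficient is nonzero, which is the energy compaction behavior described earlier in its simplest form.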


    4.3 Relations to the KLT

    KLT is an optimal transform for data compression in a statistical sense because

    it decorrelates a signal in the transform domain,

    packs the most information in a few coefficients, and

    minimizes the mean-square error between the reconstructed and original signal compared to
    any other transform.

    However, KLT is constructed from the eigenvalues and the corresponding eigenvectors of a

    covariance matrix of the data to be transformed; it is signal-dependent, and there is no

    general algorithm for its fast computation.

    The family of DCTs is asymptotically equivalent to the KLT for a first-order
    stationary Markov process, in terms of the transform size N and the adjacent (inter-element) correlation
    coefficient ρ.

    The performance of DCTs, particularly important in transform coding, is associated with the KLT.

    For finite-length data, DCTs and DSTs provide different approximations to the KLT, and the best
    approximating transform varies with the value of the correlation coefficient ρ. E.g.:

    Correlation coefficient     KLT is reduced to
    ρ → 1                       DCT-II (DCT-III)
    ρ → 0                       DST-I
    ρ → -1                      DST-II (DST-III)

    For infinite-length data, i.e., as the transform size N increases (N tends to infinity),
    the KLT is reduced to DCT-I or DCT-IV.

    This asymptotic behavior implies that DCTs and DSTs can be used as substitutes for KLT of

    certain random processes.
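    One way to see why the DCT can stand in for the KLT is to check how well it decorrelates a first-order Markov process. The sketch below assumes a unit-variance process with covariance ρ^|i-j| and the orthonormal DCT-II; the helper names and the choice ρ = 0.9, N = 8 are illustrative assumptions, not values from the notes.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed convention)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * m + 1) * k * math.pi / (2 * n))
                     for m in range(n)])
    return rows

def markov_covariance(rho, n):
    """Covariance matrix of a unit-variance first-order Markov process."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def transform_covariance(r):
    """Return C R C^T, the covariance matrix of the DCT coefficients."""
    n = len(r)
    c = dct_matrix(n)
    cr = [[sum(c[i][k] * r[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[sum(cr[i][k] * c[j][k] for k in range(n)) for j in range(n)] for i in range(n)]

def off_diagonal_energy(mat):
    """Fraction of the squared-magnitude energy lying off the main diagonal."""
    total = sum(v * v for row in mat for v in row)
    diag = sum(mat[i][i] ** 2 for i in range(len(mat)))
    return (total - diag) / total

r = markov_covariance(0.9, 8)
t = transform_covariance(r)
before = off_diagonal_energy(r)   # correlation among the original samples
after = off_diagonal_energy(t)    # residual correlation among DCT coefficients
```

    A KLT would drive the off-diagonal energy exactly to zero; the DCT leaves a small residual, and most of the variance ends up in the first few diagonal entries of `t`.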

    4.4 Relation to DFT [Question (?)]


    DCT is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only

    real numbers. DCTs are equivalent to DFTs of roughly twice the length, operating on real data

    with even symmetry. The obvious distinction between a DCT and a DFT is that the former uses

    only cosine functions, while the latter uses both cosines and sines (in the form of complex

    exponentials).

    Compared with DFT, DCT has two main advantages:

    It is a real transform, with better computational efficiency than the DFT, which by definition is a
    complex transform.

    It does not introduce a discontinuity while imposing periodicity on the time signal. In the DFT, as
    the time signal is truncated and assumed periodic, a discontinuity is introduced in the time
    domain and corresponding artifacts are introduced in the frequency domain. But as even
    symmetry is assumed while truncating the time signal, no discontinuity and related artifacts
    are introduced in the DCT.

    4.5 Relevance to data compression: DCT-II

    Performance of DCT-II is closest to the statistically optimal KLT based on a number of
    performance criteria:

    variance distribution,

    energy packing efficiency,

    residual correlation,

    rate distortion,

    maximum reducible bits


    Exhibition of desirable characteristics for data compression namely,

    o Data decorrelation

    o Data-independent basis functions

    o Fast implementation

    The importance of DCT-II is further accentuated by its:

    Superiority in bandwidth compression (redundancy reduction) of a wide range of signals.

    Powerful performance in the bit-rate reduction.

    Existence of fast algorithms for its implementation.

    DCT-II and its inverse, DCT-III, have been employed in international image/video coding
    standards, e.g., JPEG, MPEG, H.261, H.263, and H.264.

    4.6 DCT Computation

    4.6.1 DCT Definition

    4.6.2 DCT Matrix Form:


    Example of a 4x4 DCT Matrix:

    Example of a 4x4 IDCT Matrix:

    Example: A -point DCT matrix can be generated by

    Assume the signal is , then its DCT transform is:


    The inverse transform is:
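    The matrices in the examples above can be generated numerically. The following Python sketch assumes the orthonormal DCT-II convention (first row scaled by sqrt(1/N), the others by sqrt(2/N)); the function names and the sample signal are hypothetical and chosen only for illustration.

```python
import math

def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix as a list of rows."""
    matrix = []
    for k in range(n):                      # k: frequency (row) index
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        matrix.append([scale * math.cos((2 * m + 1) * k * math.pi / (2 * n))
                       for m in range(n)])  # m: sample (column) index
    return matrix

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matvec(a, v):
    """Multiply matrix a by vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

C = dct_matrix(4)            # forward 4 x 4 DCT matrix
C_inv = transpose(C)         # IDCT matrix: the transpose, because C is orthogonal

signal = [1.0, 2.0, 3.0, 4.0]          # hypothetical sample signal
coeffs = matvec(C, signal)             # forward transform
restored = matvec(C_inv, coeffs)       # inverse transform recovers the signal
```

    Because the matrix is orthogonal, no separate inversion is needed: the inverse transform is a multiplication by the transpose.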

    4.6.3 Computation of DCT from DFT (using 2N point FFT):

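    To derive the DCT of an N-point real signal sequence $\{x[0], \dots, x[N-1]\}$, we
    first construct a new sequence of $2N$ points:

    $$x'[m] = \begin{cases} x[m], & 0 \le m \le N-1 \\ x[-m-1], & -N \le m \le -1 \end{cases}$$

    This 2N-point sequence is assumed to repeat itself outside the range
    $-N \le m \le N-1$, i.e., it is periodic with period $2N$, and it is even symmetric with respect
    to the point $m = -1/2$:

    $$x'[m] = x'[-m-1]$$

    If we shift the signal $x'[m]$ to the right by 1/2, or, equivalently, shift the index $m$ to the left by 1/2 by
    defining another index $m' = m + 1/2$, then $x'[m' - 1/2]$ is even symmetric with
    respect to $m' = 0$. In the following we simply represent this new function by $x[m' - 1/2]$.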


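    The DFT of this 2N-point even symmetric sequence can be found as:

    $$X[n] = \frac{1}{\sqrt{2N}} \sum_{m'=-N+1/2}^{N-1/2} x[m'-\tfrac{1}{2}]\, e^{-j2\pi m' n/2N}
           = \frac{1}{\sqrt{2N}} \sum_{m'} x[m'-\tfrac{1}{2}] \cos\frac{2\pi m' n}{2N}
           - \frac{j}{\sqrt{2N}} \sum_{m'} x[m'-\tfrac{1}{2}] \sin\frac{2\pi m' n}{2N}$$

    Since $x[m'-1/2]$ is even and $\sin(2\pi m' n/2N)$ is odd with respect to $m' = 0$, all terms
    in the second summation are odd and the summation is zero (while all terms in the first summation
    are even). It can also be seen that every $X[n]$ is real and even ($X[-n] = X[n]$). Next, we replace $m'$
    by $m + 1/2$ and get

    $$X[n] = \frac{1}{\sqrt{2N}} \sum_{m=-N}^{N-1} x[m] \cos\frac{(2m+1) n \pi}{2N}
           = \sqrt{\frac{2}{N}} \sum_{m=0}^{N-1} x[m] \cos\frac{(2m+1) n \pi}{2N}$$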


    Note that since the terms in the summation are even symmetric, only the first half of the data
    points need to be used. Moreover, as the cosine function is even, $X[n]$ is also even, and since
    $\cos\big((2m+1)(n+2N)\pi/2N\big) = -\cos\big((2m+1)n\pi/2N\big)$, we have

    $$X[N+n] = -X[N+n-2N] = -X[-(N-n)] = -X[N-n],$$

    indicating that a point $X[N+n]$ ($n = 0, \dots, N-1$) in the second half is determined (up to sign) by its
    corresponding point $X[N-n]$ in the first half, i.e., the second half is redundant and therefore
    can be dropped.
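    The construction above can be checked numerically. The sketch below builds the mirrored 2N-point sequence (indexed 0 to 2N-1, which is the same periodic extension as the range -N to N-1), takes its DFT, and applies the half-sample shift as a phase factor. A direct O(n^2) DFT stands in for an FFT to keep the example short, normalization is omitted, and all function names are hypothetical.

```python
import cmath
import math

def dft(seq):
    """Direct DFT (O(n^2)); stands in for an FFT to keep the sketch short."""
    n = len(seq)
    return [sum(seq[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def dct_via_2n_dft(x):
    """Unnormalized DCT-II sums from the DFT of the mirrored 2N-point sequence."""
    n = len(x)
    extended = list(x) + list(reversed(x))       # even-symmetric extension, length 2N
    spectrum = dft(extended)
    # The half-sample shift appears as the phase factor exp(-j*pi*k/(2N)).
    return [0.5 * (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
            for k in range(n)]

def dct_direct(x):
    """Unnormalized DCT-II sums evaluated directly from the cosine definition."""
    n = len(x)
    return [sum(x[m] * math.cos((2 * m + 1) * k * math.pi / (2 * n)) for m in range(n))
            for k in range(n)]

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # hypothetical test signal
via_dft = dct_via_2n_dft(x)
direct = dct_direct(x)
```

    Only the first N outputs of the 2N-point DFT are used, in line with the redundancy argument above.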

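    Now we have the discrete cosine transform (DCT):

    $$X[n] = \sqrt{\frac{2}{N}} \sum_{m=0}^{N-1} x[m] \cos\frac{(2m+1) n \pi}{2N} = \sum_{m=0}^{N-1} c[n,m]\, x[m], \qquad n = 0, \dots, N-1$$

    where $c[n,m] = \sqrt{2/N}\,\cos\big((2m+1)n\pi/2N\big)$ is the element in the nth row and mth column of the DCT matrix.

    All row vectors of this DCT matrix are orthogonal and normalized except the first one ($n = 0$), whose norm is $\sqrt{2}$.

    It is straightforward to show that a DCT matrix is orthonormal for n even, since the norm of each row is unity and the dot product of any pair of rows is zero (the product terms may be
    expressed as the sum of a pair of cosine functions, which are each zero mean).

    To make the DCT an orthonormal transform, we define a coefficient

    $$a[n] = \begin{cases} \sqrt{1/N}, & n = 0 \\ \sqrt{2/N}, & n = 1, \dots, N-1 \end{cases}$$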


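    so that the DCT now becomes

    $$X[n] = a[n] \sum_{m=0}^{N-1} x[m] \cos\frac{(2m+1) n \pi}{2N} = \sum_{m=0}^{N-1} c[n,m]\, x[m]$$

    where $c[n,m]$ is modified with $a[n]$, i.e., $c[n,m] = a[n] \cos\big((2m+1)n\pi/2N\big)$, which is also the component in the nth row and mth
    column of the N by N cosine transform matrix $C$.

    Here $c_i^T$ is the ith row of the DCT transform matrix $C$. As these
    row vectors are orthonormal, $c_i^T c_j = \delta_{ij}$,

    the DCT matrix is orthogonal: $C^{-1} = C^T$.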

    The inverse DCT is

    or in matrix form:
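    Written out under the orthonormal convention, the inverse pair might look as follows. This is a sketch that assumes the scaling coefficient $a[n]$ equals $\sqrt{1/N}$ for $n = 0$ and $\sqrt{2/N}$ otherwise, and uses $C$ for the N by N matrix of the forward transform; the vector symbols $\mathbf{x}$ and $\mathbf{X}$ are assumed notation.

```latex
x[m] = \sum_{n=0}^{N-1} a[n]\, X[n] \cos\frac{(2m+1)\, n\, \pi}{2N},
\qquad m = 0, 1, \dots, N-1

\mathbf{X} = C\,\mathbf{x},
\qquad
\mathbf{x} = C^{-1}\mathbf{X} = C^{T}\mathbf{X}
```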

    4.6.4 DCT Fast Algorithms:

    1. N point DCT via 2N point FFT

    2. N point DCT via N point FFT


    3. Recursive Fast Algorithm

    4. Sparse Matrix Factors

    5. Prime Factor Algorithm for DCT

    6. DIT & DIF Algorithms for DCT

    Fast DCT algorithm

    Forward DCT

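    The DCT of a sequence $\{x[m],\ m = 0, \dots, N-1\}$ can be implemented by FFT. First
    we define a new sequence $\{y[m]\}$:

    $$y[m] = x[2m], \qquad y[N-1-m] = x[2m+1], \qquad m = 0, \dots, N/2-1$$

    Then the DCT of $x[m]$ can be written as the following (the coefficient $a[n]$ is dropped for now for
    simplicity):

    $$X[n] = \sum_{m=0}^{N/2-1} y[m] \cos\frac{(4m+1) n \pi}{2N} + \sum_{m=0}^{N/2-1} y[N-1-m] \cos\frac{(4m+3) n \pi}{2N}$$

    where the first summation is for all even terms and the second for all odd terms. For the second
    summation we define $m' = N-1-m$; then the limits of the summation, $0$ and $N/2-1$ for $m$,
    become $N-1$ and $N/2$ for $m'$, and the second summation can be written as

    $$\sum_{m'=N/2}^{N-1} y[m'] \cos\Big(2n\pi - \frac{(4m'+1) n \pi}{2N}\Big) = \sum_{m'=N/2}^{N-1} y[m'] \cos\frac{(4m'+1) n \pi}{2N}$$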


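    where the equal sign is due to the trigonometric identity $\cos(2n\pi - \alpha) = \cos\alpha$.

    Now the two summations in the expression of $X[n]$ can be combined:

    $$X[n] = \sum_{m=0}^{N-1} y[m] \cos\frac{(4m+1) n \pi}{2N}$$

    Next, consider the DFT of $y[m]$:

    $$Y[n] = \sum_{m=0}^{N-1} y[m]\, e^{-j2\pi mn/N}$$

    If we multiply both sides by $e^{-jn\pi/2N}$
    and take the real part of the result (and keep in mind that both $x[m]$ and $y[m]$ are real), we get:

    $$\mathrm{Re}\big[e^{-jn\pi/2N}\, Y[n]\big]
      = \sum_{m=0}^{N-1} y[m] \Big[\cos\frac{2\pi mn}{N}\cos\frac{n\pi}{2N} - \sin\frac{2\pi mn}{N}\sin\frac{n\pi}{2N}\Big]
      = \sum_{m=0}^{N-1} y[m] \cos\frac{(4m+1) n \pi}{2N}$$

    The last equal sign is due to the trigonometric identity $\cos\alpha\cos\beta - \sin\alpha\sin\beta = \cos(\alpha+\beta)$.

    This expression is identical to that for $X[n]$ above, therefore we get

    $$X[n] = \mathrm{Re}\big[e^{-jn\pi/2N}\, Y[n]\big]$$

    where $Y[n]$ is the DFT of $y[m]$ (defined from $x[m]$), which can be computed using the FFT
    algorithm with time complexity $O(N \log_2 N)$.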


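    In summary, the fast forward DCT can be implemented in 3 steps:

    Step 1: Generate a sequence $y[m]$ from the given sequence $x[m]$:
    $y[m] = x[2m]$, $y[N-1-m] = x[2m+1]$, for $m = 0, \dots, N/2-1$.

    Step 2: Obtain the DFT $Y[n]$ of $y[m]$ using FFT. (As $y[m]$ is real, $Y[n]$ is conjugate symmetric and
    only half of the data points need be computed.)

    Step 3: Obtain the DCT $X[n]$ from $Y[n]$ by $X[n] = \mathrm{Re}\big[e^{-jn\pi/2N}\, Y[n]\big]$.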
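    The three steps can be sketched in Python as follows. This is a minimal illustration, assuming N is a power of 2 with N ≥ 2 and restoring the orthonormal scaling coefficient at the end; the small recursive radix-2 `fft` helper and the other function names are hypothetical, and the helper computes the full spectrum rather than exploiting the symmetry noted in step 2.

```python
import cmath
import math

def fft(a):
    """Recursive radix-2 decimation-in-time FFT; len(a) must be a power of 2."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fast_dct(x):
    """Orthonormal DCT-II of x via one N-point FFT (N a power of 2, N >= 2)."""
    n = len(x)
    # Step 1: even-indexed samples in order, odd-indexed samples reversed.
    y = [0.0] * n
    for m in range(n // 2):
        y[m] = x[2 * m]
        y[n - 1 - m] = x[2 * m + 1]
    # Step 2: N-point DFT of y.
    spectrum = fft(y)
    # Step 3: rotate by exp(-j*k*pi/(2N)), keep the real part, apply the scaling.
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
            for k in range(n)]

def dct_direct(x):
    """Orthonormal DCT-II straight from the cosine definition, for comparison."""
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[m] * math.cos((2 * m + 1) * k * math.pi / (2 * n)) for m in range(n))
            for k in range(n)]

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # hypothetical test signal
fast = fast_dct(x)
slow = dct_direct(x)
```

    The FFT-based result agrees with the direct O(N^2) evaluation to rounding error, while needing only one N-point FFT plus O(N) extra work.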

    Inverse DCT

    The most obvious way to do inverse DCT is to reverse the order and the mathematical operations

    of the three steps for the forward DCT:

    Step 1: Obtain $Y[n]$ from $X[n]$. In step 3 above there are N equations but 2N variables
    (both real and imaginary parts of $Y[n]$). However, note that as $y[m]$ are real, the real
    part of its spectrum is even (N/2+1 independent variables) and the imaginary part odd (N/2-1
    independent variables). So there are only N variables, which can be obtained by solving the
    N equations.

    Step 2: Obtain $y[m]$ from $Y[n]$ by inverse DFT, also using FFT, in $O(N \log_2 N)$ complexity.

    Step 3: Obtain $x[m]$ from $y[m]$ by $x[2m] = y[m]$, $x[2m+1] = y[N-1-m]$, for $m = 0, \dots, N/2-1$.


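    However, there is a more efficient way to do the inverse DCT. Consider first the real part of the
    inverse DFT of the sequence $X[n]\, e^{jn\pi/2N}$ (the coefficient $a[n]$ is again omitted for simplicity):

    $$\mathrm{Re}\Big[\sum_{n=0}^{N-1} X[n]\, e^{jn\pi/2N}\, e^{j2\pi mn/N}\Big] = \sum_{n=0}^{N-1} X[n] \cos\frac{(4m+1) n \pi}{2N} = x[2m]$$

    This equation gives the inverse DCT of all even data
    points $x[2m]$ ($m = 0, \dots, N/2-1$). To obtain the odd data points, recall
    that $x[2m+1] = y[N-1-m]$, so all odd data points
    can be obtained from the second half of the previous equation ($m = N/2, \dots, N-1$), taken in reverse order.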

    In summary, we have these steps to compute IDCT:

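    Step 1: Generate a sequence $Y[n]$ from the given DCT sequence $X[n]$: $Y[n] = X[n]\, e^{jn\pi/2N}$.

    Step 2: Obtain $y[m]$ from $Y[n]$ by inverse DFT, also using FFT. (Only the real part need
    be computed.)

    Step 3: Obtain $x[m]$ from $y[m]$ by $x[2m] = y[m]$, $x[2m+1] = y[N-1-m]$, for $m = 0, \dots, N/2-1$.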

    These three steps are mathematically equivalent to the steps of the first method.
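    The more efficient inverse can be sketched the same way. The example assumes the orthonormal DCT-II convention, so the scaling coefficient is folded into step 1, and it assumes N is a power of 2 with N ≥ 2. The inverse DFT is taken without the usual 1/N factor, and the helper names are hypothetical.

```python
import cmath
import math

def fft(a):
    """Recursive radix-2 decimation-in-time FFT; len(a) must be a power of 2."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fast_idct(coeffs):
    """Inverse of the orthonormal DCT-II via one N-point FFT (N a power of 2, N >= 2)."""
    n = len(coeffs)
    # Step 1: scale each coefficient and rotate it by exp(+j*k*pi/(2N)).
    spectrum = [math.sqrt((1.0 if k == 0 else 2.0) / n) * coeffs[k]
                * cmath.exp(1j * cmath.pi * k / (2 * n)) for k in range(n)]
    # Step 2: unnormalized inverse DFT, computed as conj(FFT(conj(Y)));
    # only the real part is needed, and conjugation does not change it.
    y = [v.real for v in fft([s.conjugate() for s in spectrum])]
    # Step 3: undo the even/odd reordering.
    x = [0.0] * n
    for m in range(n // 2):
        x[2 * m] = y[m]
        x[2 * m + 1] = y[n - 1 - m]
    return x

def dct_direct(x):
    """Orthonormal DCT-II straight from the cosine definition, for the round trip."""
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[m] * math.cos((2 * m + 1) * k * math.pi / (2 * n)) for m in range(n))
            for k in range(n)]

signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # hypothetical test signal
restored = fast_idct(dct_direct(signal))
```

    A forward transform followed by `fast_idct` returns the original samples to rounding error.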


    Data Compression

    Although representing images in digital form allows visual information to be easily manipulated in

    useful and novel ways, there is one potential problem with digital images: the large number of

    bits required to represent even a single digital image directly. The need for image compression

    becomes apparent when we compute the number of bits per image resulting from typical sampling

    and quantization schemes. We consider the amount of storage for the Lena digital image shown

    in Fig. 4.7.

    The monochrome (grayscale) version of this image, with a resolution of 512 × 512 pixels at 8 bits/pixel,
    requires a total of 2,097,152 bits, or equivalently 262,144 bytes. The color version of the same
    image in RGB format (red, green, and blue color bands) with a resolution of 8 bits/color requires a
    total of 6,291,456 bits (= 512 × 512 × 3 × 8 bits), or 786,432 bytes. Such an image should be

    compressed for efficient storage or transmission.
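    The storage figures above follow from a simple product, sketched here in Python; the function name `image_bits` is hypothetical.

```python
def image_bits(width, height, bands, bits_per_sample):
    """Bits needed to store an uncompressed image."""
    return width * height * bands * bits_per_sample

gray_bits = image_bits(512, 512, 1, 8)   # monochrome, 8 bits/pixel
rgb_bits = image_bits(512, 512, 3, 8)    # RGB, 8 bits per color band

gray_bytes = gray_bits // 8
rgb_bytes = rgb_bits // 8
```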

    In order to utilize digital images effectively, specific techniques are needed to reduce the number

    of bits required for their representation. Fortunately, digital images generally contain a significant

    amount of redundancy (spatial, spectral, or temporal redundancy). Image data compression (the

    art/science of efficient coding of the picture data) aims at taking advantage of this redundancy to

    reduce the number of bits required to represent an image. This can result in significantly reducing

    the memory needed for image storage and channel capacity for image transmission.


    Image compression methods can be classified into two fundamental groups: lossless and lossy.

    Lossless compression -

    Reconstructed image after compression is identical to the original image.

    Modest 1:2 or 1:3 compression ratios are achieved.

    Lossy compression -

    Reconstructed image contains degradations relative to the original.

    Generally, more compression is obtained at the expense of more distortion.

    Transform Coding Compression Scheme: [Question (?)]

    The most used lossy compression technique is transform coding.

    A general transform coding scheme involves subdividing an N × N image into smaller
    nonoverlapping n × n sub-image blocks and performing a unitary transform on each block. The
    transform operation itself does not achieve any compression. It aims at decorrelating the original
    data and compacting a large fraction of the signal energy into a relatively small set of transform
    coefficients (energy packing property). In this way, many coefficients can be discarded after
    quantization and prior to encoding.

    In principle, the DCT introduces no loss to the source samples; it merely transforms them to a domain
    in which they can be more efficiently encoded.
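    The role of quantization can be illustrated on a single 1-D block. The Python sketch below assumes the orthonormal DCT-II and a plain uniform quantizer with one step size for all coefficients; the block values, the step size, and the function names are hypothetical, and real coders such as JPEG use 2-D blocks with a per-coefficient quantization table.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed convention)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * m + 1) * k * math.pi / (2 * n))
                     for m in range(n)])
    return rows

def encode_block(block, step):
    """DCT the block, then uniformly quantize each coefficient to an integer index."""
    c = dct_matrix(len(block))
    coeffs = [sum(c_km * x_m for c_km, x_m in zip(row, block)) for row in c]
    return [int(round(v / step)) for v in coeffs]

def decode_block(indices, step):
    """Dequantize, then apply the inverse DCT (transpose of the forward matrix)."""
    n = len(indices)
    c = dct_matrix(n)
    coeffs = [q * step for q in indices]
    return [sum(c[k][m] * coeffs[k] for k in range(n)) for m in range(n)]

block = [100.0 + 2.0 * m for m in range(8)]   # hypothetical smooth 8-sample block
indices = encode_block(block, step=4.0)       # most indices quantize to zero
restored = decode_block(indices, step=4.0)    # close to, but not equal to, block
max_error = max(abs(a - b) for a, b in zip(block, restored))
```

    The transform alone is lossless; the loss, and the compression, come from the quantizer, which turns the small high-frequency coefficients of a smooth block into zeros that cost almost nothing to encode.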


    Most practical transform coding systems are based on the DCT of types II and III, for the following reasons:

    It provides a good compromise between energy packing ability and computational complexity.

    Its energy packing ability is superior to that of most other fixed (data-independent) unitary transforms and close to that of the optimal KLT.

    Transforms that redistribute or pack the most information into the fewest coefficients

    provide the best sub-image approximations and, consequently, the smallest reconstruction

    errors.

    DCT basis images are fixed (image independent) as opposed to the optimal KLT which is

    data dependent.

    E.g.: DCT-Based Image Compression/Decompression

    Block diagram of encoder and decoder for JPEG DCT-based image compression and

    decompression.