View
252
Download
0
Category
Preview:
Citation preview
7/22/2019 DCT Presentation1
1/39
A
Presentation on
Design & Verification of DCT
Algorithm
Guided by:- Submitted by:-
Mr. Preet Jain Atush Jain(Asst. Prof. (0802EC09ME03)
EC Department)
7/22/2019 DCT Presentation1
2/39
Abstract
DCT is abbreviated as DiscreteCosine Transform.
It can be regarded as a discrete time version of the FourierCosine series.
It is technique for converting a signal into elementaryfrequency components.
It is very Common and well known algorithm, used for signaland image compression.
7/22/2019 DCT Presentation1
3/39
Abstract Contd
The DCT Core uses Direct Implementation of algorithm (i.e.
as per the standard equation of DCT)
The DCT Core is implemented using Verilog HDL.
The output of core is then verified from the output of
MATLAB.
7/22/2019 DCT Presentation1
4/39
Discrete Cosine Transform
A discrete cosine transform (DCT) expresses a sequence of
finitely many data points in terms of a sum of cosine functions
oscillating at different frequencies.
The Discrete Cosine Transform (DCT) of a one dimensional
sequence of length N is defined as
where k = 0,1,2,..., N 1
The original signal vector x(n) can be reconstructed back from
the DCT coefficients Y[k] by using the Inverse DCT (IDCT)
operation and it can be defined as
1
0
}2/)12cos{()()()(N
n
NknnxkczY
7/22/2019 DCT Presentation1
5/39
Discrete Cosine Transform
where n = 0,1,2,..., N 1 In both the above equations c[k] is defined as
1
0
}2/)12cos{][][)(N
k
NknkYkcnx
1
0
}2/)12cos{][][)(N
k
NknkYkcnx
1
0
}2/)12cos{][][)(N
k
NknkYkcnx
1
0
}2/)12cos{][][)(N
k
NknkYkcnx
1.........2,1,)/2(
0,)/1(
][ NkN
kN
kc
7/22/2019 DCT Presentation1
6/39
DCT vs DFT
If we wish to find the frequency spectrum of a function thatwe have sampled, the continuous Fourier Transform is not souseful. For that, We need a discrete version like DFT.
When the input data contains only real numbers, the sinecomponent of the DFT is 0, then DFT becomes a DiscreteCosine Transform(DCT).
The Discrete Fourier Transform (DFT) and Discrete CosineTransform (DCT) perform similar functions i.e. they bothdecompose a finite-length discrete-time vector into a sum ofscaled-and-shifted basis functions.
7/22/2019 DCT Presentation1
7/39
DCT vs DFT
The difference between both the transforms is the type of basisfunction used; the DFT uses a set of complex exponentialfunctions, while the DCT uses only (real-valued) cosine
functions.
The DCT & DFT are used because Some tasks are much easierto handle in the frequency domain that in the time domain. Forexample: graphicequalizer. We want to boost the bass:
1. Transform to frequency domain.2. Increase the magnitude of low frequency components.
3. Transform back to time domain.
7/22/2019 DCT Presentation1
8/39
Application of DCT
For audio:
Human ear has different dynamic range for differentfrequencies.
It transform from time domain to frequency domain, and
quantize different frequencies differently.
For images and video:
Human eye is less sensitive to fine detail.
It transform from spatial domain to frequency domain, and
quantize high frequencies more coarsely (or not at all)
Has the effect of slightly blurring the image - may not be
perceptable if done right.
7/22/2019 DCT Presentation1
9/39
Literature Review
Distributed Arithmetic [6,9,10]:-
DCT have been implemented using distributed mechanism.
Most often encountered form of its computation is sum of
product. The product of a pair of matrices can be realized using the DA
when one of the vectors is constant.
(1)
7/22/2019 DCT Presentation1
10/39
Distributed Arithmetic Contd..
Where, Ak is constant
Xk is the input data.
If A1,.AL are all N bits signed 2s complement
binary number, (1) can also be represented as:
(2)
(3)
7/22/2019 DCT Presentation1
11/39
Distributed Arithmetic Contd..
In eq. (2), matrix A is a adder matrix
(4)
but it only consists of two elements: 0and 1. It is easy to
find that Y0, Y1, , YN-1 are the sum of some data fromX1,X2,XL, so the computation of Y only contains two
operations: addition and shift.
7/22/2019 DCT Presentation1
12/39
Distributed Arithmetic Contd..
DA uses a look-up table and accumulators instead ofmultipliers.
Each single bit from each single value of the two multipliedvariables contribute only once to the sum. Because {0, 1} are
the values as discussed earlier, can be restricted to 2^n,therefore they can be pre-calculated and saved in a look-uptable to be retrieved later.
The construction of look up table used by the Distributed
Arithmetic method take large memory size
. The shift operation is implemented by wirings, which costslittle delay and hardware resources.
7/22/2019 DCT Presentation1
13/39
Fast DCT Algorithms
To overcome the extensive computation of the DCT Chen et al
[5, 15], proposed fast DCT (FDCT).
Chen used the Fast Fourier Transform (FFT) method topropose more efficient algorithm involving only real operation
for computing what he called the Fast Discrete Cosine
Transform algorithm (FDCT).
Let, The 8-point DCT can be written as a matrix transform.
Y=AX
7/22/2019 DCT Presentation1
14/39
Fast DCT Algorithms Contd
Where,
The Multiplier coefficients are given by
7/22/2019 DCT Presentation1
15/39
Fast DCT Algorithms Contd
Where,
7/22/2019 DCT Presentation1
16/39
Fast DCT Algorithms Contd
Due to the Symmetry of the (8 X 8) multiplication matrix, it
can be replaced by two (4x4) x (4x4) matrices, which can be
computed in parallel, as can the sums and differences forming
the vectors below
7/22/2019 DCT Presentation1
17/39
Fast DCT Algorithms Contd
The matrices operation of the design was implemented in
terms of a plot for the signal-flow.
7/22/2019 DCT Presentation1
18/39
Fast DCT Algorithms Contd
The Chen fast DCT signal-flow requires total of 18
multiplications.
Lee Algorithm[8, 15]: -
Lee algorithm [8] is based on the matrix representation.
In fact, the first step is nothing than a butterfly decompositionyielding to an even and an odd part.
7/22/2019 DCT Presentation1
19/39
Fast DCT Algorithms Contd
The even part will be just a 1-D DCT of order N/2. While, theodd part will be computed through a matrix multiplication.
For 1-D DCT of order N=8, the number of operationnecessarily for these algorithm will be 32 multiplications and32 additions.
7/22/2019 DCT Presentation1
20/39
Project Design Flow
HDL Flow Matlab Flow
Taking input matrix of size 8 x 1
Apply DCT algorithm designed through HDL on it
Check simulation results for DCT output
A
Taking input matrix of size 8 x 1
Compute DCT through dct command on MATLAB
Store result of above calculated DCT
B
7/22/2019 DCT Presentation1
21/39
Project Design Flow Contd
Comparison b/w HDL & Matlab Results
Compare results of A and B
BA
7/22/2019 DCT Presentation1
22/39
Design of DCT Controller
The equation stated below is the standard equation for the 1d -
DCT
(1)
Where,
Y(u) = Coefficient value in transform domain
X(i) = Coefficient value in pixel domain.u = Co-ordinates in transform domain
i = Spatial co-ordinates in pixel domain
7
0
]16/)12cos[()(}2/)({)(
i
uiixucuy
02/1 u
02/1 u
02/1 u
7/22/2019 DCT Presentation1
23/39
Design of DCT Controller Contd...
(2)
Considering the eq. (1) following eight equations are inferred
Y(0) = [X(0) + X(1) + X(2) + X(3) + X(4) + X(5) + X(6) + X(7)] P (3.1)
Y(1) = [X(0) - X(7)]A + [X(1) - X(6)]B + [X(2) - X(5)]C + [X(3) - X(4)] D
(3.2)
Y(2) = [X(0) - X(3)X(4) + X(7)]M + [X(1) - X(2) - X(5) + X(6)] N (3.3)
Y(3) = [X(0) - X(7)]B + [X(1) - X(6)] (-D) + [X(2) - X(5)] (-A) + [X(3) -
X(4)] (-C) (3.4)
0,1
0,2/1)(
u
uuc
0,1
0,2/1)(
u
uuc
0,1
0,2/1)(
u
uuc
0,10,2/1)(
u
uuc
7/22/2019 DCT Presentation1
24/39
Design of DCT Controller Contd...
Y(4) = [X(0) - X(1) - X(2) + X(3) + X(4) - X(5) - X(6) + X(7)] P (3.5)
Y(5) = [X(0) - X(7)]C + [X(1) - X(6)](-A) + [X(2) - X(5)]D + [X(3) - X(4)]
B (3.6)
Y(6) = [X(0) - X(3)X(4) + X(7)]N + [X(1) - X(2) - X(5) + X(6)] (-M)
(3.7)
Y(7) = [X(0) - X(7)]D + [X(1) - X(6)] (-C) + [X(2) - X(5)] B + [X(3) -
X(4)] (-A) (3.8)
7/22/2019 DCT Presentation1
25/39
Design of DCT Controller Contd...
Where,
M = 0.5 * Cos(pi/8) = 0.5 * Cos (2*pi/16)
N = 0.5 * Cos(3*pi/8) = 0.5 * Cos (6*pi/16)
P = 0.5 * Cos(pi/4) = 0.5 * Cos (4*pi/16)A = 0.5 * Cos(pi/16)
B = 0.5 * Cos(3*pi/16)
C = 0.5 * Cos(5*pi/16)
D = 0.5 * Cos(7*pi/16)
7/22/2019 DCT Presentation1
26/39
Generalized Equation Implementer
BlockInput
(16 bit)
Xin0 Xin7 Xin1 Xin6 Xin2 Xin5 Xin3 Xin4
Add/Sub Block Add/Sub Block Add/Sub Block Add/Sub Block
Multiplier Multiplier Multiplier Multiplier
Adder
Output
7/22/2019 DCT Presentation1
27/39
Add/Sub Block
Adder
Substractor
MuxInput 1
Input 2Output
Sel
7/22/2019 DCT Presentation1
28/39
Multiplier Block
Output of Add/Sub Block
Multiplier
Cos Coefficient
Output
7/22/2019 DCT Presentation1
29/39
DCT Controller Interface
7/22/2019 DCT Presentation1
30/39
RTL View
7/22/2019 DCT Presentation1
31/39
Results
The DCT core is implemented in HDL. It is synthesized and simulated
using Xilinx ISE 9.2i on Spartan 3 (xc3s4000-5fg900)
Synthesis Report:-
S.No. Logic Utilization Used Available Utilization1 Number of Slices 685 27648 2%
2 Number of Slice Flip Flops 788 55296 1%
3 Number of 4 input LUTs 1238 55296 2%
4 Number of bonded IOBs 259 633 40%
7/22/2019 DCT Presentation1
32/39
Advanced HDL Synthesis Report
S.No. Component Used
1 16x16-bit Multiplier 22
2 16-bit Adder 13
3 16-bit Subtractor 15
4 16-bit Register 50
7/22/2019 DCT Presentation1
33/39
Matlab Results
7/22/2019 DCT Presentation1
34/39
VerificationOpen DCT Core Code with Xilinx ISE 9.2i
Go to the process window and double click on synthesis Button
After Successful Synthesis. Create Test Bench
Apply Stimulus to Test Bench
Select Behavioral Simulation from source window
From Process Window run Xilinx ISE Simulator
Simulation will start & generate output
7/22/2019 DCT Presentation1
35/39
Simulation Results
7/22/2019 DCT Presentation1
36/39
Comparison b/w HDL Simulation &
Matlab Results
7/22/2019 DCT Presentation1
37/39
Conclusion
The 1DDCT algorithm code was written in the Verilog HDL. It is then,
synthesized and simulated successfully through Xilinx ISE 9.2i.
Eight 8 x 1 input samples are taken, and DCT is calculated through DCT
core designed in Verilog HDL and the same inputs are used for calculatingDCT using MATLAB.
The latency of implemented core is five clock cycles and through put is one
clock cycle.
Comparison is done between output of Matlab and HDL Simulation Result
which shows that, the accuracy of implemented core is 93.75%.
7/22/2019 DCT Presentation1
38/39
References
1. R.C. Gonzalez, R.E. Woods, DigitalImage Processing,Pearson Education 3rdEdition 2008
2. David Salomon, DataCompression,The Complete Reference, 2nd Edition Springer-Verlag 1998
3. Y. C. Lim, J. B. Evans, and B. Liu, Decomposition of binary integers into signed power-of-two terms,IEEE Trans. Circuits Syst., vol. 38, no. 6, pp. 667-672,1991
4. R. J. Clark, Relationbetween the Karhunen-Lobe and cosine transform,IEEE Proc., vol. 128, pt. F, no.6, pp. 359-360, Nov.1981.
5. W. Chen, C.H.Smith, and S.C.Fralick,A fast computational algorithm for the Discrete Cosine transformIEEE,Trans.Commun.COMM-25, pp.100 1009, Sep.1977.
6. Peng Chungan, Cao Xixin, Yu Dunshan, Zhang Xing, A250MHz optimized distributed architecture of 2D8x8 DCT,7th International Conference on ASIC, pp. 189192, Oct. 2007
7. Bian Li Jian, Zeng Xuan, Tong Jia Rong, Liu Yue, AnEfficient VLSI Architecture for 2D-DCT UsingDirect Method
8. B. G. Lee, Anew algorithm to compute the discrete cosine transform,IEEE Trans. Acoust., speech, SignalProcessing, vol. ASSP-32, pp.12431245, Dec.1984.
7/22/2019 DCT Presentation1
39/39
References Contd
9. Vijay Kumar Sharma, K. K. Mahapatra and Umesh C. Pati, AnEfficient Distributed Arithmetic based VLSIArchitecture for DCT,National Institute of Technology, Rourkela, India-769008
10. Sungwook Yu DCT implementation with Distributed Arithmetic. IEEE Transactions on ComputersVolume 50, Issue 9 September 2001 Pages: 985991, year of Publication: 2001 ISSN: 0018-9340.
11. Trevor W. Fox-2002 Rapid Prototyping of Field Programmable Gate Array- Based Discrete CosineTransform ApproximationsEURASIP JASP 2003, 543- 554.
12. Anthony Edward Nelson Implementationof image processing algorithms on FPGA hardware thesis ofMaster of Science in Electrical Engineering, May 2000 Graduate School of Vanderbilt University.
13. Latha Pillai.video compression using DCT,XILINX Application Note : Virtex- 11 series.XAPP610,v1.4,April 10,2008.
14. K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages and Applications, Academic
Press, Inc., 1990
15. Hassan EL-Banna, Alaa A. EL-Fattah, Waleed Fakhr, AnEfficient Implementation of the 1D DCT usingFPGA Technology,ICM 2003, Dec. 9-1 1, Cairo, Egypt
Recommended