MEMS Based MicroResonator Design & Simulation Based on Comb-Drive Structure

Mr. Prashant Gupta
[email protected]
Ideal Institute of Technology, Ghaziabad
Abstract:- Resonators serve as essential components in radio-frequency (RF) electronics, forming the backbone of filters and tuned amplifiers. However, traditional solid-state or mechanical implementations of resonators and filters tend to be bulky and power-hungry, limiting the versatility of communications, guidance, and avionics systems. Micro-Electro-Mechanical Systems (MEMS) are promising replacements for traditional RF circuit components. In this paper we discuss the MEMS resonator, one of the most versatile components in RF circuits, based on a promising architecture known as the comb-drive structure.

Introduction: A resonator is a device or system that exhibits resonance or resonant behavior; that is, it naturally oscillates at some frequencies, called its resonant frequencies, with greater amplitude than at others. The oscillations in a resonator can be either electromagnetic or mechanical (including acoustic). Resonators are used either to generate waves of specific frequencies or to select specific frequencies from a signal. A physical system can have as many resonant frequencies as it has degrees of freedom; each degree of freedom can vibrate as a harmonic oscillator. Systems with one degree of freedom, such as a mass on a spring, pendulums, balance wheels, and LC tuned circuits, have one resonant frequency. Systems with two degrees of freedom, such as coupled pendulums and resonant transformers, can have two resonant frequencies. The vibrations in them travel through the coupled harmonic oscillators in waves, from one oscillator to the next. Resonators can be viewed as being made of millions of coupled moving parts (such as atoms); therefore they can have millions of resonant frequencies, although only a few may be used in practical resonators. The vibrations inside them travel as waves, at an approximately constant velocity, bouncing back and forth between the sides of the resonator. The oppositely moving waves interfere with each other to create a pattern of standing waves in the resonator. If the distance between the sides is L, the length of a round trip is 2L. In order to cause resonance, the phase of a sinusoidal wave after a round trip has to be equal to the initial phase, so the waves reinforce. The condition for resonance in a resonator is therefore that the round-trip distance, 2L, be equal to an integral number N of wavelengths λ of the wave:

2L = Nλ,  N = 1, 2, 3, …

If the velocity of the wave is v, the frequency is f = v/λ, so the resonance frequencies are:

f = Nv / 2L

So the resonant frequencies of resonators, called normal modes, are equally spaced multiples (harmonics) of a lowest frequency called the fundamental frequency. The above analysis assumes the medium inside the resonator is homogeneous, so the waves travel at a constant speed, and that the shape of the resonator is rectilinear. If the resonator is inhomogeneous or has a non-rectilinear shape, like a circular drumhead or a cylindrical microwave cavity, the resonant frequencies may not occur at equally spaced multiples of the fundamental frequency; they are then called overtones instead of harmonics. There may be several such series of resonant frequencies in a single resonator, corresponding to different modes of vibration.
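Since the normal modes of a uniform resonator are integer multiples of the fundamental, f_n = n·v/2L, they are easy to tabulate. The following is a minimal sketch; the wave speed and cavity length are illustrative values, not taken from this paper:

```python
# Normal-mode frequencies of a 1-D resonator of length L_cav:
# f_n = n * v / (2 * L_cav). Values below are illustrative only.
def mode_frequencies(v, L_cav, n_modes):
    """Return the first n_modes resonant frequencies (Hz) for wave
    speed v (m/s) and resonator length L_cav (m)."""
    return [n * v / (2.0 * L_cav) for n in range(1, n_modes + 1)]

# Example: a 0.5 m air column (v ~ 343 m/s); harmonics are equally
# spaced multiples of the fundamental.
freqs = mode_frequencies(v=343.0, L_cav=0.5, n_modes=3)
```
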
MEMS Resonators:- Mechanical resonators are highly sensitive probes for physical or chemical parameters which alter their potential or kinetic energy [1, 2]. Silicon resonant microsensors for measurement of pressure, acceleration, and vapor concentration have been demonstrated. Recently, polysilicon micromechanical structures have been resonated electrostatically parallel to the plane of the substrate by means of one or more interdigitated capacitors (electrostatic combs). Some advantages of this approach are (1) less damping on the structure, leading to higher quality factors, (2) linearity of the electrostatic comb drive, and (3) flexibility in the design of the suspension for the resonator. For example, folded-beam suspensions can be fabricated without increased process complexity, which is attractive for releasing residual strain and for achieving large-amplitude vibrations. There are different types of resonators; here we focus only on vibrating resonators:
• Lateral movement - parallel to the substrate - e.g., the folded-beam comb structure.
• Vertical movement - perpendicular to the substrate - e.g., the clamped-clamped beam (c-c beam) and the free-free beam (f-f beam).

Example of a simple resonator: mass and spring. This resonator is used by many physicists as the elemental simple mechanical resonator, to explain the properties of more complex resonances and resonators. The governing homogeneous differential equation is

m·d²y/dt² + R·dy/dt + k·y = 0

for vertical displacement y from the equilibrium position, mass m, spring constant k = f/y, and damping coefficient R. The angular resonant frequency is given by

ω₀ = sqrt(k/m)
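The mass-spring resonator above can be evaluated numerically. This sketch assumes the standard second-order model (undamped ω₀ = sqrt(k/m); with damping R the ring-down frequency shifts slightly); the mass and stiffness values are illustrative only:

```python
import math

# Resonant frequencies of the mass-spring-damper model above.
# Undamped: w0 = sqrt(k/m).  With damping coefficient R, the damped
# oscillation frequency is wd = sqrt(k/m - (R/(2m))**2).
def resonant_frequencies(m, k, R=0.0):
    w0 = math.sqrt(k / m)
    wd = math.sqrt(k / m - (R / (2.0 * m)) ** 2)
    return w0, wd

# Illustrative values: 1 ng proof mass, 1 N/m spring, no damping.
w0, wd = resonant_frequencies(m=1e-9, k=1.0)
f0 = w0 / (2.0 * math.pi)  # resonant frequency in Hz
```
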
Folded-Flexure Comb-Drive Microresonator:- In the design of a resonator, the spring constant plays a vital role. Different types of spring designs have been applied in comb-drive actuators: (1) clamped-clamped beams, (2) the crab-leg flexure, and (3) the folded-beam flexure. Among these, the folded-beam structure is the most widely used for microresonator design. The folded-flexure electrostatic comb-drive micromechanical resonator shown in Figure 1 was first introduced by Tang [4, 5, 6]. This device has been well-researched and is commonly used for MEMS process characterization. The microresonator consists of a movable central shuttle mass which is suspended by folded-flexure springs on either side; the other ends of the folded-flexure springs are fixed to the lower layer. The microresonator can be thought of as a spring-mass-damper system, the damping being provided by the air below and above the movable part. By applying a voltage across the fixed and movable comb fingers, an electrostatic force is produced which sets the mass into motion in the x-direction. The microresonator has been used in building filters, oscillators, and resonant positioning systems. Figure 1 shows the overhead view of a µresonator which utilizes interdigitated-comb finger transduction in a typical bias and excitation configuration. The resonator consists of a finger-supporting shuttle mass suspended above the substrate by folded flexures, which are anchored to the substrate at two central points. The shuttle mass is free to move in the direction
indicated, parallel to the plane of the silicon substrate. Folding the suspending beams as shown provides two main advantages: first, post-fabrication residual stress is relieved if all beams expand or contract by the same amount; and second, spring stiffening nonlinearity in the suspension is reduced, since the folding truss is free to move in a direction perpendicular to the resonator motion. The black areas are the places where the polysilicon structure is anchored to the bottom layer.
Fig.1 Layout of the lateral folded-flexure comb-drive microresonator

Modeling the Oscillation Modes of the Microresonator:- The preferred direction of motion of the microresonator is the x-direction. However, the microresonator structure can vibrate in other modes: there are three translational modes along x, y and z, three rotational modes about x, y and z, and oscillation modes due to the movement of the folded-flexure beams and the comb drive. Each oscillation mode is described by a lumped second-order equation of motion. For any generalized displacement ζ, we can write:

m_ζ·d²ζ/dt² + B_ζ·dζ/dt + k_ζ·ζ = F_e,ζ

where F_e,ζ is the external force (in the x-mode this force is generated by the comb drives), m_ζ is the effective mass, B_ζ is the damping coefficient, and k_ζ is the spring constant. The fundamental frequency of the structure can be obtained from Rayleigh's quotient. The fundamental resonance frequency of this mechanical resonator is, again, determined largely by material properties and by geometry, and is given by the expression

f_r = (1/2π)·sqrt[ 2Eh(W/L)³ / (M_p + (1/4)M_t + (12/35)M_b) ]
where M_p is the shuttle mass, M_t is the mass of the folding trusses, M_b is the total mass of the suspending beams, E is the Young's modulus, W and h are the cross-sectional width and thickness, respectively, of the suspending beams, and L is indicated in Fig. 1. The expression for the damping coefficient, modeling Couette flow beneath the structure, Stokes flow above it, and Couette flow between the comb fingers, is approximately

B ≈ µ·[ (A_s + A_t + A_b)(1/d + 1/δ) + A_c/g ]

where µ is the viscosity of air, d is the fixed spacer gap between the ground plane and the bottom surface of the comb fingers, δ is the penetration depth of airflow above the structure, g is the gap between comb fingers, and A_s, A_t, A_b, and A_c are layout areas for the shuttle, truss beams, flexure beams, and comb-finger sidewalls, respectively.
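As a rough numerical check of the fundamental-frequency expression above, the following sketch evaluates f_r for an assumed polysilicon flexure; all material and geometry values are illustrative, not the paper's design values:

```python
import math

# Fundamental frequency of the folded-flexure resonator, following the
# Rayleigh-quotient expression in the text:
#   f_r = (1/2pi) * sqrt(2*E*h*(W/L)**3 / (Mp + Mt/4 + (12/35)*Mb))
def folded_flexure_fr(E, h, W, L, Mp, Mt, Mb):
    k_sys = 2.0 * E * h * (W / L) ** 3            # system spring constant (N/m)
    m_eff = Mp + 0.25 * Mt + (12.0 / 35.0) * Mb   # effective (dynamic) mass (kg)
    return math.sqrt(k_sys / m_eff) / (2.0 * math.pi)

# Illustrative numbers: polysilicon E ~ 150 GPa, 2 um film,
# 2 um wide / 100 um long beams, tens-of-picogram masses.
fr = folded_flexure_fr(E=150e9, h=2e-6, W=2e-6, L=100e-6,
                       Mp=5e-11, Mt=1e-11, Mb=2e-11)
```

With these assumed values the result lands in the tens of kilohertz, the same order as the measured frequencies reported in Table 2.
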
Working Principle:- To bias and excite the device, a dc-bias voltage V_P is applied to the resonator and its underlying ground plane, while an ac excitation voltage is applied to one (or more) drive electrodes. A specific resonance mode may be emphasized by using multiple drive electrodes, placing them at the displacement maxima of the desired mode, and applying properly phased drive signals to the electrodes. To avoid unnecessary notational complexity, however, we focus on the case of fundamental-mode resonance in the present discussion. We also assume that the electrodes are concentrated at the center of the beam and that the beam length is much greater than the electrode lengths. This allows us to neglect beam displacement variations across the lengths of the electrodes due to the beam's mode shape (i.e., we may assume that x(y) ≈ x for y near the center of the beam). A more rigorous analysis which accounts for all of these effects is certainly possible, but obscures the main points. When an ac excitation with frequency close to the fundamental resonance frequency of the µresonator is applied, the µresonator begins to oscillate, creating a time-varying capacitance between the µresonator and the electrodes. Since the dc bias V_Pn = V_P - V_n is effectively applied across the time-varying capacitance at port n, a motional output current arises at port n. For this resonator design, the transducer capacitors consist of overlap capacitance between the interdigitated shuttle and electrode fingers. As the shuttle moves, these capacitors vary linearly with displacement. Thus, ∂C_n/∂x is a constant, given approximately by the expression

∂C_n/∂x = α·N_g·ε₀·h / d
where N_g is the number of finger gaps, h is the film thickness, d is the gap between electrode and resonator fingers, and α is a constant that models additional capacitance due to fringing electric fields; for comb geometries, α = 1.2. Note that, again, ∂C_n/∂x is inversely proportional to the gap distance.

Linear equations for the spring constants are derived using energy methods. A force (or moment) is applied to the free end(s) of the spring in the direction of interest, and the displacement is calculated symbolically (as a function of the design variables and the applied force). In these calculations, different boundary conditions are applied for the different modes of deformation of the spring. When forces (moments) are applied at the end-points of the flexure, the total energy of deformation, U, is calculated as:

U = Σ_i ∫₀^{L_i} [ M_i(ξ)²/(2·E·I_i) + T_i(ξ)²/(2·G·J_i) ] dξ

where L_i is the length of the i'th beam in the flexure, M_i is the bending moment transmitted through beam i, E is the Young's modulus of the material of the beam (polysilicon, in our case), I_i is the moment of inertia of beam i about the relevant axis, T_i is the torsion transmitted through beam i, G is the shear modulus, J_i is the torsion constant of beam i, and ξ is the variable along the length of the beam. The bending moment and the torsion are linear functions of the forces and moments applied to the end-points of the flexure. By Castigliano's second theorem, the displacement of an end-point of the flexure in any direction ζ is given as:

δ_ζ = ∂U/∂F_ζ

where F_ζ is the force applied in that direction at that end-point. Similarly, angular displacements can be related to applied moments. Our aim here is to obtain the displacement in the direction of interest as a function of the applied force in that direction. Applying the boundary conditions, we obtain a set of linear
equations in terms of the applied forces and moments and the unknown displacement. Solving the set of equations yields a linear relationship between the displacement and the applied force in the direction of interest. The constant of proportionality gives the spring constant as a function of the physical dimensions of the flexure. The effect of spring mass on the resonance frequency is incorporated in effective masses for each lateral mode. The effective mass for each mode of interest is calculated by normalizing the total maximum kinetic energy of the spring by the maximum shuttle velocity, V_max:

m_eff = Σ_i (m_i/L_i) ∫₀^{L_i} [v_i(ξ)/V_max]² dξ

where m_i and L_i are the mass and length of the i'th beam in the flexure. Analytic expressions for the velocities, v_i, along the flexure's beams are approximated from static deformation shapes, and are found from the spring-constant derivations.

Design Variables:- Fifteen design variables are identified for the µresonator. The design variables are listed in Table I and shown in Fig. 2. These include 13 geometrical parameters (shown in Fig. 2), the number of fingers in the comb drive, N, and the effective voltage, V, applied to the comb drive.
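The comb-drive transduction described earlier (a displacement-independent ∂C/∂x, hence a drive force F = ½·V²·∂C/∂x that does not depend on shuttle position) can be sketched numerically. The geometry below is illustrative, not the fabricated design:

```python
# Comb-drive transduction sketch: dC/dx = alpha * Ng * eps0 * h / d is
# constant in x, so the electrostatic force is F = 0.5 * V**2 * dC/dx.
EPS0 = 8.854e-12  # permittivity of free space, F/m

def comb_dCdx(Ng, h, d, alpha=1.2):
    """Capacitance gradient (F/m) of an interdigitated comb with Ng
    finger gaps, film thickness h, finger gap d; alpha models fringing."""
    return alpha * Ng * EPS0 * h / d

def comb_force(V, Ng, h, d):
    """Electrostatic drive force (N) for effective voltage V."""
    return 0.5 * V ** 2 * comb_dCdx(Ng, h, d)

# Illustrative values: 24 finger gaps, 2 um film, 2 um gaps, 20 V drive.
F = comb_force(V=20.0, Ng=24, h=2e-6, d=2e-6)
```

Because the force does not depend on x, the drive is linear, which is one of the advantages of the comb structure noted earlier.
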
Fig.2 Dimensions of the microresonator elements: (a) shuttle mass, (b) folded flexure, (c) comb drive with N movable 'rotor' fingers, (d) close-up view of comb fingers.
The displacement as a function of the driving voltage was measured while applying a dc voltage between the rotor (movable set) and the stator (stationary set).
Table 1: Design and style variables for the microresonator. Upper and lower bounds are in units of µm, except for N and V.

Quality Factor (Q):- The quality factor describes how underdamped an oscillator or resonator is; a higher Q indicates a lower rate of energy loss relative to the stored energy. For the x-direction mode of the second-order model above,

Q = sqrt(k·m) / B

where m is the mass, k is the spring constant, and B is the damping coefficient.
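A small numerical sketch of the Q expression above; the parameter values are illustrative, not measured values from this device:

```python
import math

# Quality factor of a second-order resonator: Q = sqrt(k*m) / B.
# Higher Q means lower energy loss per cycle relative to stored energy.
def quality_factor(m, k, B):
    return math.sqrt(k * m) / B

# Illustrative values consistent in order of magnitude with a surface-
# micromachined resonator operating in air.
Q = quality_factor(m=5.9e-11, k=4.8, B=1e-7)
```
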
Simulation Process:- Steps for the IntelliSuite simulator:
1. Design the appropriate mask or masks for your design in IntelliMask.
2. Fabricate the device using IntelliFab and visualize it.
3. Perform different types of analysis (static or frequency) with the help of TEM.
4. Get the results.
Fig.3: MEMS microresonator mask structure using IntelliMask
Fig.4: MEMS microresonator process flow using IntelliFab
Fig.5: MEMS microresonator TEM structure using TEM Analysis
Fig.6: MEMS microresonator pressure distribution
Fig.7: MEMS microresonator charge distribution
*Capacitance Report
Number of conductors: 2
Capacitance matrix (1e-6 nanofarads × 1e-6):
C11 = 9.334000   C12 = -1.037000
C21 = -1.037000  C22 = 2.767000

*Natural Frequency Report (unit: Hz; 6 modes requested)
Mode 1: 23347.1 (natural/resonant frequency)
Mode 2: 39248.8
Mode 3: 40138
Mode 4: 51.6151
Mode 5: 70.8529

Resonator Simulation Results:- With the help of the simulation process we obtain the resonant frequency for different parameter sets. We can also find the displacement, pressure distribution, charge distribution, stress, linear motion, etc. The pressure and charge distributions are shown in the figures above.

Table 2: Calculated and measured resonant frequencies of a set of comb-drive structures

S.No. | No. of Fingers | Finger Length (µm) | Finger Width (µm) | Gap (µm) | Calculated (kHz) | Measured (kHz)
1 | 12 | 20 | 2 | 2 | 23.4 | 22.8
2 | 12 | 30 | 2 | 2 | 22.6 | 22.1
3 | 12 | 40 | 2 | 2 | 21.9 | 22
4 | 12 | 50 | 2 | 2 | 21.3 | 21.2
5 | 12 | 40 | 3 | 2 | 20.4 | 20.3
6 | 12 | 40 | 4 | 2 | 19.1 | 19.1
Conclusion and Future Work:- In this project we designed and simulated a microresonator based on the comb-drive structure introduced by Tang, and calculated its resonance frequency for different geometry parameters. There are two types of constraints in the comb-drive structure (1 - geometric and 2 - functional) which we have not discussed here; they are left for future work. The project can be extended in a number of directions. Manufacturing variations need to be incorporated for accurate synthesis results. Fabrication of the MEMS resonator is also a significant issue which we have not discussed in this work and leave for future work. The spring can also be designed in different styles, which is likewise left for future work. After designing and calculating the resonance frequency for different shapes, we carried out the simulation process and obtained the results shown in the table. From all this work, I would like to conclude the following points. To achieve a high resonance frequency:
• the total spring constant should increase, or the dynamic mass should decrease (difficult, since a given number of fingers is needed for electrostatic actuation);
• k and m depend on material choice, layout, and dimensions;
• the frequency can be increased by using a material with a larger stiffness-to-mass ratio than Si.
Acknowledgements: This research work was carried out at CARE, IIT Delhi under the supervision of Prof. Sudhir Chandra, CARE, IIT Delhi. I am also grateful to my college Director, Dr. G. P. Govil, and my Head of the Department, Mr. N. P. Gupta, for their kind-hearted support and motivation during the research work.

References:
1. S. M. Sze, Semiconductor Sensors, John Wiley & Sons Inc., New York, 1994
2. Ljubisa Ristic, “Sensor Technology and Devices”, Artech House ISBN 0-89006-532-2, 1994
3. G.K. Fedder and T. Mukherjee, "Automated Optimal Synthesis of Microresonators," Proc 9th Intl. Conf on Solid-State Sensors and Actuators (Transducers ’97), Chicago, IL, June 16-19, 1997.
4. W.C. Tang, T.-C. H. Nguyen, M. W. Judy, and R. T. Howe, "Electrostatic Comb Drive of Lateral Polysilicon Resonators," Sensors and Actuators A, 21 (1990) 328-31.
5. X. Zhang and W. C. Tang, "Viscous Air Damping in Laterally Driven Microresonators," Sensors and Materials, v. 7, no. 6, 1995, pp.415-430.
6. W. C. Tang, T.-C. H. Nguyen, and R. T. Howe, "Laterally driven polysilicon resonant microstructures," IEEE Micro Electro Mechanical Systems Workshop, Salt Lake City, UT, USA, Feb. 20-22, 1989, pp. 53-59.
7. C.T.C. Nguyen, MTT-S 1999 (http://www.eecs.umich.edu/~ctnguyen/mtt99.pdf)
8. Andrew Potter, “Fabrication and Modeling of Piezoelectric RF MEMS Resonators”, Department of Physics and Division Engineering – Brown University
9. Roger T. Howe, "Applications of Silicon Micromachining to Resonator Fabrication," 1994 IEEE International Frequency Control Symposium.
10. Clark T. C. Nguyen, "Frequency-Selective MEMS for Miniaturized Communication Devices," 1998 IEEE Aerospace Conference, vol. 1, Snowmass, Colorado.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0103-4
Different Look-Ahead Algorithm for Pipelined Implementation of Recursive Digital Filters
Krishna Raj, Vivekanand Yadav

Abstract—Look-ahead techniques can pipeline IIR digital filters to attain high sampling rates. The existing look-ahead schemes, such as the CLA and SLA schemes, are special cases of the proposed DLA scheme (a new LA scheme) for pipelined implementation of recursive digital filters. DLA can also be used to provide equivalent and stable pipelined implementations with reduced pipeline delay and hardware compared with the existing look-ahead schemes; a comparison between the DLA and SLA schemes is presented.

Index Terms-- Clustered look-ahead (CLA), scattered look-ahead (SLA), look-ahead (LA), and distributed look-ahead (DLA).
I INTRODUCTION

Look-ahead techniques have been highly effective in attaining high sampling rates and computation speed for low-cost VLSI implementation of recursive digital filters [1, 4, 6, 7, 9]. There are several LA approaches. One is referred to as the CLA algorithm or time-domain approach [4, 6, 7], which clusters the past output data to achieve pipelined IIR filters; CLA cannot guarantee stability. The SLA algorithm, or z-domain approach [1, 8], uses equally separated past output data and yields stable pipelined IIR filters with a linear increase in hardware. The distributed look-ahead (DLA) algorithm combines [2] the two schemes above to reach a stable design with reduced pipeline delay and hardware complexity.

An M-stage LA pipelined recursive filter can be obtained by multiplying the numerator and the denominator of the transfer function by an augmented polynomial, D(z). By choosing the proper order and coefficients of D(z), we obtain either the M-stage CLA pipelined filter or the M-stage SLA pipelined filter.
II EXISTING LOOK-AHEAD ALGORITHMS

The transfer function of an Nth-order recursive filter is described by

H(z) = B(z)/A(z) = [ Σ_{i=0}^{L} b_i z^{-i} ] / [ 1 - Σ_{i=1}^{N} a_i z^{-i} ]   (1)

The LA algorithm finds the augmented polynomial D(z), where

D(z) = 1 + Σ_{i≥1} d_i z^{-i}   (2)

Then the pipelined filter is attained by multiplying both the denominator and numerator of H(z) by D(z) [10]:

H_p(z) = [ B(z)·D(z) ] / [ A(z)·D(z) ]   (3)

For different LA algorithms, the pipelined IIR filter transfer functions H_p(z) take different forms. Three existing LA algorithms are summarized here.

(Krishna Raj is with the Deptt. of Electronics Engg., HBTI, Kanpur-208002, India; Email: [email protected]. Vivekanand Yadav, M.Tech., is from the Deptt. of Electronics Engg., HBTI, Kanpur; Email: [email protected].)
A. Clustered Look-Ahead Algorithm

For the M-stage CLA pipelined IIR filter, the denominator of the transfer function can be expressed in the form

A(z)·D(z) = 1 - Σ_{i=0}^{N-1} q_i z^{-(M+i)}   (4)

where M is the number of pipeline stages and the q_i are the coefficients of the pipelined filter. The output data y(n) can be described by the cluster of N past data y(n-M), y(n-M-1), …, y(n-M-N+1) [2]. The augmented polynomial coefficients can be found by iterative calculation as follows:

d_0 = 1,  d_i = Σ_{j=1}^{min(i,N)} a_j d_{i-j},  i = 1, …, M-1   (5)

The M-stage pipelined version of the Nth-order recursive filter is then obtained as

H(z) = [ B(z)·D(z) ] / [ 1 - Σ_{i=0}^{N-1} q_i z^{-(M+i)} ]   (6)

The total multiplication complexity is (2N+M) and the latch complexity is linear in M; the extra delay in producing the output is M [11].
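The CLA construction can be verified numerically: building D(z) by the recursion in (5) and multiplying it into A(z) leaves no feedback taps between z^-1 and z^-(M-1), so the first nonzero feedback tap is at z^-M. The filter coefficients below are arbitrary example values, not from the paper:

```python
# Sketch of clustered look-ahead (CLA) augmentation.
def cla_augment(a, M):
    """a = [a1, ..., aN] from a denominator 1 - sum a_i z^-i.
    Returns (d, p): d are the D(z) coefficients (length M) from the
    recursion d_0 = 1, d_i = sum_j a_j * d_{i-j}; p are the
    coefficients of the product A(z)*D(z)."""
    N = len(a)
    d = [1.0]
    for i in range(1, M):
        d.append(sum(a[j - 1] * d[i - j] for j in range(1, min(i, N) + 1)))
    A = [1.0] + [-ai for ai in a]        # A(z) as a polynomial in z^-1
    p = [0.0] * (len(A) + M - 1)         # convolution A(z) * D(z)
    for i, Ai in enumerate(A):
        for k, dk in enumerate(d):
            p[i + k] += Ai * dk
    return d, p

# Example: 2nd-order filter, 4 pipeline stages.
d, p = cla_augment(a=[1.0, -0.5], M=4)
# p[1] .. p[M-1] are zero: the first feedback tap appears at z^-M.
```
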
B. Scattered Look-Ahead Algorithm

For the M-stage SLA pipelined IIR filter, the denominator of the transfer function is obtained as

A(z)·D(z) = 1 - Σ_{i=1}^{N} a'_i z^{-iM}   (7)
The denominator of the resulting transfer function contains only the N scattered terms z^{-M}, z^{-2M}, …, z^{-NM} [3]. The coefficients can be obtained by solving N(M-1) simultaneous equations, requiring the coefficient of z^{-i} in A(z)·D(z) to vanish for every i that is not a multiple of M, i.e., i = 1, …, M-1, M+1, …, 2M-1, …, NM-1.

An equivalent M-stage pipelined version of the same-order recursive filter can then be obtained as [1, 8]

H(z) = [ B(z)·D(z) ] / [ 1 - Σ_{i=1}^{N} a'_i z^{-iM} ]   (8)

The total multiplication complexity is (NM+N+1) and the latch complexity is quadratic in M. The extra delay in producing the output is (NM-N) [11]. If M is a power of 2, then by using the decomposition technique, the total multiplication and latch complexity can be further reduced [1]. The architecture is shown in Fig. 1(b).
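For intuition, the SLA transform of a first-order section can be worked out in a few lines: multiplying 1/(1 - a·z^-1) by D(z) = 1 + a·z^-1 + … + a^(M-1)·z^-(M-1) leaves the denominator 1 - a^M·z^-M, and since |a|^M < |a| < 1 the pipelined filter remains stable. The pole value a = 0.9 below is an arbitrary example:

```python
# Sketch of scattered look-ahead (SLA) for a first-order section
# H(z) = 1 / (1 - a z^-1).
def sla_first_order(a, M):
    """Return (D_coeffs, pipelined_feedback_tap): the augmenting
    polynomial D(z) coefficients and the single remaining feedback
    coefficient a**M of the transformed denominator 1 - a^M z^-M."""
    D = [a ** i for i in range(M)]
    return D, a ** M

# Pole magnitude shrinks from 0.9 to 0.9**4, so stability is preserved.
D, tap = sla_first_order(a=0.9, M=4)
```
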
C. Distributed Look-Ahead Pipelining

Consider pipelining of the filter transfer function H(z) in (1). Since the pipelined transfer function must equal the original H(z), it can also be obtained by multiplying the original filter by an augmentation polynomial D(z) in both the numerator and the denominator, i.e.,

H_p(z) = [ B(z)·D(z) ] / [ A(z)·D(z) ]

where D(z) = 1 + d_1 z^{-1} + … + d_{M-1} z^{-(M-1)}. Initialize d_1 = a_1, then iterate for i = 2 to (M-1).

According to the distributed look-ahead (DLA) transformation, the M-stage pipelined filter transfer function would have the following general form.

(9)

The coefficients of the non-recursive portion of the pipelined filter are unequally distributed, and this portion can be implemented with a reduced number of multiplications; the recursive portion requires (L+1) multiplications, and the latch complexity is linear in M. The CLA and SLA schemes are special cases of the DLA scheme: an M-stage CLA pipelined version of an Nth-order recursive filter is obtained by an appropriate substitution in (1) [2, 4, 6, 7], and an M-stage SLA pipelined version of the same-order recursive filter can be produced by a corresponding substitution in (1) [1, 2, 8]. DLA can be used for high-speed modular implementation of stable 2-D denominator-separable IIR filters.

Fig: (1) LA pipelined IIR filters: (a) CLA realization, (b) SLA realization, (c) DLA realization
III COMPARATIVE ANALYSIS

Table-1

Pipelining Method | Multiplication Complexity | Delay in First Output | Extra Delay in Output
CLA | L+M+N-1 | M | M
SLA | NM+L | NM | NM-N
DLA | M + … | M |
IV CONCLUSIONS

The denominator order using DLA is less than the order NM obtained with SLA, and the DLA-transformed filter is stable, so the proposed scheme offers considerable hardware savings over SLA. The multiplication and latch complexities are lower than with SLA. The pipeline delay and hardware
Table-2

Method | M=3 SLA | M=3 DLA | M=4 SLA | M=4 DLA | M=6 SLA | M=6 DLA | M=8 SLA | M=8 DLA
No. of MUL/adder | 6 | 5 | 6 | 5 | 8 | 6 | 8 | 7
No. of Latch | 10 | 8 | 14 | 10 | 22 | 14 | 30 | 18
Delay in 1st o/p | 6 | 5 | 8 | 6 | 12 | 8 | 16 | 10
complexity are reduced compared with SLA. From the pole-zero plots below, we see that the filter becomes more stable as the number of stages increases. From the plotted comparisons we conclude that the number of multipliers/adders is smaller in DLA than in SLA, because SLA attains a greater value at each stage than DLA. The number of latches is also smaller in DLA than in SLA, because the values attained in SLA are very large compared with DLA; similarly, the delay in producing the first output is smaller in DLA than in SLA.
Examples

H (z) =

Fig: 5 (a) DLA (b) SLA (using Table 1 and Table 2)
V REFERENCES

[1] K. K. Parhi and D. G. Messerschmitt, "Pipeline interleaving and parallelism in recursive digital filters-Part I: Pipelining using scattered look-ahead and decomposition," IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 1099-1117, July 1989.
[2] A. K. Shaw and M. Imtiaz, "A general Look-Ahead algorithm for pipelining IIR filters," in Proc. IEEE ISCAS, 1996, pp. 237-240.
[3] Y. C. Lim, "A new approach for deriving scattered coefficients of pipelined IIR filters," IEEE Trans. Signal Processing, vol. 43, pp. 2405-2406, 1995.
[4] H. H. Loomis and B. Sinha, "High-speed Recursive Digital Filter Realization," Circuits, Systems and Signal Processing, vol. 3, pp. 267-294, Sept. 1984.
[5] A. P. Chand, "Low Power CMOS Digital Design," IEEE J. of Solid-State Circuits, vol. 27, pp. 473-484, Apr. 1992.
[6] P. M. Kogge, The Architecture of Pipelined Computers, New York: Hemisphere Publishing Corporation, 1981.
[7] Y. C. Lim and B. Liu, "Pipelined Recursive Filter with Minimum Order Augmentation," IEEE Transactions on Signal Processing, vol. 40, no. 7, pp. 1643-1651, July 1992.
[8] M. A. Soderstrand, K. Chopper and B. Sinha, "Comparison of three new techniques for pipelining IIR digital filters," 23rd ASILOMAR Conference on Signals, Systems & Computers, Pacific Grove, CA, pp. 439-443, Nov. 1984.
[9] H. B. Voelcker and E. E. Hartquist, "Digital Filtering via Block Recursion," IEEE Trans. Audio Electroacoust., vol. AU-18, pp. 169-176, June 1970.
[10] Yen-Liang Chen, Chun-Yu Chen, Kai-Yuan Jheng and An-Yen (Andy) Wu, "A Universal Look-Ahead Algorithm for Pipelining IIR Filters," IEEE Trans., 2008.
[11] A. K. Shaw and M. Imtiaz, "New Look-Ahead Algorithm for Pipelined Implementation of Recursive Digital Filters," in Proc. IEEE ISCAS, 1996, pp. 3229-323.
Fig: (2) Pole-zero plots for CLA: (a) M=3, (b) M=4, (c) M=5, (d) M=6 [only (d) stable]
Fig: (3) Pole-zero plots for SLA: (a) M=3, (b) M=4 [both stable]
Fig: (4) Pole-zero plots for DLA: (a) M=3, (b) M=4 [both stable]
Study of MC-EZBC and H.264 Video Codec

Agha Asim Husain(1) and Agha Imran Husain(2)
(1) Deptt of Electronics & Comm. Engg, ITS Engg College, 201301, India
(2) Deptt of Computer Science & Engg, MRCE, 121004, India
Email: [email protected], [email protected]

Abstract: This paper proposes a new aspect of comparing two video codecs on a rate-distortion basis. Scalable coding provides a straightforward solution for video coding that can serve a broad range of applications without the need for transcoding. Even though the latest international video-coding standards do not provide fully scalable methods, H.264 provides the best rate-distortion performance. We therefore compare the rate-distortion performance of H.264 against the Motion Compensated Embedded Zero Block Context (MC-EZBC) coder, which is fully scalable.

Keywords—MC-EZBC, ME/MC sub-pixel accuracy, temporal-level subband coding, YSNR.
I. INTRODUCTION

THE MODERN VIDEO compression coding technologies have improved significantly over the last few years and have enabled broadcasting of digital video signals over various networks [1]. Motion-compensated wavelet-based video coding has also emerged as an important research topic because of its ability to provide better quality. MC-EZBC [2] [3] is one of the codecs that encodes the motion information in a non-scalable manner, which results in reduced coding-efficiency performance at low bit rates. H.264 [4] is a non-scalable coding technique that provides good-quality video at substantially lower bit rates than previous standards such as MPEG-2, H.263, or MPEG-4 Part 2, without increasing the complexity of design and cost.

In this paper we perform an analysis of the joint region of applicability between the MC-EZBC and H.264 video codecs. In MC-EZBC, by using three and four levels of temporal decomposition of the input video sequence, thereby obtaining GOP structures of 8 and 16 frames, and by studying the effect of sub-pixel-accurate motion estimation and compensation, a good comparison with H.264 is achieved in terms of coding efficiency [5].

The outline of the paper is as follows. After introducing the examined compression schemes in Section II, an overview of the applied methodology is provided in Section III. The obtained results are described in Section IV, while the conclusions are drawn in Section V.
II. VIDEO CODEC OVERVIEW

The two video codecs that were used in the tests are summed up in this section. Due to space constraints, the reader is referred to the references for further information on these codecs. The first one is a scalable wavelet-based video codec developed by J. Woods et al. (motion-compensated embedded zero block coding, MC-EZBC) [6] [7]. The second video codec is the Ad Hoc Model 2.0 (AHM 2.0) implementation of the H.264 standard [4] [8], which extends the JM 6.1 implementation [9] with a rate-control algorithm [10].
III. MATERIALS AND METHODS
A. Encoding Process
This section describes how the two codecs were configured and used in order to obtain the bit streams necessary for performing the various measurements.

TABLE I: Sequences Used in Our Experiment

Name | No. of frames | Abbreviation
Akiyo | 300 | AK
Foreman | 300 | FO
Hall | 300 | HA

As input, three progressive video sequences were used in raw Y Cb Cr 4:2:0 format. These were downloaded from the Hannover FTP server. An overview of the sequences is given in Table I. The resolution used is the Common Intermediate Format (CIF, 352 x 288), thus resulting in 3 input video sequences. These sequences were encoded using constant-bit-rate coding (CBR). Ten different target bit rates were used, covering both very low and very high bit rates: 100, 200, 300, …, 1000 kbps. At each bit rate, encoding was performed at 30 frames per second. The detailed settings for the different encoding parameters can be found in Table II and Table III.

The code of MC-EZBC was downloaded from the MPEG CVS server. Each input video sequence was encoded once and then pulled several times in order to get decodable bit streams for all target bit rates. The H.264 bitstreams conform to the Baseline and Main Profiles. The GOP structure is IBBBP and the GOP length is 16.
TABLE II
Parameter Settings for the MC-EZBC Compressor

Parameter        Value (CIF)                    Comment
-inname          akiyo.yuv                      Name of the input file containing a 4:2:0 sequence
-statname        akiyo_tpyrlev3_cif_mv0.stat    Name of the output file containing statistical information generated during encoding
-start           0                              Index number of the first frame (0 means first frame in file)
-last            299                            Index number of the last frame
-size            352 288 176 144                Size of each input frame: pixel width and height of the luminance component, then pixel width and height of the chrominance component
-frame rate      30                             Number of input frames per second
-tPyrLev         3                              Levels of temporal subband decomposition
-searchrange     16                             Maximum search range (in pixels) in the first temporal decomposition level; the search range is doubled with each decomposition level
-maxsearchrange  64                             Upper limit for the search range
TABLE III
Parameter Settings for the H.264 AHM 2.0 Encoder

Parameter                  Value (CIF)
Input File                 "../Akiyo300_cif.yuv"
Frames To Be Encoded       300
Source Width               352
Source Height              288
Trace File                 "trace_enc.txt"
Recon File                 "trace_rec.yuv"
Output File                "test.264"
Search Range               16
Number Reference Frames    1
Restrict Search Range      2
RD Optimization            1
Context Init Method        1
Rate Control Enable        1
Rate Control Type          0
Bit rate                   100 Kbps
B. Quality Measurement
The PSNR-Y is calculated as defined in [11]. In order to obtain a PSNR value for an entire sequence, the average of the PSNR-Y values of the individual frames is calculated. This is not the only way to obtain a value for an entire sequence: another method could be, for instance, to take the minimum of the individual PSNR-Y values (because a video sequence may be evaluated based on its worst part). PSNR is based on a distance between two images [derived from the mean square error (MSE)] and does not take into account any property of the human visual system (HVS).
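The PSNR computation and the two pooling strategies described above can be sketched in Python (an illustrative model only; the function and variable names are assumptions for the sketch, and real luma planes would be 2-D arrays rather than flat lists):

```python
import math

def psnr_y(ref, dist, peak=255.0):
    """PSNR between two equal-length sample lists: 10*log10(peak^2 / MSE)."""
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def sequence_psnr_avg(ref_frames, dist_frames):
    """Pooling used in this paper: average of the per-frame PSNR-Y values."""
    vals = [psnr_y(r, d) for r, d in zip(ref_frames, dist_frames)]
    return sum(vals) / len(vals)

def sequence_psnr_min(ref_frames, dist_frames):
    """Alternative pooling: the worst (minimum) per-frame PSNR-Y."""
    return min(psnr_y(r, d) for r, d in zip(ref_frames, dist_frames))
```

The average pooling rewards overall fidelity, while the minimum pooling penalizes a single badly coded frame; neither models the HVS, as noted above.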
IV. EXPERIMENTAL RESULTS
In the experiment, the performance of the codecs is evaluated on a rate-distortion basis. Due to the size of the experiments and space constraints, not all results can be presented; a subset of the results is given in Table IV and Table V.
The coding efficiency of MC-EZBC is compared with that of H.264 for different sequences at different bit rates. MC-EZBC is a fully scalable coding architecture which utilizes MCTF and wavelet filtering; the software available for download at the website of CIPR, RPI [7] is used for testing the video material. H.264, on the other hand, has a non-scalable coding structure. All tests were performed on a Linux-based personal computer (AMD Turion 64 X2 processor, 1.9 GHz, 1 GB RAM) running Ubuntu 9.04 with no other software running in the background.
The measurement results of both codecs provide an assessment of the coding efficiency of current wavelet-based codecs compared to state-of-the-art single-layered codecs. A first general remark is that, for certain bit rates, there are no measurement points for MC-EZBC: MC-EZBC is not able to encode that particular video sequence at such low target bit rates. At low bit rates, a codec may also decide to skip some frames.
TABLE IV
Average Coding Gain of MC-EZBC and H.264 between 500-1000 Kbps

Video Codec   Foreman (YSNR, dB)
MC-EZBC       37.90
H.264         38.06
For video sequences with a higher amount of movement (FO), the results indicate that, on average, H.264 JM 6.0 performs significantly better than MC-EZBC in terms of PSNR-Y at almost all bit rates. It is also observed that H.264 performs well throughout the bit-rate range for high-complexity content.
TABLE V
Subset of Quality Measurements for the CIF Foreman Sequence

Bit Rate (Kbps)   MC-EZBC   H.264
100               27.86     30.33
400               34.88     35.73
1000              39.12     39.30
V. CONCLUSION
In this paper, an overview was given of the rate-distortion performance of two state-of-the-art video codec technologies in terms of YSNR. From the above results it is clear that the tools incorporated in the H.264 standard outperform MC-EZBC, although at around 1000 Kbps the performance of MC-EZBC is comparable with that of H.264 for high-complexity sequences.
REFERENCES
[1] M. Ghanbari, Standard Codecs: Image Compression to Advanced Video Coding, IEE Telecommunications Series, 2003.
[2] S. S. Tsai, Motion Information Scalability for Interframe Wavelet Video Coding, MS Thesis, National Chiao Tung University, Hsinchu, Taiwan, R.O.C., Jun. 2003.
[4] J. W. Woods and P. S. Chen, "Improved MC-EZBC with Quarter-pixel Motion Vectors," ISO/IEC JTC1/SC29/WG11, doc. no. m8366, Fairfax, May 2002.
[5] T. Wiegand, G. Sullivan, and A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Trans. on CSVT, vol. 13, pp. 560-576, July 2003.
[6] I. E. G. Richardson, H.264 and MPEG-4 Video Compression, Hoboken, NJ: Wiley, 2003.
[7] S.-T. Hsiang and J. W. Woods, "Embedded Video Coding Using Invertible Motion Compensated 3-D Subband/Wavelet Filter Bank," Signal Process.: Image Communication, vol. 16, pp. 705-724, May 2001.
[8] MC-EZBC Software: www.cipr.rpi.edu/~golwea/mc_ezbc.htm
[8] T. Wiegand, H. Schwarz, A. Joch, F. Kossentini, and G. J. Sullivan, "Rate-Constrained Coder Control and Comparison of Video Coding Standards," IEEE Trans. Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 688-703, July 2003.
[9] H.264/AVC Reference Software [Online]. Available: http://iphome.hhi.de/suehring/tml/download/
[10] Proposed Draft Description of Rate Control on JVT Standard, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16/Q.6, JVT document JVT-F086, Dec. 2002.
[11] P. Chen, Fully Scalable Subband/Wavelet Coding, PhD Thesis, Rensselaer Polytechnic Institute, Troy, New York, May 2003.
Abstract: Digital filtering techniques are commonly implemented on general-purpose digital signal processing chips for audio applications, and on special-purpose ASICs for higher bit rates. This paper describes the implementation of IIR filter algorithms on field programmable gate arrays (FPGAs). An IIR filter design shows a significant reduction in the computational complexity required to achieve a given frequency response compared to an FIR filter for the same response. An FPGA-based implementation offers higher sampling rates than those available from traditional DSP chips, and lower cost along with greater design flexibility in comparison to an ASIC. It follows a pipelined architecture, which gives the advantages of parallel processing. We have observed and compared the filtering characteristics of an IIR filter in direct form-2 realization using MATLAB by altering the bit length and also the filter order, and we have implemented the digital filter on a Xilinx Spartan 3E kit using VHDL. Because FPGA architectures are in-system programmable, the configuration of the device may be changed to implement different functionality as required. Our work illustrates that the FPGA approach is both flexible and superior to traditional approaches.
Keywords: ASIC, FPGA, IIR, FIR, VHDL, Pipeline
Architecture, Xilinx Spartan 3E
I. INTRODUCTION
A filter is used to modify an input signal in order to facilitate further processing. A digital filter works on a digital input (a sequence of numbers, resulting from sampling and quantizing an analog signal) and produces a digital output. According to Dr. U. Meyer-Baese [1], "the most common digital filter is the Linear Time-Invariant (LTI) filter". Designing an LTI filter involves arriving at the filter coefficients which, in turn, represent the impulse response of the filter. These coefficients, in linear convolution with the input sequence, produce the desired output [2]:

y[n] = Σk h[k]·x[n−k]

The most common implementations of digital filtering algorithms are digital signal processing chips for audio applications and application-specific integrated circuits (ASICs) for higher rates.
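As a concrete illustration, the linear convolution of a coefficient (impulse response) sequence with an input sequence can be modeled in a few lines (a plain software sketch, not the hardware datapath discussed later):

```python
def linear_convolution(h, x):
    """Linear convolution y[n] = sum_k h[k] * x[n-k] of the filter
    impulse response h with the input sequence x."""
    y = [0.0] * (len(h) + len(x) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):       # only valid input samples contribute
                y[n] += h[k] * x[n - k]
    return y
```

For example, convolving h = [1, 1] with x = [1, 2, 3] yields the running pairwise sums of the input.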
This paper describes the implementation of an IIR digital filtering algorithm on field programmable gate arrays (FPGAs). Recent advancements in FPGA technology have enabled these devices to be applied to a variety of applications traditionally reserved for ASICs. FPGAs are well suited for datapath designs, such as those encountered in digital filtering applications. The advantages of the FPGA approach to digital filter implementation include higher sampling rates than those available from traditional DSP chips [2], lower cost than an ASIC for moderate-volume applications, and more software flexibility than the alternative approaches. In particular, multiple multiply-accumulate (MAC) units may be implemented on a single FPGA, which provides performance comparable to general-purpose architectures that have a single MAC unit. In comparison to an FIR filter [3], an IIR filter uses fewer MAC units to achieve the same frequency response, resulting in a smaller memory requirement and lower computational complexity. The configuration of the FPGA device may be changed to implement alternate filtering operations, such as lattice filters and gradient-based adaptive filters, or entirely different functions, simply by altering the software. In our project we have implemented a digital IIR filter using an FPGA. IIR systems have an impulse response that is non-zero over an infinite length of time, in contrast to finite impulse response (FIR) filters [4], which have fixed-duration impulse responses. To obtain a similar frequency response, an IIR filter requires a lower order than an FIR filter. The IIR filter is one of the digital filters most used in audio signal processing; one good application of IIR filter technology is the generation and recovery of dual-tone multi-frequency (DTMF) signals used by Touch-Tone telephones.
The rest of the paper is organized as follows: Section II
describes related works and Section III deals with proposed
architecture. Our scheme is evaluated by results obtained from
extensive simulation in Section IV. Finally, we conclude in
Section V.
FPGA BASED IMPLEMENTATION OF IIR FILTERS

Anup Saha, Saikat Karak, Surajit Kangsabanik, and Joyita RoyChowdhury
Department of Electronics and Communication Engineering
4th Year, MCKV Institute of Engineering, [email protected]
II. RELATED WORKS
Most early research on digital filter implementation was influenced by customized VLSI chips. The architecture of these filters is largely determined by the target application. Typical DSP chips such as Texas Instruments' TMS320, Freescale's MSC81xx, Motorola's 56000, and Analog Devices' ADSP-2100 family efficiently perform filtering operations in the audio range. For higher frequencies, CMOS and Bi-CMOS technologies are used. There are, however, drawbacks to customized chips. The biggest shortcoming is low flexibility, as they are application specific; the lack of adaptability in these chips is also severe. Typical custom approaches do not allow the function of a device to be modified after deployment, for example for fault correction. The FPGA approach is therefore attractive for providing design freedom: many of the popular FPGAs are in-system programmable, which allows modification of the operation by simple reprogramming. For filtering purposes, FIR filters [3] have been commonly used. In this particular work, IIR filters are implemented instead, as they require fewer calculations and less memory. IIR filters also outperform FIRs [5] for narrow transition bands, and they can provide a better approximation of traditionally analog systems in digital applications than competing filter types. IIR filters are mainly used in audio applications such as speakers and sound-processing functions. In this work, the Xilinx Spartan 3E series is used for implementing the digital filtering algorithms. The Xilinx Spartan 3E consists of reconfigurable combinational logic blocks with multiple inputs and outputs, a router or switching matrix for interconnection, and buffers.
III. PROPOSED ARCHITECTURE

IIR filter implementations on an FPGA board illustrate that the FPGA approach is both flexible and provides performance superior to traditional approaches. Because of the programmability of this technology, the examples in this paper can be extended to provide a variety of other high-performance IIR filter realizations. Using powerful computer-based software tools to perform the repetitive calculations in the filter design process enables a designer to achieve the best design within the shortest time. While implementing a filter in hardware, the biggest challenge is to achieve the specified system performance at minimum hardware cost. In this paper we achieve this goal by designing a digital filter, which also gives a better noise margin and less component ageing than an analog filter. One of the hurdles is to understand, estimate and, where possible, overcome the effects of using a finite word length to represent the infinite-precision coefficients. Selecting a non-optimized word length [6] can result in a filter transfer function different from what is expected. The effects of finite word length representation can be minimized by analytical or qualitative methods, or simply by choosing to implement higher-order filters in cascaded or parallel form.
Digital filters [7] are often described and implemented in terms of the difference equation that defines how the output signal is related to the input signal. We have modeled the equation as
y[n] = (1/a0) · ( b0·x[n] + b1·x[n−1] + … + bP·x[n−P]
                 − a1·y[n−1] − a2·y[n−2] − … − aQ·y[n−Q] )      (1)
Where:
• P is the feed-forward filter order
• b0 … bP are the feed-forward filter coefficients
• Q is the feedback filter order
• a1 … aQ are the feedback filter coefficients
• x[n] is the input signal
• y[n] is the output signal.
Now from the above equation we modeled the transfer function of the IIR filter as

H(z) = Y(z)/X(z) = (b0 + b1·z⁻¹ + b2·z⁻²) / (1 + a1·z⁻¹ + a2·z⁻²)      (2)

For a hardware representation of the digital filter we have modeled the transfer function using adder, multiplier and delay units.
Figure 1: Direct Form-2 Structure of Digital Filter
A basic IIR filter consists of 3 main blocks:
(i) Adder  (ii) Multiplier  (iii) Delay unit
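A software model of the direct form-2 second-order section of Figure 1 may look like the following (a behavioral sketch with floating-point arithmetic; the actual hardware uses fixed-point adders, multipliers and delay registers):

```python
def direct_form_2(x, b, a):
    """Direct form-2 IIR section (cf. Figure 1):
       w(n) = x(n) - a1*w(n-1) - a2*w(n-2)
       y(n) = b0*w(n) + b1*w(n-1) + b2*w(n-2)
    b = (b0, b1, b2) feed-forward, a = (a1, a2) feedback coefficients."""
    w1 = w2 = 0.0                                   # delay-line state
    y = []
    for xn in x:
        wn = xn - a[0] * w1 - a[1] * w2             # feedback path
        y.append(b[0] * wn + b[1] * w1 + b[2] * w2)  # feed-forward path
        w2, w1 = w1, wn                             # shift the delay line
    return y
```

Note that direct form-2 needs only one shared delay line w(n), which is why the hardware requires only two delay units per section.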
A. Implementation of Adder
We have implemented this system using a serial adder. A serial adder is a binary adder that adds two numbers bit-pair-
wise. Each bit pair is added in a single clock pulse, and the carry from each pair is propagated to the next pair.
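The bit-pair-wise addition with carry propagation can be modeled as follows (a software sketch of the serial adder's behavior; the 8-bit word width is an illustrative assumption):

```python
def serial_add(x, y, width=8):
    """Bit-serial addition: one bit pair per clock cycle, LSB first,
    with the carry propagated to the next cycle."""
    carry, out = 0, 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s = a ^ b ^ carry                     # full-adder sum bit
        carry = (a & b) | (carry & (a ^ b))   # full-adder carry out
        out |= s << i
    return out & ((1 << width) - 1)           # truncate to the word width
```

A final carry out of the most significant pair is discarded, so the result wraps modulo 2^width, just as a fixed-width hardware register would.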
B. Implementation of Multiplier
The multiplier has been configured to perform multiplication of signed numbers in two's complement notation. We have used signed multiplication, where an n-bit by n-bit multiplication takes place and results in a 2n-bit value.
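The two's-complement n × n → 2n-bit multiplication can be modeled with plain integers as follows (a behavioral sketch, not the hardware netlist; the helper name is an illustration):

```python
def signed_multiply(a, b, n=8):
    """Two's-complement n-bit x n-bit signed multiply, returning the
    2n-bit two's-complement encoding of the product."""
    def to_signed(v, bits):
        # Reinterpret a raw bit pattern as a signed two's-complement value.
        v &= (1 << bits) - 1
        return v - (1 << bits) if v & (1 << (bits - 1)) else v
    p = to_signed(a, n) * to_signed(b, n)
    return p & ((1 << 2 * n) - 1)             # re-encode into 2n bits
```

For example, with n = 8 the pattern 0xFF is −1, so multiplying it by 2 yields the 16-bit pattern 0xFFFE (−2).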
C. Implementation of Delay Unit
We have used a shift register for the purpose of delay. A shift register is a group of flip-flops set up in a linear fashion with their inputs and outputs connected in such a way that the data is shifted from one device to another when the circuit is active: (i) a shift register provides the data movement function; (ii) a shift register "shifts" its output once every clock cycle.
IV. SIMULATION RESULTS
To check the response of the proposed filter we have used the Filter Design and Analysis Tool (FDA Tool), a graphical user interface (GUI) available in the Signal Processing Toolbox of MATLAB for designing and analyzing filters. It takes the filter specifications as inputs. Table 1 shows the specifications of an IIR low-pass elliptic filter of order 6.
Table 1: IIR filter specifications

Filter performance parameter   Value
Pass band ripple               0.5 dB
Pass band frequency            11000 Hz
Stop band frequency            12000 Hz
Stop band attenuation          35 dB
Sampling frequency             48000 Hz
A. Software Simulation
The sampling frequency is chosen as 4 times the stop band frequency, and the filter has a steep transition band with a width of 1000 Hz. These specifications are fed as inputs to the FDA tool in MATLAB R2009a. The tool performs the filter design calculations using double-precision floating-point representation and displays the response of an IIR elliptic low-pass filter of order 6. Figure 2 shows the filter design window of the FDA tool after completion of the design process.
Figure 2 Filter design using MATLAB FDA tool
We have designed the IIR filter in direct form-2, simulated it using VHDL, and downloaded it to the Xilinx Spartan 3E kit. The response obtained by simulating the VHDL code is shown below.
Figure 3 The simulation output of IIR filter
in Xilinx ISE 7.01
The coding scheme that we are using is VHDL (Very High Speed Integrated Circuit Hardware Description Language). Since the filter is designed in the digital domain, accommodating it in an existing analog system requires adding an A/D converter before the system and a D/A converter after it.
B. Hardware Implementation
We have implemented digital IIR filter using FPGA based
Xilinx Spartan 3E kit which consists of an interior array of 64-
bit CLBs, surrounded by a ring of 64 input-output interface
blocks. The FPGA architecture is shown below.
Figure 4: Internal Block Diagram of FPGA Architecture
V. CONCLUSION
We have implemented the IIR filter on an FPGA, and our results show an improvement over existing filter design architectures. In future work we will implement our scheme for real-time applications.
REFERENCES
[1] U. Meyer-Baese, Digital Signal Processing with Field Programmable Gate
Arrays Second Edition , Springer, p.109.
[2] U. Meyer-Baese, Digital Signal Processing with Field Programmable Gate
Arrays Second Edition , Springer, p.110.
[3] D. M. Kodek, 1980, "Design of Optimal Finite Word Length FIR Digital Filters Using Integer Programming Techniques," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-28, no. 3, June 1980.
[4] Wonyong Sung and Ki-Il Kum, 1995, “Simulation-Based Word-Length
Optimization Method for Fixed-point Digital Signal Processing Systems”,
IEEE Transactions on Signal Processing, Vol. 43, No.12, December 1995.
[5] X. Hu, L. S. DeBrunner, and V. DeBrunner, 2002, “An efficient design for
FIR filters with Variable precision”, Proc. 2002 IEEE Int. Symp. on Circuits
and Systems, pp.365-368, vol. 4,May 2002.
[6] Y. C. Lim, R. Yang, D. Li, and J. Song, 1999. “Signed-power-of-two term
allocation scheme for the design of digital filters,” IEEE Transactions on
Circuits and Systems II, vol. 46, pp.577- 584, May 1999.
[7] S. C. Chan, W. Liu, and K. L. Ho, 2001, "Multiplierless perfect reconstruction modulated filter banks with sum-of-powers-of-two coefficients," IEEE Signal Processing Letters, vol. 8, no. 6, pp. 163-166, June 2001.
Abstract—GSM, an acronym for Global System for Mobile communications, uses various encryption algorithms such as A5/1, A5/2 and A5/3. These are used to encrypt information transmitted from the mobile station to the base station during communication. A5/1 is considered a strong algorithm, but it exhibits weaknesses, as shown by the attacks mounted on it: A5/1 has been attacked through its linear complexity, its clocking taps, etc. This paper proposes a concept to improve the A5/1 encryption algorithm by modifying the clocking mechanism of the registers present in A5/1; the modified version of A5/1 is fast and easy to implement, which makes it well suited for future use.

Index Terms—GSM, encryption, A5/1 stream cipher, clock controlling unit, correlation
I. INTRODUCTION
In wireless communication technology, communication is effective and convenient for sharing information [7], and GSM is a very good example of such wireless communication. This information should be secure, meaning that nobody, such as an eavesdropper, can interfere; cryptography plays a vital role in protecting it. When information is sent from the mobile station to the base station, the air interface presents a serious security threat between the communicating parties [10]. The question then arises of how to protect the communication. For this, GSM uses the A5/x series of encryption algorithms to encrypt voice and data over the GSM link. The various implementations are: A5/0, with no encryption; A5/1, the strong version; A5/2, a weaker version targeting markets outside Europe; and A5/3, a strong version based on block ciphering, created as part of the 3rd Generation Partnership Project (3GPP) [5].

In this paper we explore A5/1, which is the strong version but exhibits weaknesses due to the attacks mounted on it. A5/1 is based on stream ciphering [1], which is very fast, performing a bit-by-bit XOR to produce the result. In the simplest form of such encryption, a plaintext bit is XORed with a secret key bit; the result is called the ciphertext, and the reverse process is called decryption.

A5/1 is built from linear feedback shift registers (LFSRs). The initial value of an LFSR is called the seed; because the operation of the register is deterministic, the stream of values it produces is completely determined by its current (or previous) state. However, an LFSR with a well-chosen feedback function can produce a sequence of bits which appears random and which has a long cycle [2]. In cryptography, correlation attacks are a class of known-plaintext attacks for breaking stream ciphers whose key stream is generated by combining the output of several LFSRs using a Boolean function. Correlation attacks [6] exploit a statistical weakness that arises from a poor choice of the Boolean function; since it is possible to select a function which avoids correlation attacks, this type of cipher is not inherently insecure, but it is essential to consider susceptibility to correlation attacks when designing stream ciphers of any type. In this paper a new clocking mechanism is proposed to avoid correlation attacks, in place of the m-rule (majority rule) used by the A5/1 stream cipher. The paper is organized as follows: Section II gives a description of the A5/1 stream cipher; Section III analyzes the correlation attack; Section IV proposes the modified structure of the A5/1 key stream generator; finally, the conclusion is given.
II. DESCRIPTION OF A5/1
A5/1 is a stream cipher [11] provide key stream so called key stream generator. Made up of three linear feedback shift register of length 19, 22, 23 used to generate sequence of binary bits. GSM conversations are in form of frames as length of 228 bit i.e. 114 for each direction for encrypt/ decrypt data[4]. A5/1 initialize 64 bit key together with 22 frame number publicly known. It used linear feedback shift registers as R1, R2 and R3 to correspondence tap as (13, 16, 17, 18) contained by R1, (20, 21) by R2 and (7, 20, 21, 22) respectively. Each clocked using rule called as majority rule. Clocking tap considered as A, B, C to correspondence registers R1, R2 and R3 as R1 (8), R2 (10) and R310). Before register is clocked feedback is calculated by using linear operator i.e. XOR. The one bit shift to right (discarding the rightmost) bit produced by feedback location store leftmost locations of linear feedback shift registers. This cycle goes up to 64 times. This done on basis of clocking rule that register clocked irregularly according to majority rule. Majority rule uses on three clocking bits of LFSR’s A, B, C. Among clocking bit if one or more is 0, then m=0 whose value match with m that register will clock. Similarly, if one or more
Enhanced Clocking Rule for A5/1 Encryption Algorithm
Rosepreet Kaur Bhogal, ECE Dept., [email protected] , Nikesh Bajaj, Asst. Prof., ECE Dept., [email protected], Lovely Professional University -India
clocking bits are 1, then m = 1 and the registers whose clocking bits match m are clocked. At each clocking step each selected LFSR generates one bit, and the bits are combined by a linear function. In A5/1, the probability of an individual LFSR being clocked is 3/4. The clocking unit generates the bit m defined using Boolean algebra as m = A·B ⊕ B·C ⊕ A·C, as shown in Figure 1; for the possible cases refer to Table 1.
Figure 1: Structure of A5/1 stream cipher
Table 1: Possible cases of A5/1 registers to be clocked

Clocking bits (A,B,C)   Bit m (m-rule)   Register(s) clocked
(0,0,0)                 0                R1, R2, R3
(0,0,1)                 0                R1, R2
(0,1,0)                 0                R1, R3
(0,1,1)                 1                R2, R3
(1,0,0)                 0                R2, R3
(1,0,1)                 1                R1, R3
(1,1,0)                 1                R1, R2
(1,1,1)                 1                R1, R2, R3
Table 1 lists the possible clocking cases according to the m-rule. Each register is clocked with probability 3/4 [8], i.e. each output bit yields some information about the state of the LFSRs [3]. Because of this, the whole scheme falls to a correlation attack and the state bits can be determined.
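The m-rule and the contents of Table 1 can be checked with a small script (an illustrative model of the clocking decision only, not the cipher implementation):

```python
from itertools import product

def majority(a, b, c):
    """m = A*B xor B*C xor A*C, the m-rule of A5/1."""
    return (a & b) ^ (b & c) ^ (a & c)

def clocked_registers(a, b, c):
    """Registers whose clocking bit equals m are clocked (cf. Table 1)."""
    m = majority(a, b, c)
    return [name for name, bit in (("R1", a), ("R2", b), ("R3", c)) if bit == m]

# Each register is clocked in 6 of the 8 equally likely cases, i.e. with
# probability 3/4, matching the statement above.
cases = list(product((0, 1), repeat=3))
r1_clocked = sum("R1" in clocked_registers(*bits) for bits in cases)
```

Enumerating all eight bit combinations reproduces Table 1 row by row.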
III. CORRELATION ATTACK ANALYSIS

Analyzing a stream cipher is easier than analyzing a block cipher. There are two main factors to consider while designing any stream cipher: correlation and linear complexity. Linear complexity is important because the Berlekamp-Massey algorithm can recover the state of the LFSRs when some of the LFSR bits are related to the generated output sequence. The linear complexity should be large for more security, although a large value alone does not indicate a secure cipher. Correlation immunity and higher linear complexity are obtained by combining the output sequences in a more non-linear manner. Insecurity arises when the output of the combining function is correlated with the output of an individual LFSR; this is what makes a correlation attack possible. By observing the output sequence, an attacker obtains information about the internal state, and from that can determine other internal states, breaking the entire stream cipher generator. Coming to the main point: the A5/1 stream cipher also uses three LFSRs, and its clocking taps look strong but are cryptographically weak, as shown by the attacks. The output of the generator equals the output of LFSR2 75% of the time; if the feedback is known, we can guess the initial bits of LFSR2, generate its output sequence, and count the number of times the LFSR2 output agrees with the output of the generator. If the two sequences agree about 50% of the time, the guess is wrong; if they agree 75% of the time, the guess is right. Similarly, the output sequence agrees 75% of the time with LFSR3, so using correlation the cipher can easily be cracked by a known-plaintext attack. The basic idea behind A5/1 is good, and it passes statistical tests such as the NIST test suite [12], but it still has the weakness that the LFSR lengths are short enough to make cryptanalysis feasible; A5/1 should be made as long as possible for more security.
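The agreement count at the heart of the attack can be sketched as follows (a hypothetical helper illustrating the 75%-versus-50% decision rule described above, not an implementation of the full attack):

```python
def correlation_agreement(keystream, lfsr_guess):
    """Fraction of positions where a guessed LFSR output agrees with the
    observed generator output: ~0.75 indicates a correct guess for A5/1,
    ~0.5 indicates a wrong one."""
    hits = sum(k == g for k, g in zip(keystream, lfsr_guess))
    return hits / len(keystream)
```

An attacker would evaluate this fraction for each candidate initial state and keep the candidates whose agreement is significantly above one half.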
IV. MODIFIED A5/1 STREAM CIPHER
A new clock control mechanism is proposed to overcome the problem of the 3/4 clocking probability explained above. With the proposed concept the probability becomes 1/2, using a modified clock controlling unit. Consider three bits A, B and C of the respective registers R1, R2 and R3, called the clocking bits. The structure of the proposed A5/1 stream cipher is shown in Figure 2.
Figure 2: Modified stream cipher
A. Clock controlling unit

In the new clock control mechanism each register has one clocking tap: bit 8 for R1, bit 10 for R2 and bit 10 for R3. The clocking bit is generated using the Boolean expression given below; an AND gate is used, which also increases the linear complexity. In the following, ¬ denotes NOT and (+) denotes XOR:

y = ¬A·(B (+) C) + A·¬(B (+) C)      (1)

The expression is built from different gates applied to the clocking bits A, B and C of the respective registers. In each cycle, every register whose clocking tap agrees with y from equation (1) is clocked and shifted. For example, if A, B, C are the clocking taps of R1, R2, R3 respectively, then Table 2 shows all the possible combinations for clocking.
Table 2: Possible cases of the modified stream cipher clocking

Clocking bits (A,B,C)   Clocking bit y   Register(s) clocked
(0,0,0)                 0                R1, R2, R3
(0,0,1)                 1                R3
(0,1,0)                 1                R2
(0,1,1)                 0                R1
(1,0,0)                 1                R1
(1,0,1)                 0                R2
(1,1,0)                 0                R3
(1,1,1)                 1                R1, R2, R3
As the table shows, at each cycle at least one register is clocked, so the generator can never stall in a position where no register is clocked; the mechanism above was designed with this requirement in mind. Case 1: A=0, B=0, C=0; equation (1) gives y=0, and every register whose clocking bit agrees with this value is clocked: R1, R2 and R3 all agree, so all registers are clocked and shifted to the right (discarding the rightmost bit). Case 2: A=0, B=0, C=1; y=1, so R3 is clocked and shifted. Case 3: A=0, B=1, C=0; y=1, so R2 is clocked and shifted. Case 4: A=0, B=1, C=1; y=0, so R1 is clocked and shifted. Case 5: A=1, B=0, C=0; y=1, so R1 is clocked and shifted. Case 6: A=1, B=0, C=1; y=0, so R2 is clocked and shifted. Case 7: A=1, B=1, C=0; y=0, so R3 is clocked and shifted. Finally, case 8: A=1, B=1, C=1; y=1, so all registers are clocked and shifted. Comparing the possible clocking outcomes in Tables 1 and 2: in Table 1, at each cycle at least two registers are shifted and each register is clocked with 75% probability; in Table 2 this is reduced to 50%, with at least one register shifted per cycle. The register output is therefore unrelated to the state of the LFSRs for up to 6 clock cycles.
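Equation (1) and Table 2 can be verified with a small script (an illustrative model of the proposed clocking decision, not the full cipher; note that the expression simplifies to A ⊕ B ⊕ C):

```python
from itertools import product

def y_bit(a, b, c):
    """y = not(A)*(B xor C) + A*not(B xor C) from equation (1);
    it simplifies to A xor B xor C."""
    return ((1 - a) & (b ^ c)) | (a & (1 ^ b ^ c))

def clocked_registers(a, b, c):
    """Registers whose clocking tap agrees with y are clocked (cf. Table 2)."""
    y = y_bit(a, b, c)
    return [name for name, bit in (("R1", a), ("R2", b), ("R3", c)) if bit == y]

# Verify the simplification and the 1/2 clocking probability claimed above:
# each register is clocked in 4 of the 8 equally likely cases.
cases = list(product((0, 1), repeat=3))
r1_clocked = sum("R1" in clocked_registers(*bits) for bits in cases)
```

Enumerating all eight bit combinations reproduces Table 2 and confirms that each register is clocked with probability 1/2 instead of 3/4.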
V. CONCLUSION
The A5/1 key stream generator is easy to implement and is an efficient encryption algorithm used in the GSM communication application. However, it exhibits weaknesses: the LFSR lengths are short, and it is vulnerable to the basic correlation attack discussed in Section III, which has shown the m-rule-based encryption algorithm to be insecure. After analyzing these issues, a modified A5/1 structure has been given in Section IV which is easy to implement and fast, and which decreases the possibility of a correlation attack. The enhancement proposed in the new clocking mechanism increases the level of security: the probability of a linear feedback shift register being clocked is reduced from 3/4 to 1/2, which prevents the state from being identified from the output sequence, i.e. the output bits are unrelated to the internal state for up to 6 cycles, as shown by the modified structure of the A5/1 stream cipher.
ACKNOWLEDGMENT
This work is part of the completion of a master's dissertation. Many people contributed in assorted ways to the work and to the making of this paper, and they deserve special mention; it is a pleasure to convey my gratitude to them all in this humble acknowledgment. Thanks to my guide, Mr. Nikesh Bajaj, for his supervision, advice and guidance at every stage of this paper, as well as for giving me extraordinary experiences throughout the work. Above all, he provided unflinching encouragement and support in various ways. His intuition makes him a constant oasis of ideas and passion in electronics, which exceptionally inspired and enriched my growth as a student. Last but not least, I would like to thank my fellow students for the stimulating discussions and the successful realization of this work.
REFERENCES
[1] E. Barkan, E. Biham, and N. Keller, "Instant Ciphertext-Only Cryptanalysis of GSM Encrypted Communication," Advances in Cryptology – CRYPTO 2003.
[2] P. Ekdahl, On LFSR Based Stream Ciphers: Analysis and Design.
[3] M. Sharaf, H. A. K. Mansour, H. H. Zayed, and M. L. Shore, "A Complex Linear Feedback Shift Register Design for the A5 Keystream Generator."
[4] D. Margrave, "GSM Security and Encryption," George Mason University.
[5] O. Dunkelman, N. Keller, and A. Shamir, "A Practical-Time Attack on the A5/3 Cryptosystem Used in Third Generation GSM Telephony."
[6] G. Rose, "A Précis of the New Attacks on GSM Encryption," QUALCOMM Australia.
[7] P. Bouška and M. Drahanský, "Communication Security in GSM Networks," Faculty of Information Technology, Brno University of Technology.
[8] M. Ahmad and Izharuddin, "Enhanced A5/1 Cipher with Improved Linear Complexity."
[9] M. Peuhkuri, "Mobile Networks Security," 2008-04-22.
[10] N. Komninos, B. Honary, and M. Darnell, "Security Enhancements for A5/1 Without Losing Hardware Efficiency in Future Mobile Systems."
[11] C.-C. Lo and Y.-J. Chen, "Stream Ciphers for GSM Networks," Institute of Information Management, National Chiao-Tung University.
[12] http://csrc.nist.gov/groups/ST/toolkit/rng/documentation_software.html
Rosepreet Kaur Bhogal is pursuing the master's degree in signal processing from Lovely Professional University, Punjab, India. She is currently doing her dissertation under the supervision of Mr. Nikesh Bajaj, assistant professor in the Department of Electronics. Her research interests include different aspects of cryptography, such as cryptographic assumptions and the encryption algorithms used in GSM.
Nikesh Bajaj received his bachelor degree in Electronics & Telecommunication from the Institute of Electronics and Telecommunication Engineers, and his master degree in Communication & Information Systems from Aligarh Muslim University, India. He is now working at LPU as Assistant Professor in the Department of ECE. His research interests include cryptography, cryptanalysis, and signal & image processing.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0107-1
An Application of Kalman Filter in State Estimation of a Dynamic System
Vishal Awasthi1, Krishna Raj2
Abstract-- Most wireless communication systems for indoor positioning and tracking suffer from different error sources, including process errors and measurement errors. State estimation deals with recovering some desired state variables of a dynamic system from available noisy measurements. A correct and accurate state estimate of a linear or non-linear system can be obtained by selecting the proper estimation technique. Kalman filter algorithms are an often-used technique that provides linear, unbiased, and minimum-variance estimates of an unknown state vector. In this paper we try to bridge the gap between the Kalman filter and its variant, the Extended Kalman Filter (EKF), by comparing their algorithms and performance in the state estimation of a car moving under a constant force.
Index Terms-- Stochastic filtering, Bayesian filtering,
Adaptive filter, Unscented transform, Digital filters.
1. INTRODUCTION
In the area of telecommunications, signals are mixtures of different frequencies. The least squares method, proposed by Carl Friedrich Gauss in 1795, was the first method for forming an optimal estimate from noisy data, and it provides an important connection between the experimental and theoretical sciences.
Before Kalman, in the 1940s, Norbert Wiener proposed his famous filter, the Wiener filter, which was restricted to stationary scalar signals and noises. The solution obtained by this filter is not recursive and needs the storing of the entire past observed data. In the early 1960s, Kalman filtering theory, a novel recursive filtering algorithm, was developed by Kalman and Bucy; it did not require the stationarity assumption [1], [2]. The Kalman filter is a generalization of the Wiener filter. The significance of this filter is its ability to accommodate vector signals and noises, which may be non-stationary. The
solution is recursive in that each update estimate of the state is
computed from the previous estimate and the new input data,
so, contrary to the Wiener filter, only the previous estimate requires storage; the Kalman filter thus eliminates the need for storing the entire past observed data. Most of the existing
approaches need a priori kinematics model of the target for the
prediction. Although this predictor can successfully filter out
the noisy measurement, its parameters might be changed due
to different dynamic targets.
1Member IETE, Lecturer, Deptt. of Electronics & Comm. Engg., UIET., CSJM.University, Kanpur-24, U.P., (email: [email protected]) 2Fellow IETE, Associate Professor, Deptt. of Electronics Engineering,
H.B.T.I., Kanpur-24, U.P., (email: [email protected])
Information is usually obtained in the form of measurements
and the measurements are related to the position of the object
that can be formulated by Bayesian filtering theory. Kalman filter theory is only applicable to linear systems, while in practice almost all dynamic systems (the relation between the state and the measurements) are nonlinear. The
most celebrated and widely used nonlinear filtering algorithm
is the extended Kalman filter (EKF), which is essentially a
suboptimal nonlinear filter. The key idea of the EKF is using
the linearized dynamic model to calculate the covariance and
gain matrices of the filter. The Kalman filter (KF) and the
EKF are both widely used in many engineering areas, such as
aerospace, chemical and mechanical engineering. However, it
is well known that both the KF and EKF are not robust against
modelling uncertainties and disturbances.
Kalman filtering is an optimal algorithm, widely applied in the
forecasting of system dynamic and estimating an unknown
state. Measurement devices are constructed in such a manner
that the output data signals must be proportional to certain
variables of interest. Knowledge of the probability density
function of the state conditioned on all available measurement
data provides the most complete possible description of the
state but except in the linear Gaussian case, it is extremely
difficult to determine this density function [6]. To enhance
these concepts, several algorithms were proposed using
parametric and non-parametric techniques such as Extended
Kalman Filter (EKF), Unscented Kalman filter (UKF)
respectively.
Unscented transformation (UT) is an elegant way to compute
the mean and covariance accurately up to the second order
(third for Gaussian prior) of the Taylor series expansion. The low-order statistics of a random variable undergoing a non-linear transformation y = g(x) are captured by generating and propagating sigma points through the nonlinear transformation:

Yi = g(Xi),  i = 0, ..., 2zx   (1)

where zx is the dimension of x. Scaling parameters are used to control the distance between the sigma points and the mean.
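Equation (1) can be illustrated with a minimal sigma-point sketch. The symmetric weight scheme and the `kappa` scaling parameter below follow the basic (unscaled) unscented transform, an assumption, since the text does not fix a particular variant:

```python
import numpy as np

def unscented_transform(g, mean, cov, kappa=1.0):
    """Propagate a mean/covariance pair through a nonlinearity g via sigma points.

    Minimal sketch of the basic unscented transform; `kappa` controls the
    sigma-point spread (an assumed variant, scaled forms also exist).
    """
    zx = mean.size
    # Matrix square root of (zx + kappa) * P via Cholesky factorisation.
    S = np.linalg.cholesky((zx + kappa) * cov)
    # 2*zx + 1 sigma points: the mean plus symmetric spreads along each column of S.
    sigma = np.vstack([mean, mean + S.T, mean - S.T])
    w = np.full(2 * zx + 1, 1.0 / (2.0 * (zx + kappa)))
    w[0] = kappa / (zx + kappa)
    # Propagate each sigma point through the nonlinearity: Y_i = g(X_i).
    Y = np.array([g(x) for x in sigma])
    y_mean = w @ Y
    y_cov = (w[:, None] * (Y - y_mean)).T @ (Y - y_mean)
    return y_mean, y_cov
```

For an identity transformation the original mean and covariance are recovered exactly, which is a quick sanity check of the weights.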
In the presence of random disturbances (white noise), or when a few system parameters change, the use of an adaptive and optimal controller turns out to be necessary [3], [4]. In this paper we choose to use the Kalman filter as a controller.
This technique is based on the theory of Kalman filtering [5]; it transforms the Kalman filter into a Kalman controller.
Simulation results show that the state estimation performance
provided by the robust Kalman filter is higher than that
provided by the EKF.
Recently, results on some new types of linear uncertain
discrete-time systems have also been given. Yang, Wang and
Hung presented a design approach of a robust Kalman filter
for linear discrete time-varying systems with multiplicative
noises [7]. Since the covariance matrices of the noises cannot
be known precisely, Dong and You derived a finite-horizon
robust Kalman filter for linear time-varying systems with
norm-bounded uncertainties in the state matrix, the output
matrix and the covariance matrices of noises [8]. Based on the
techniques Zhu, Soh and Xie gave a robust Kalman filter
design approach for linear discrete-time systems with
measurement delay and norm-bounded uncertainty in the state
matrix [9]. Hounkpevi and Yaz proposed a robust Kalman
filter for linear discrete-time systems with sensor failures and
norm-bounded uncertainty in the state matrix [10].
Currently many systems successfully use Kalman filter algorithms in diverse areas such as the processing of
signals in mobile robot, GPS position based on neural network
[11], aerospace tracking [12], [13], underwater sonar and the
statistical control of quality.
In this paper the state of a car moving under a constant force has been estimated through the Kalman filter and the Extended Kalman filter. The dynamic model of the system is highly nonlinear; hence we first linearized the nonlinear system equations using the EKF algorithm, and then performed a time-domain analysis of the dynamic model using a sampling time of 10 millisec.
2. TECHNOLOGICAL DEVELOPMENT OF KALMAN
FILTER
A stochastic process is a family of random variables indexed
by the parameter and defined on a common probability
space. Bayesian models are a general probabilistic approach
for estimating an unknown probability density function
recursively over time using incoming measurements and a
mathematical process model [14].
The Kalman filter is an optimal observer in the sense that it
produces unbiased and minimum variance estimates of the
states of the system i.e. the expected value of the error
between the filter’s estimate and the true state of the system is
zero and the expected value of the squared error between the
real and estimated states is minimum.
2.1 WIENER FILTER
Wiener was a pioneer in the study of stochastic and noise processes [15] who proposed a class of optimum discrete-time filters during the 1940s, published in 1949. Its purpose is
to reduce the amount of noise present in a signal by
comparison with an estimation of the desired noiseless signal.
The Wiener process (often called Brownian motion) is one of the best-known continuous-time stochastic processes, with stationary and independent increments. The Wiener
filter uses the mean squared error as a cost function and
steepest-descent algorithm for recursively updating the
weights.
The main problem with this algorithm is that it requires the input correlation matrix and the cross-correlation vector between the input and the desired response, and unfortunately both are unknown in practice.
2.2 DISCRET KALMAN FILTER
A state estimate is represented by a probability density function (pdf), and a description of the full pdf is required for the optimal (Bayesian) solution; but the form of the pdf is not restricted, so it cannot in general be represented using a finite number of parameters [14], [16]. To solve this problem R. E. Kalman
designed an optimal state estimator for linear estimation of the
dynamic systems using state space concept [17], that has the
ability to adapt itself to non-stationary environments. It
supports estimations of past, present, and even future states,
and it can do so even when the precise nature of the modeled
system is unknown. A set of mathematical equations provides
an efficient computational (recursive) means to estimate the
state of a process, in such a way that minimizes the mean of
the squared error.
The filter is very powerful in several aspects:
The Kalman filter is an efficient recursive filter
algorithm that estimates the state of a dynamic system
from a series of noisy measurements and hence the filter
can be viewed as a sequential minimum mean square
error (MSE) estimator with additive noise.
It works like an adaptive low-pass infinite impulse
response (IIR) digital filter, with cut-off frequency
depending on the ratio between the process- and
measurement (or observation) noise, as well as the
estimate covariance.
The Kalman filter is a set of mathematical equations that
provides an efficient computational (recursive) means to
estimate the state of a process, in such a way that
minimizes the mean of the squared error.
2.2.1 DYNAMIC SYSTEM MODEL OF KALMAN
CONTROLLER
The Kalman filter is used for estimating or predicting the next state of a system based on a moving average of measurements driven by white noise, which is completely unpredictable. It
needs a model of the relationship between inputs and outputs
to provide feedback signals but it can follow changes in noise
statistics quite well. The Kalman filter is an optimum
estimator that estimates the state of a linear system developing
dynamically through time.
Kalman filter theory is based on a state-space approach in
which a state equation models the dynamics of the signal
generation process and an observation equation models the
noisy and distorted observation signal. For a signal xk and noisy observation yk, the equations describing the state process model and the observation model are defined as:

xk = M xk-1 + E uk + Jk   ... (2)
yk = h xk + nk            ... (3)

where xk is the P-dimensional signal vector, or the state parameter, at time k; M is a P × P dimensional state transition matrix that relates the states of the process at times k-1 and k; E is the control-input model which is applied to the control vector uk; and Jk (process noise) is the P-dimensional uncorrelated input excitation vector of the state equation. Jk is assumed to be a normal (Gaussian) process, p(Jk) ~ N(0, Q), Q being the P × P covariance matrix of Jk (the process noise covariance). yk is the M-dimensional noisy observation vector, and h is an M × P dimensional matrix which relates the observation to the state vector. nk, the M-dimensional noise vector, also known as measurement noise, is assumed to have a normal distribution p(nk) ~ N(0, R), where R is the M × M covariance matrix of nk (the measurement noise covariance).
2.2.2 KALMAN FILTER ALGORITHM
Initially the process state is estimated at some time, and feedback is then obtained in the form of (noisy) measurements. The
equations for the Kalman filter fall into two groups:
Time Update (Predictor) Equations: which are
responsible for projecting forward (in time) the current
state and error covariance estimates to obtain the a priori
estimates for the next time step.
Measurement Update (Corrector) Equations: which are
responsible for the feedback i.e. for incorporating a new
measurement into the a priori estimate to obtain an
improved a posteriori estimate.
Sawaragi et al. [18] examined some design methods of
Kalman filters with uncertainties and observed that under poor observability and numerical instability Kalman filters do not work properly.
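The predictor/corrector cycle of the two equation groups can be illustrated with a minimal scalar filter. The random-walk state model (M = 1, h = 1) and the noise levels here are assumptions for the sketch, not the paper's car model:

```python
import numpy as np

def kalman_1d(zs, Q=1e-4, R=0.01, x0=0.0, P0=1.0):
    """Scalar Kalman filter with state transition M = 1 and observation h = 1."""
    x, P = x0, P0
    estimates = []
    for z in zs:
        # Time update (predictor): project state and covariance ahead.
        x_prior = x                      # x_k^- = M x_{k-1}, with M = 1
        P_prior = P + Q                  # P_k^- = M P M^T + Q
        # Measurement update (corrector): fold in the new observation z.
        K = P_prior / (P_prior + R)      # Kalman gain
        x = x_prior + K * (z - x_prior)  # a posteriori state estimate
        P = (1.0 - K) * P_prior          # a posteriori error covariance
        estimates.append(x)
    return estimates

# Noisy measurements of a constant true value 0.5 (illustrative data):
rng = np.random.default_rng(0)
zs = 0.5 + rng.normal(0.0, 0.1, 200)
est = kalman_1d(zs)
```

Only the previous estimate and covariance are stored at each step, which is the recursive property the text contrasts with the Wiener filter.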
2.2.3 FLOW CHART OF TIME & MEASUREMENT
UPDATE ALGORITHM
Time Update:
Measurement Update:
Initialize the state estimate
Take the Initial measurement sample
at k instant i.e.
Update state estimate with new
measurement
Calculate the state estimate to
next sample time i.e.
Update the sample of new
sample time i.e.
Initialize error covariance
Compute The Kalman Gain
Update the Error Covariance
Update the sample of new sample
time i.e.
Figure 1. Recursive Update Procedure for the Discrete Kalman Filter
2.3 EXTENDED KALMAN FILTER (EKF)
The extended Kalman filter (EKF) is the nonlinear version of
the Kalman filter that linearizes the non-linear measurement
and state update functions at the prior mean of the current time
step and the posterior mean of the previous time step,
respectively.
2.3.1 EXTENDED KALMAN FILTER ALGORITHM
Time Update:

(1) Project the state ahead:

x̂k- = f(x̂k-1, uk-1, 0)   ... (4)

(2) Project the error covariance ahead:

Pk- = Ak Pk-1 AkT + Qk-1   ... (5)

where Ak and Hk (below) are the Jacobians of the state function f and the measurement function h, evaluated at the current estimate. The time update equations project the state and covariance estimates from the previous time step k-1 to the current time step k.

Measurement Update:

(1) Compute the Kalman gain:

Kk = Pk- HkT (Hk Pk- HkT + Rk)-1   ... (6)

(2) Update the estimate with measurement zk:

x̂k = x̂k- + Kk (zk - h(x̂k-, 0))   ... (7)

(3) Update the error covariance:

Pk = (I - Kk Hk) Pk-   ... (8)

The measurement update equations correct the state and covariance estimates with the measurement zk. An important
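The linearize-then-correct cycle described above can be sketched generically. The nonlinear models f, h and their Jacobian functions F, H are caller-supplied assumptions here, standing in for whatever dynamic system is being tracked:

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    """One predict/update cycle of the extended Kalman filter.

    f, h: nonlinear state and measurement functions.
    F, H: functions returning their Jacobians at the current estimate
          (the linearisation that defines the EKF).
    """
    # Time update: propagate through the nonlinear model; linearise for P.
    x_prior = f(x)
    Fk = F(x)
    P_prior = Fk @ P @ Fk.T + Q
    # Measurement update: gain computed from the linearised measurement model.
    Hk = H(x_prior)
    S = Hk @ P_prior @ Hk.T + R          # innovation covariance
    K = P_prior @ Hk.T @ np.linalg.inv(S)
    x_post = x_prior + K @ (z - h(x_prior))
    P_post = (np.eye(len(x)) - K @ Hk) @ P_prior
    return x_post, P_post
```

When f and h are actually linear, the Jacobians are constant and the step reduces exactly to the ordinary Kalman filter, which is a useful check of an implementation.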
2.3.2 LIMITATIONS OF EKF ALGORITHM
Although the EKF is computationally efficient recursive
update form of the Kalman filter still it suffers a number of
serious limitations [14]:
(1) Linearized transformations are only reliable if the error
propagation is well approximated by a linear function. If this
condition does not hold, then the linearized approximation
would be extremely poor and hence it causes its estimates to
diverge altogether.
(2) The EKF does not guarantee unbiased estimates, and it may calculate error covariance matrices that do not necessarily represent the true error covariance.
3. PROBLEM DESCRIPTION
We consider a dynamic system, i.e., a car driven by a constant force, moving with a constant acceleration and following a linear/non-linear motion. To estimate the state, i.e., position, the continuous-time state-space model is discretized with a 10 millisec sampling time.
3.1 MATHEMATICAL MODELING OF SYSTEM
In a dynamic system, the values of the output signals depend both on the past behavior of the system and on the instantaneous values of its input signals. The output value at a given time t can be computed using the measured values of the output at the previous two time instants and the input value at the previous time instant.
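The dependence just described, output from the two previous outputs and the previous input, is a second-order difference equation. The coefficients below are hypothetical illustrations; only the 10 ms sampling interval comes from the paper:

```python
# y[k] = a1*y[k-1] + a2*y[k-2] + b1*u[k-1] — a second-order recursion in which
# the output depends on the two previous outputs and the previous input.
dt = 0.01                        # 10 millisecond sampling time, as in the paper
a1, a2, b1 = 1.6, -0.64, 0.04    # hypothetical, stable model coefficients

def step(y_prev, y_prev2, u_prev):
    """Advance the difference equation by one sample."""
    return a1 * y_prev + a2 * y_prev2 + b1 * u_prev

# Constant input force: iterate the recursion from rest.
y = [0.0, 0.0]
for k in range(2, 100):
    y.append(step(y[-1], y[-2], u_prev=1.0))
```

With these coefficients both poles sit at 0.8 inside the unit circle, so the response to the constant input settles at b1 / (1 - a1 - a2) = 1.0.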
Figure 2. Free body diagram of car-model
Horizontal and vertical motion is governed by the following equations:
(9)
(10)
(11)
(12)
(13)
(14)
For steady-state analysis, the angle is considered to have a very small value, between -10 and 10 radians.
(15)
(16)
Figure 2 illustrates the modeled characteristics of the car. The front and rear suspensions are modeled as spring/damper systems. This model includes damper nonlinearities such as velocity-dependent damping. The vehicle body has pitch and bounce degrees of freedom. They are represented in the model by four states: vertical displacement, vertical velocity, pitch angular displacement, and pitch angular velocity. The front suspension influences the bounce (i.e., vertical) degree of freedom.
The dynamic model of the system is highly nonlinear; hence we first linearized the nonlinear system using the EKF algorithm.
4. SIMULATION RESULTS
The mean and covariance of the posterior distribution were
recorded at each time step and compared to the true estimate.
For comparison, the data was also processed with the EKF. The figures show the mean error of the different filters; it can be seen that the EKF works quite well and is optimal for linear measurements regardless of the density function of the error. The mean errors did not vary much between the different filters; however, the EKF performed quite well even with large blunder probabilities. A comparative chart is given below to demonstrate the error in estimating the state through the KF and the EKF.
TABLE I
Comparative Chart of State (Position) Values with Kalman Filter

Time (sec) | True state (mt) | Measured state (mt) | Estimated state (mt) | Error, true - measured (mt) | Error, true - estimated (mt)
1          | 0.0125          | 0.0223              | 0.0011               | -0.0098                     | 0.0114
30         | 0.0221          | 0.0213              | 0.024                | 0.0008                      | -0.0019
60         | 0.0746          | 0.0712              | 0.0743               | 0.0034                      | 0.0003
90         | 0.1567          | 0.1751              | 0.1712               | -0.0184                     | -0.0145
100        | 0.1988          | 0.1824              | 0.2113               | 0.0164                      | -0.0125
Figure 3. Comparison of True, Measured & Estimated position with KF
Figure 4. Comparison of Error between true, measured & estimated position value with KF
TABLE II
Comparative Chart of State (Position) Values with Extended Kalman Filter

Time (sec) | True state (mt) | Measured state (mt) | Estimated state (mt) | Error, true - measured (mt) | Error, true - estimated (mt)
1          | 0.0012          | 0.0181              | 0.0010               | -0.0169                     | 0.0002
30         | 0.0186          | 0.0251              | 0.022                | -0.0065                     | -0.0034
60         | 0.746           | 0.744               | 0.731                | 0.0020                      | 0.015
90         | 0.147           | 0.189               | 0.148                | -0.042                      | -0.0010
100        | 0.1791          | 0.181               | 0.183                | -0.0019                     | -0.0039
Figure 5. Comparison of True, Measured & Estimated position with EKF
Figure 6. Comparison of Error between true, measured & estimated position
value with EKF
5. CONCLUSION
In this paper, a detailed overview of Kalman filter and
Extended Kalman Filter to improve inadequate statistical
models, nonlinearities in the measurement is presented.
Simulation results show that the performance of the Extended Kalman filter is higher than that of the Kalman filter, and we conclude that the Kalman filter-based scheme is capable of effectively estimating the position errors of a moving target, making future state and measurement predictions more accurate and therefore improving the accuracy of target positioning and tracking. Further efforts on the Kalman filter will lead to improved estimation of signal arrival time and more accurate target positioning and tracking.
This work can be used as theoretical base for further studies in
a number of different directions such as tracking system, to
achieve high computational speed for multi-dimensional state
estimation.
REFERENCES
[1] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, Transactions of the ASME, Series D, Vol. 82, No. 1, pp. 35-45, 1960.
[2] R. E. Kalman and R. S. Bucy, "New results in linear filtering and prediction problems," Journal of Basic Engineering, Transactions of the ASME, Series D, Vol. 83, No. 3, pp. 95-108, 1961.
[3] R. K. Mudi and N. R. Pal, "A robust self-tuning scheme for PI- and PD-type fuzzy controllers," IEEE Transactions on Fuzzy Systems, Vol. 7, No. 1, pp. 2-16, February 1999.
[4] B. Zdzislaw, Modern Control Theory, Springer-Verlag Berlin Heidelberg, 2005.
[5] R. L. Eubank, A Kalman Filter Primer, Taylor & Francis Group, 2006.
[6] D. L. Alspach and H. W. Sorenson, "Nonlinear Bayesian estimation using Gaussian sum approximations," IEEE Trans. Automatic Control, Vol. 17, No. 4, pp. 439-448, Aug. 1972.
[7] F. Yang, Z. Wang, and Y. S. Hung, "Robust Kalman filtering for discrete time-varying uncertain systems with multiplicative noises," IEEE Transactions on Automatic Control, Vol. 47, No. 7, pp. 1179-1183, 2002.
[8] Z. Dong and Z. You, "Finite-horizon robust Kalman filtering for discrete time-varying systems with uncertain-covariance white noises," IEEE Signal Processing Letters, Vol. 13, No. 8, pp. 493-496, 2006.
[9] X. Zhu, Y. C. Soh, and L. Xie, "Design and analysis of discrete-time robust Kalman filters," Automatica, Vol. 38, pp. 1069-1077, 2002.
[10] F. O. Hounkpevi and E. E. Yaz, "Robust minimum variance linear state estimators for multiple sensors with different failure rates," Automatica, Vol. 43, pp. 1274-1280, 2007.
[11] W. Wu and W. Min, "The mobile robot GPS position based on neural network adaptive Kalman filter," International Conference on Computational Intelligence and Natural Computing, IEEE, pp. 26-29, 2009.
[12] Y. Bar-Shalom and X. R. Li, Estimation and Tracking: Principles, Techniques, and Software, Artech House, 1993.
[13] Y. Bar-Shalom, X.-R. Li, and T. Kirubarajan, Estimation With Applications to Tracking and Navigation, New York: Wiley, 2001.
[14] Y. C. Ho and R. C. K. Lee, "A Bayesian approach to problems in stochastic estimation and control," IEEE Trans. Automatic Control, Vol. AC-9, pp. 333-339, Oct. 1964.
[15] P. Maybeck, Stochastic Models, Estimation and Control, Vol. I, New York: Academic Press, 1979.
[16] S. Haykin, Adaptive Filter Theory, Prentice-Hall, Inc., 1996.
[17] H. J. Kushner, "Approximations to optimal nonlinear filters," IEEE Trans. Automatic Control, Vol. AC-12, No. 5, pp. 546-556, Oct. 1967.
[18] Y. Sawaragi and T. Katayama, "Performance loss and design method of Kalman filters for discrete-time linear systems with uncertainties," International Journal of Control, Vol. 12, No. 1, pp. 163-172, 1970.
Wideband Direction of Arrival Estimation by using Minimum Variance and Robust Maximum Likelihood Steered Beamformers: A Review

SANDEEP SANTOSH1, O. P. SAHU2, MONIKA AGGARWAL3
1Asst. Prof., Department of Electronics and Communication Engineering,
2Associate Prof., Department of Electronics and Communication Engineering,
National Institute of Technology, Kurukshetra, INDIA
3Associate Prof., Centre For Applied Research in Electronics (CARE),
Indian Institute of Technology, New Delhi, INDIA
[email protected]   http://www.nitkkr.ac.in
Abstract
Beamforming of sensor arrays is a fundamental operation in sonar, radar and telecommunications. The Minimum Variance Steered Beamformer and the Robust Maximum Likelihood Steered Beamformer are two important methods for wideband direction of arrival estimation, and this paper presents a comparative study of the two. MV beamformers can place nulls in the array response in the direction of unwanted sources, even those located within a beamwidth of the source of interest, provided that the interfering signals are uncorrelated with the desired one. A steered wideband adaptive beamformer optimized by a novel concentrated maximum likelihood (ML) criterion in the frequency domain can also be considered; this ML beamforming can reduce the typical cancellation problems encountered by adaptive MV beamforming and preserve the intelligibility of a wideband and coloured source signal under interference, reverberation and propagation mismatches. The Minimum Variance Steered Beamformer (MV-STBF) and the use of the Steered Covariance Matrix are illustrated, and the robustness of the Maximum Likelihood Steered Beamformer (ML-STBF) using a Modified Newton Algorithm is explained.
Key-Words:- Wideband Direction of Arrival (DOA) Estimation, Minimum Variance, Robust Maximum Likelihood, Steered Beamforming, Covariance Matrix.
1. Introduction
Beamforming of sensor arrays is a
fundamental operation in Sonar, Radar
and telecommunications. The development of minimum variance (MV) adaptive beamforming has taken place over the last three decades. The detrimental effect of multipath on MV beamforming is the cancellation of the desired signal even if the coherent
component is very weak at the output of
the generalized sidelobe canceller. The
classical cure for this phenomenon lies
in the definition of a set of linear or
quadratic constraints on adaptive part of
the beamformer, based on proper
modeling of the array perturbations .
Wideband arrays are less sensitive to
signal cancellation because reflections
exhibit a delay of several sampling
periods with respect to direct (useful)
path.
Prefiltering of the array outputs and proper constraints on the weight vector help in counteracting cancellation. It is not clear whether the MV criterion is optimal in ensuring the best possible reconstruction of a wideband signal of interest, e.g. intelligibility in the case of speech. Therefore, there is a need for a frequency-domain wideband beamformer, starting from the concepts of focusing matrices and steered beamforming, that aligns the component of the direct-path signal along the same steering vector as in a narrowband array.
This beamformer uses a single set of
weights for the entire bandwidth but the
adaptation is made on the basis of a concentrated maximum likelihood (ML) cost function derived by using a
stochastic Gaussian assumption on the
frequency samples of the beamformer
outputs .
It is found that the ML solution does not
depend on any prefiltering applied to
array outputs provided that none of the
subband components are nulled out.
Nonconvexity of the derived ML cost
function makes it unsuitable for classical
Newton optimization .
Hence, a second-order algorithm is developed, starting from a procedure originally introduced for fast neural network training, which recasts the ML problem as an iterative least-squares minimization. The robust ML wideband beamformer also incorporates a norm constraint to reduce the risk of signal cancellation under propagation uncertainties [1], [2], [3], [4].
2. Narrowband and Wideband MV Beamforming
An array with M sensors receives the
signal of interest s(t) radiated by a point
source ,whose position is characterized
by generic coordinate vector p. The
propagating medium and the sensors are
assumed linear even if they may
introduce temporal dispersion on s(t).
The direct, or shortest, path of the wave propagation is characterized by the (M × 1) vector of impulse responses hd(t, p), starting from t = td. Multiple delayed and filtered copies of s(t), generated by multipath, reverberation and scattering, are also received by the array and can be globally modeled by the (M × 1) vector of impulse responses hr(t, p), starting from t = tr > td. Interference and noise are statistically independent of s(t) and are conveniently collected in the (M × 1) vector v(t). Therefore, the (M × 1) array output vector, or snapshot, x(t) obeys the continuous-time equation

x(t) = ∫td∞ hd(τ, p) s(t − τ) dτ + ∫tr∞ hr(τ, p) s(t − τ) dτ + v(t).   (1)
This model represents a large number of
real world environments, encountered in
telecommunications, remote sensing ,
underwater acoustics, seismics and
closed room applications . The objective
of beamforming is to recover s(t) in the
presence of multipath, noise and
interference terms given the knowledge
of direct path response only. In fact hd(t,
p) is accurately described by analytical
and numerical methods or measured
under controlled conditions but hr(t,p)
depends on a great number of unpredictable and time-varying factors. An alternative view is to consider a
reference model containing only the
terms related to direct path and to
develop robust algorithms that are able
to bound in a statistical sense the effects
of sufficiently small perturbations on the
final estimate.
2.1 Discrete Time Signal Model
Array outputs x(t) are properly
converted to the baseband , sampled and
digitized. Under general assumption
equation (1) is written in discrete time as
the vector FIR convolution as,
x(n) = Σk=Nd1 Nd2
hd(k,p) s(n-k) +
Σk=Nr1Nr2
hr(k,p)s(n-k)+v(n) (2)
The relationships among discrete-time
transfer functions of (2) and their analog
counterparts in (1) are quite involved
depending on the receiver architecture.
In many cases, the delays of the reflections with respect to the direct path exceed the Nyquist sampling period of the baseband signal (i.e. Nr1 > Nd2), so that hd(n, p) and hr(n, p) do not overlap.
2.1.1. Narrowband MV Beamforming
The narrowband arrays obey (2) with a known hd(n, p) = hd(p) u0(n − Nd), an unknown hr(n, p) = hr(p) u0(n − Nd), and Nd1 = Nd2 = Nr1 = Nr2. Hence, we have

x(n) = [hd(p) + hr(p)] s(n − Nd) + v(n).   (3)
The sources are assumed white within
the sensor bandwidth. Reflection delays
must be less than the sampling period so
that the spectrum at the sensor outputs
remains white. The sequence s(n) is
conveniently scaled so that |hd(p) | = 1.
An (M × 1) complex-valued weight vector w is applied to the baseband snapshot x(n) to recover a spatially filtered signal y(n, w), where

y(n, w) = wH x(n)   (4)

according to some optimality criterion. For example, in the classical MV-DR beamformer, ŵ solves the LS minimization problem

ŵ = argminw [ Σn=1..N |y(n, w)|2 ]   (5)

subject to hd(p)H w = 1, using N independent snapshots. The output is finally computed as

y(n, ŵ1) = y0(n) + ŵ1H y1(n)   (6)
2.1.2 Wideband MV Beamforming
The extensions of MV beamforming to
wideband arrays have been proposed
several times in the past following
either time-domain or frequency domain
approaches . The main drawback of
wideband MV beamforming is the high
number of free parameters to adapt
which produces slower convergence
, high sensitivity to mismodeling, and strong misadjustment for short observation times. The introduction of a large number of linear or quadratic constraints may mitigate these issues at the expense of a reduced capability of suppressing interference. The very complex and largely unpredictable structure of reverberant fields can make it impossible to specify an effective set of constraints.
2.1.3 Steered Adaptive Beamformer
An interesting tradeoff between
complexity and efficacy is obtained by
the wideband Steered Adaptive
Beamformer (STBF), which was
introduced on the basis of the coherent
focusing technique, the Coherent Signal
Subspace Method (CSSM), developed
by Wang and Kaveh. In the frequency-domain
formulation of the STBF, the
sequence x(n) (n = 1, 2, …, N) is
partitioned into L nonoverlapping
blocks of length J that are separately
processed by a windowed Fast Fourier
Transform (FFT). Finally, the frequency-domain
output is computed as

y(ωj, l, w1) = y0(ωj, l) + w1^H y1(ωj, l),  (7)

where y0(ωj, l) and y1(ωj, l) can be
computed from the focused snapshots. [1],[2],[3],[10]
3. Limitations of MV-STBF
It is known that most wideband signals
of interest are strongly correlated in
time. The effects of the temporal correlation of
s(n) on the MV-STBF are twofold:
a) The impulse responses of the
direct path and of the reflections are
often well separated in time in
wideband environments.
However, the signal replicas may
still cancel the desired signal s(n−N0)
if the multipath delay does not
exceed the correlation time of s(n).
b) It is no longer obvious that the
MV criterion lends itself to a good
solution in wideband scenarios.
When the beamformer is steered
off source, it mostly captures
background noise, which is often
considered temporally white; in this
case the MV criterion realizes a
particular ML estimator. When the
beamformer points towards a
correlated source, the quality of
the output is influenced by the
spectra of the source itself and of
the interference.
These two problems are strictly
related. An optimal cost function
should impose a tradeoff on
performance at different frequencies
when a single weight vector w1 is
used for the entire bandwidth. A
wideband beamformer aiming to
preserve signal intelligibility
should minimize the noise-plus-interference
power in those subbands
where the useful source spectrum has
valleys. Interference nulling
becomes less important near the
spectral peaks of the useful signal,
whose strength may be adequate to
mask the unwanted components. [1],[2],[3],[10]
4. Maximum Likelihood STBF
In order to overcome the drawbacks
of MV beamforming, a general
stochastic model can be formulated
and exploited to derive the proper
ML estimator of w, subject to the
given constraint CHw = d. Using the
reduced space formulation, this
constrained ML problem can still be
converted into unconstrained
maximization of the likelihood
function of the beamformer output
containing the useful signal plus
noise and interference residuals.
Although the crucial assumption for the
validity of this model is that the
multipath terms are uncorrelated
with the direct path, it can be shown
that the resulting ML estimator is
also effective in decorrelating the
multipath terms having a delay
higher than one sampling period,
independently of the source
spectrum.
In particular, by the central limit
theorem, y(ωj, l, w1) can be considered
an independent, zero-mean,
circular Gaussian random variable,
regardless of the original distribution
of the signal and interference, but
characterized by a different variance
ζj² in each subband. In reverberant
fields and in the presence of coloured
sources, such as speech and sonar
targets, these conditions can be
further approached by proper
prewhitening of the highly correlated
components present in both
y0(n) and y1(n). The scaled global
negative log-likelihood of the STBF
output can be written as

L(w1) = Σ_{j=j1}^{j2} [ log(ζj²) + (1/(L ζj²)) Σ_{l=1}^{L} |y(ωj, l, w1)|² ].  (8)
After neglecting irrelevant additive
constants, the optimal weights w1 are
finally found from the concentrated cost

Lc(w1) = Σ_{j=j1}^{j2} log[ (1/L) Σ_{l=1}^{L} |y(ωj, l, w1)|² ],  (9)

ŵ1 = arg min_{w1} Lc(w1).  (10)

The wideband ML-STBF using a
single w1 must optimize
equation (9) by coherently combining
information from all frequencies, which
results in a highly nonlinear
problem. [1],[10]
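The concentrated cost (9) is straightforward to evaluate once the presteered output y0(ωj, l) and the blocking-branch outputs y1(ωj, l) are available. The sketch below (the array shapes, sizes, and names are assumptions for illustration) computes Lc(w1) as the sum over bins of the log of the per-bin sample variance:

```python
import numpy as np

def ml_stbf_cost(w1, y0, y1):
    # Concentrated ML cost Lc(w1) of eq. (9): sum over bins j of the log of
    # the sample variance of y(w_j, l, w1) = y0(w_j, l) + w1^H y1(w_j, l), eq. (7).
    y = y0 + y1 @ w1.conj()                  # (J, L) beamformer outputs
    var = np.mean(np.abs(y) ** 2, axis=1)    # per-bin variance estimate
    return float(np.sum(np.log(var)))

rng = np.random.default_rng(0)
J, L, Mb = 16, 32, 4                         # bins, blocks, blocking-branch size (assumed)
y0 = rng.standard_normal((J, L)) + 1j * rng.standard_normal((J, L))
y1 = rng.standard_normal((J, L, Mb)) + 1j * rng.standard_normal((J, L, Mb))
print(ml_stbf_cost(np.zeros(Mb, dtype=complex), y0, y1))
```

At w1 = 0 the cost reduces to the log variance of the presteered branch alone; adaptation of w1 then trades variance reductions across the subbands, as discussed above.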
5. Properties of the Cost Function
The function Lc(w1) is clearly
nonconvex, due to the presence of
logarithms, and it is not even guaranteed
to be lower bounded or to have a unique
minimizer. Nonconvexity hampers the
use of classical Newton optimization
algorithms when initialized far from the
global minimum. Moreover, if a ζ̂j²(w1)
becomes zero during adaptation,
indicating perfect signal cancellation in
the jth subband, then Lc(w1) → −∞ and the
minimization cannot proceed further. If
multiple bins have ζ̂j²(w1) = 0, then
many local minima may occur and
a descent algorithm may get stuck before
reaching a global optimum. The two
theorems associated with the cost function
show the effect of the lack of information
occurring when L ≤ Mb. Despite
these limitations, two other properties
of equation (9) appear extremely
interesting from both theoretical and
practical viewpoints: scaling
invariance in the frequency domain and
the link with cepstral analysis. A
decisive advantage of the ML-STBF
over cepstral processing lies in the
intrinsic linearity of beamforming
which is highly desirable when dealing
with music, speech or digital
transmission of data.[1]
6. Robustness of the ML-STBF
The cost function given by equation (9)
grows logarithmically, i.e., very slowly,
with respect to each subband error
variance ζ̂j²(w1). This behaviour is
typical of statistically robust estimators
that are able to cope with outliers in the
data or with significant deviations from the
assumed probabilistic model. As a result,
the performance of traditional
frequency-domain MV beamforming
may be quite suboptimal in the
presence of coloured sources and
interferences. The following quadratic
constraint is deduced:

‖w1‖² ≤ (1 − δ/εmax)² ≈ γ².  (11)

Equation (11) theoretically justifies the
common practice of limiting the norm of
w1 in MV beamforming and furnishes a
guideline for properly choosing the
parameter γ². [1],[3]
7. Iterative LS Minimization of
Lc(w1)
Signal nonstationarity and moving
sources require short observation times
and fast numerical convergence to the
optimal solution; therefore, the function
Lc(w1) should be minimized by a second-order,
Newton-like algorithm in order
to be competitive with the MV approach
in real-time applications. Therefore, we
have

w1^[q] = arg min_{w1} Σ_{j=j1}^{j2} [ (1/(L ζ̂j²(w1^[q−1]))) Σ_{l=1}^{L} |y(ωj, l, w1)|² ]  (12)

for q = 1, 2, …, subject to ‖w1^[q]‖² ≤ γ²,
until convergence is achieved.
Equation (12) is a standard quadratic
ridge-regression problem. [1]
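One pass of the iteration (12) can be sketched as a weighted ridge regression. This is a minimal illustration under assumed array shapes; the paper's regularized matrices Rj,µ are approximated here by a single ridge term µI:

```python
import numpy as np

def irls_step(w_prev, y0, y1, gamma2, mu=1e-3):
    # One iteration of eq. (12): LS with per-subband weights 1/zeta_j^2(w_prev),
    # a ridge term, and the norm bound ||w1||^2 <= gamma^2.
    J, L, Mb = y1.shape
    zeta2 = np.mean(np.abs(y0 + y1 @ w_prev.conj()) ** 2, axis=1) + 1e-12
    A = np.zeros((Mb, Mb), dtype=complex)
    b = np.zeros(Mb, dtype=complex)
    for j in range(J):
        Yj = y1[j]                                # (L, Mb) block for bin j
        A += Yj.conj().T @ Yj / zeta2[j]
        b += Yj.conj().T @ y0[j] / zeta2[j]
    u = np.linalg.solve(A + mu * np.eye(Mb), -b)  # ridge-regularized LS solve
    w = np.conj(u)                                # the output uses w1^H, hence the conjugate
    n2 = np.real(w.conj() @ w)
    if n2 > gamma2:                               # enforce the quadratic constraint
        w *= np.sqrt(gamma2 / n2)
    return w

rng = np.random.default_rng(2)
J, L, Mb = 16, 32, 4
y0 = rng.standard_normal((J, L)) + 1j * rng.standard_normal((J, L))
y1 = rng.standard_normal((J, L, Mb)) + 1j * rng.standard_normal((J, L, Mb))
w1 = np.zeros(Mb, dtype=complex)
for q in range(5):
    w1 = irls_step(w1, y0, y1, gamma2=4.0)
print(np.real(w1.conj() @ w1))   # squared norm, bounded by gamma^2 = 4
```

Freezing the subband variances at the previous iterate turns each step into an ordinary quadratic problem, which is what makes the scheme competitive with MV adaptation in speed.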
8. Modified Newton Algorithm
The ML-STBF can be interpreted as a
two-layer perceptron with constrained
weights. Therefore, the algorithms
developed for fast neural network
training should be highly effective.
Descent in the neuron space is adopted in
this work. In this case, the minimization
of Lc(w1) is still converted into an
iterative LS procedure, but using a single
system matrix for all steps; only matrix
sums and products are performed at each
iteration. The modified Newton
algorithm consists of three steps: 1)
data preconditioning, 2) system setup,
and 3) the main loop [1]. The
algorithm is summarized below:
8.1 Algorithm Summary
Step 1) Collect N = LJ snapshots x(n)
for n = 1, …, N.
Step 2) Compute frequency-domain
snapshots x(ωj, l) for l = 1, 2, …, L and
j = 1, 2, …, J using a windowed FFT of length
J applied to L sequential blocks of x(n).
Step 3) For j = j1, …, j2, synthesize
focusing matrices Tj and compute
focused snapshots xf(ωj, l) = Tj x(ωj, l).
Step 4) For j = j1, …, j2, build y0(ωj, l) = w0^H xf(ωj, l)
and y1(ωj, l) = C⊥^H xf(ωj, l).
Step 5) For each j, build the regularized
matrices Rj,µ.
Step 6) Compute the system matrix F
and the vector g.
Step 7) Initialize w1^[0] with all zeros or
small complex random values.
Step 8) For q = 1, 2, …, iterate until
convergence to ŵ1, solving the
LS system at each step.
Step 9) Compute the optimal weight
vector ŵ1 and/or compute the output
sequence y(ωj, l, ŵ1). [1]
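Steps 1-2 (collecting snapshots and forming frequency-domain snapshots via a blocked, windowed FFT) can be sketched as follows; the Hanning window and the array sizes are illustrative assumptions:

```python
import numpy as np

def freq_snapshots(x, J):
    # Split the (M, N) array data into L = N // J sequential blocks and apply a
    # windowed length-J FFT to each (Steps 1-2), yielding frequency-domain
    # snapshots x(w_j, l) arranged as (bin j, block l, sensor).
    M, N = x.shape
    L = N // J
    win = np.hanning(J)
    blocks = x[:, :L * J].reshape(M, L, J)
    X = np.fft.fft(blocks * win, axis=2)     # FFT along the time axis
    return np.transpose(X, (2, 1, 0))

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 1024))           # M = 4 sensors, N = 1024 samples
snap = freq_snapshots(x, J=64)
print(snap.shape)                            # (J, L, M) = (64, 16, 4)
```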
9. Steered Minimum Variance
Beamforming (STMV)
The Steered Minimum Variance (STMV)
beamformer is defined by finding the
weight vector w which minimizes the
beam power given by equation (13),
subject to the constraint that the
processor gain is unity for a broad-band
plane wave in direction θ. The problem
can alternatively be viewed as one of
estimating the dc component of the
STCM steered in direction θ by means
of the minimum variance (MV) approach. In
either case, this technique has the effect
of choosing w to minimize the power
contribution from sources and noise
not propagating from direction θ. The
solution has been derived by several authors,
and the resulting STCM-based spatial
spectral estimate, denoted the STMV
method, is given by

Zstmv(θ) = [1^H R(θ)^{−1} 1]^{−1},  (14)

where 1 is an M × 1 vector of ones. A
finite-time estimate Ẑstmv(θ) of Zstmv(θ)
is obtained by substituting the
estimate R̂(θ) in place of R(θ) in
equation (14). The comparison of the
STMV method and the CSDM-based
minimum variance distortionless
response (MVDR) method is made
possible by expressing R(θ) as a sum of
cross-spectral density matrices.
Substituting the value of R(θ) in
equation (14) gives

Zstmv(θ) = [1^H [Σ_{k=l}^{h} Tk(θ) K(ωk) Tk(θ)^H]^{−1} 1]^{−1}.  (15)

Observe that in the case h = l,
equation (15) can be rewritten, using the
identity Tk(θ)^{−1} = Tk(θ)^H, as

Zstmv(θ) = [1^H Tl(θ) K(ωl)^{−1} Tl(θ)^H 1]^{−1}.  (16)

Note that Tl(θ)^H 1 = Dl(θ), the direction vector
of an arrival at frequency ωl and
direction θ. Hence, equation (16)
becomes

Zstmv(θ) = [Dl(θ)^H K(ωl)^{−1} Dl(θ)]^{−1}.  (17)
Equation (17) is precisely the
MVDR, or maximum likelihood, spatial
spectral estimate. Thus, in the narrowband
case the STMV reduces to the conventional
MVDR method. For broad-band sources,
the MVDR beam power is obtained by
summing narrow-band beam powers
over the band of interest, i.e.,

Zmvdr(θ) = Σ_{k=l}^{h} [Dk(θ)^H K(ωk)^{−1} Dk(θ)]^{−1}.  (18)

With a finite-time observation, an
estimate Ẑmvdr(θ) can be computed by
substituting K̂(ωk) for its true value K(ωk). The
comparison of equations (15) and (18)
reveals the essential difference between
the STMV and MVDR methods for
broad-band signals. Specifically, in
equation (15), cross-spectral density
matrices are coherently averaged prior
to matrix inversion, while in equation
(18) the matrix inversion is applied to
individual narrow-band CSDMs prior to
averaging. While the STMV method is
strictly suboptimal asymptotically, when
only a limited number of data
snapshots is available the coherent
averaging in equation (15) provides a
more statistically stable matrix to invert,
thus facilitating more accurate spatial
spectral estimation. The steered
covariance matrix is estimated as

R(θ) = Σ_{k=l}^{h} Tk(θ) K(ωk) Tk(θ)^H,  (19)

where K(ωk) = E{Y(k) Y(k)^H} is the
conventional unsteered CSDM at
frequency ωk. The above equation
expresses the STCM in the same form
as the coherently focused covariance matrix
proposed by Wang and Kaveh for the
case where all the sources in a field are
in a single group, unresolved by the
conventional delay-and-sum (DS)
beamformer. In the coherent subspace
method, equation (19) is appropriate
only in the single-group case, since just
one focused covariance matrix is formed,
in which each source has a rank-one
characterization. In the STCM methods,
R(θ) is calculated for each steering
direction θ of interest. The need to
compute R(θ) for each θ makes STCM-based
methods more computationally
intensive than coherent subspace
methods. However, it avoids the
problem of source-location bias
resulting from errors made in forming
focusing matrices. The relationship
between K(ωk), k = l, …, h, and R(θ) as
given in equation (19) suggests a
natural way of estimating R(θ) using
finite-time CSDM estimates K̂(ωk). A
common method of forming K̂(ωk) from
discrete-time sensor outputs is to divide
the T-second observation into N
nonoverlapping segments of ΔT
seconds each and then apply the Discrete
Fourier Transform (DFT) to obtain
uncorrelated frequency-domain vectors
Yn(k) for each segment n = 1, …, N. The
cross-spectral density matrix at
frequency ωk is then estimated by
taking

K̂(ωk) = (1/N) Σ_{n=1}^{N} Yn(k) Yn(k)^H.  (20)
Substituting K̂(ωk) in place of its true
value K(ωk) in equation (19) gives an
estimate of the steered covariance matrix:

R̂(θ) = Σ_{k=l}^{h} Tk(θ) K̂(ωk) Tk(θ)^H.  (21)

Note that efficient computation of
R̂(θ) from equation (21) can be
achieved by using the Fast Fourier
Transform (FFT) to obtain the Yn(k)
from the discrete-time sensor outputs. [10]
The various steps used to perform the
STMV method are as follows:
1) Form the estimated cross-spectral
density matrices K̂(ωk)
over the frequency band of
interest, as given in (20).
2) Compute the estimated steered
covariance matrices R̂(θ) for
each steering direction θ, as given
in (21).
3) Compute R̂(θ)^{−1} and form
Ẑstmv(θ) = [1^H R̂(θ)^{−1} 1]^{−1} for
each steering direction θ to
obtain a broad-band spatial
power spectral estimate, as shown
by equation (14). Note that the
source location is estimated by
determining the peak positions of the
spatial power spectral estimate Ẑstmv(θ). [10]
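A compact numerical sketch of the STMV procedure, equations (14)-(19), is given below. The synthetic CSDMs, the uniform-linear-array geometry, and the diagonal (presteering) focusing matrices Tk(θ) are my own illustrative assumptions, not details from the paper:

```python
import numpy as np

def steering(M, d, theta, omega, c=340.0):
    # Narrowband direction vector D_k(theta) for a uniform linear array.
    tau = np.arange(M) * d * np.sin(theta) / c
    return np.exp(-1j * omega * tau)

def stmv_spectrum(K_list, omegas, thetas, M, d):
    # Z_stmv(theta) = [1^H R(theta)^{-1} 1]^{-1}, with R(theta) from eq. (19)
    # and diagonal presteering matrices T_k(theta) so that T_k(theta)^H 1 = D_k(theta).
    ones = np.ones(M, dtype=complex)
    Z = np.empty(len(thetas))
    for i, th in enumerate(thetas):
        R = np.zeros((M, M), dtype=complex)
        for K, w in zip(K_list, omegas):
            T = np.diag(np.conj(steering(M, d, th, w)))
            R += T @ K @ T.conj().T          # coherent averaging across frequency
        Z[i] = 1.0 / np.real(ones @ np.linalg.solve(R, ones))
    return Z

# Synthetic CSDMs: one broadband source at 20 degrees plus white noise.
M, d = 6, 0.5
theta0 = np.deg2rad(20.0)
omegas = 2 * np.pi * np.linspace(300.0, 600.0, 8)
K_list = [np.outer(D, D.conj()) + 0.1 * np.eye(M)
          for D in (steering(M, d, theta0, w) for w in omegas)]

thetas = np.deg2rad(np.linspace(-90.0, 90.0, 181))
Z = stmv_spectrum(K_list, omegas, thetas, M, d)
print(np.rad2deg(thetas[np.argmax(Z)]))      # peak near the true bearing
```

Only at the true bearing do all frequency bins align coherently after presteering, which is why the broadband spectrum peaks there even though each individual bin may be ambiguous.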
10. Conclusion
The ML-STBF and MV-STBF were
tested on 1) far-field point sources, 2)
Mediterranean vertical array data, and
3) a reverberant room. All these tests
demonstrated the higher performance
and robustness of the novel ML-STBF over
the MV-STBF. [1],[4],[5],[6]
The ML-STBF is based on a concentrated
ML cost function in the frequency
domain and is trained by a fast modified
Newton algorithm. The ML cost
function performs a direction-dependent
spectral whitening on the
beamformer output. The computational
costs of the ML-STBF and MV-STBF are
comparable in most cases and are dominated
by the common preprocessing of the
wideband array data. [7],[8],[9],[10]
References
[1] E. D. Di Claudio and R. Parisi, "Robust ML wideband beamforming in reverberant fields," IEEE Transactions on Signal Processing, vol. 51, no. 2, pp. 338-349, Feb. 2003.
[2] E. D. Di Claudio and R. Parisi, "WAVES: Weighted average of signal subspaces for robust wideband direction finding," IEEE Transactions on Signal Processing, vol. 49, pp. 2179-2191, Oct. 2001.
[3] D. H. Johnson and D. E. Dudgeon, Array Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1993.
[4] J. L. Krolik, "The performance of matched-field beamformers with Mediterranean vertical array data," IEEE Transactions on Signal Processing, vol. 44, pp. 2605-2611, Oct. 1996.
[5] G. Xu, H. P. Lin, S. S. Jeng, and W. J. Vogel, "Experimental studies of spatial signature variation at 900 MHz for smart antenna systems," IEEE Transactions on Antennas and Propagation, vol. 46, pp. 953-962, July 1998.
[6] M. Agrawal and S. Prasad, "Robust adaptive beamforming for wideband moving and coherent jammers via uniform linear arrays," IEEE Transactions on Signal Processing, vol. 47, pp. 1267-1275, Aug. 1999.
[7] Q. G. Liu, B. Champagne, and P. Kabal, "A microphone array processing technique for speech enhancement in a reverberant space," Speech Communication, vol. 18, pp. 317-334, 1996.
[8] B. Champagne, S. Bedard, and A. Stephenne, "Performance of time-delay estimation in the presence of room reverberation," IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 148-152, Mar. 1996.
[9] D. N. Swingler, "A low-complexity MVDR beamformer for use with short observation times," IEEE Transactions on Signal Processing, vol. 47, pp. 1154-1160, Apr. 1999.
[10] J. Krolik and D. N. Swingler, "Multiple wideband source location using steered covariance matrices," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, pp. 1481-1494, Oct. 1989.
Electricity Generation by People Walk through
Piezoelectric Shoe: An Analysis
1. Dr. Monika Jain, 2. Ms. Usha Tiwari, 3. Mohit Gupta, 4. Magandeep Singh Bedi
1. Member IEEE, IETE; Professor, Dept. of Electronics & Instrumentation Engg., Galgotias College of Engineering & Technology, Greater Noida, UP, India
2. Assistant Professor, Dept. of Electronics & Instrumentation, Galgotias College of Engineering & Technology, Greater Noida, UP, India
3 & 4. B.Tech 4th-year students, Dept. of Electronics & Instrumentation, Galgotias College of Engineering & Technology, Greater Noida, UP, India
Abstract— In today's electrical power
crisis, there has been an increasing
demand for low-power, portable energy
sources, driven by the development and
mass consumption of portable electronic
devices. Furthermore, portable energy
sources must contend with competitive
market prices, environmental issues, and
other imposed regulations. These
tremendous demands support a great deal
of research in the area of portable energy
generation methods. In this scope,
piezoelectric materials have always been
an attractive choice for energy
generation and storage. In this paper,
different techniques to generate
electricity using piezoelectric crystals
are explored and analysed. An in-depth
study and analysis describing the use of
piezoelectric polymers to best harvest
energy from people walking, together
with the fabrication of a smart shoe
capable of generating and accumulating
that energy, has been presented.
Keywords— Energy harvesting, PZT,
uninterrupted power supplies.
I. INTRODUCTION
Piezoelectric generators are based on
the piezoelectric effect, i.e., the ability of
certain materials to create an electrical
potential in response to mechanical
changes. In real-world applications, a
piezoelectric material that is compressed,
expanded, or otherwise changing shape
will output a certain voltage. This effect
also works in reverse, in the sense that
putting a charge through the material will
cause it to change shape or undergo
some mechanical stress. These materials
are useful in a variety of ways. Certain
piezoelectric materials handle high
voltage extremely well and are useful in
transformers and other electrical
components. Piezoelectric crystals are a
boon to the sensor technology field, since
they can be used to make motors, to reduce
vibrations in sensitive environments, as
energy collectors, and in many more
applications. In today's power-crisis world,
one of the most interesting areas is energy
collection and generation. In this paper, a
cheap, smart, yet reliable mechanism to
generate energy, capable of charging
phones and MP3 players, is explored and
analysed. An interesting methodology of
power generation through the walking
steps of a human being is reviewed and
presented here. The sole of a shoe can be
constructed of piezoelectric materials, so
that every step a person takes generates
electricity. The electricity generated
through this smart shoe-sole mechanism
can then be stored in a battery or used
immediately in personal electronic
devices.
II. LITERATURE REVIEW
The most common methodologies for
shoe power generators include dielectric
elastomers [1] and piezoelectric ceramics
[2],[3]. The elastomer demonstrated
significant power output, but it required a
large bias (2 kV), and its heavy
construction is likely to negatively affect
the user experience. The power-harvesting
shoe reported in [2] and [3] uses
piezoelectric ceramic bimorphs for power
harvesting. As piezoelectric materials were
employed, no bias voltage was needed.
However, a complex PZT/metal bimorph
was required, and the power output after
dc/dc conversion and regulation was low
(<1 mW) [2]. The schematic of the
microstructured piezoelectric polymer film
used for power generation is shown in
Fig. 1.

Fig. 1: Microstructured piezoelectric polymer film
To increase the transducer power output,
the film is rolled into a 1-cm-thick stack of
approximately 120 layers. The generated
charge per step is Q ≈ e33 F h / (Y t),
where e33 is the piezoelectric coefficient for
compression, F = mg is the force exerted
by the foot, determined by the mass of the
user m and the gravitational constant
g = 9.81 m/s², Y is the Young's modulus of
the film, h is the total transducer height, t is
the film thickness, and N is the number of
film layers in the transducer [6]. The
piezoelectric polymer power generator and
conversion circuit provide over 2 mW of
regulated power at 4.5 V. The transducer is
low cost, ecological, and soft, providing
suitable shock absorption inside the heel.
The design of electromagnetic generators
that can be integrated within shoe soles has
also been described. In this way, parasitic
energy expended by a person while
walking can be tapped and used to power
portable electronic equipment. Designs are
based on discrete permanent magnets and
copper wire coils, and performance is
intended to be improved by applying
micro-fabrication technologies. The
proposed approach is good in the sense that
its voltage levels are comparable with
those of the piezoelectric generator;
however, its complex circuitry is a
constraint. Vibration-based generators
using three types of electromechanical
transducers, electromagnetic [8],
electrostatic [9], and piezoelectric [10],[11],
have also been presented.
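The per-step charge formula Q ≈ e33·F·h/(Y·t) quoted above can be checked with a quick back-of-envelope computation; every numerical value below is an illustrative assumption, not a measurement from the paper:

```python
# Back-of-envelope evaluation of Q = e33 * F * h / (Y * t);
# all material and user values are assumed for illustration only.
e33 = 30e-3          # piezoelectric stress coefficient, C/m^2 (assumed)
m, g = 70.0, 9.81    # user mass in kg (assumed) and gravitational constant, m/s^2
F = m * g            # force exerted by the foot, N
Y = 3e9              # Young's modulus of the polymer film, Pa (assumed)
h = 1e-2             # total transducer height: the 1-cm stack
t = h / 120          # film thickness for ~120 layers, m

Q = e33 * F * h / (Y * t)
print(Q)             # charge per step, in coulombs
```

With these assumed values the result lands in the sub-microcoulomb range per step, which is consistent with the milliwatt-scale regulated power quoted above once the step rate and conversion losses are accounted for.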
In all of these methods, vibrations consist
of a traveling wave in or on a solid
material, and it is often not possible to find
a relative movement within the reach of a
small generator. Therefore, one has to
couple the vibration movement to the
generator by means of the inertia of a
seismic mass.
Energy Storage Density Comparison

Type              Practical Maximum   Aggressive Maximum
Piezoelectric     35.4                335
Electrostatic     4                   44
Electromagnetic   24.8                400
There are two piezoelectric effects that
can be used for technological
applications: the direct piezoelectric effect,
which describes the ability of a given
material to transform mechanical strain
into electrical signals, and the converse
effect, which is the ability to convert an
applied electrical solicitation into
mechanical energy. The direct
piezoelectric effect is more suitable for
sensor applications, whereas the converse
piezoelectric effect is most often
required for actuator applications [12].
High-performance films prepared by
researchers [14],[15] are also explored.
There, the electromechanical properties of the
film were improved by a treatment that
consists of pressing, stretching, and poling
at a high temperature [14].
III. CONCLUSION
In this paper, an analysis of electricity
generation for low-power devices is
presented. Different methodologies for the
generation of electricity are reviewed. We
found that some of the methodologies
are not feasible for real-time portable
charging because they require too much
circuitry, while others are feasible but
remain at the analysis stage. We conclude
that piezoelectric generators implanted in
shoes can provide a great achievement if a
collaborative effort is made to bring to
market a commercial battery charger for
low-power household devices, powered
simply by a person's walking steps.
REFERENCES
[1] Roy Kornbluh, "Power from plastic:
how electroactive polymer artificial
muscles will improve portable power
generation in the 21st century military,"
Presentation [Online]. Available:
http://www.dtic.mil/ndia/2003triservice/korn.ppt
[2] John Kymissis, et al., "Parasitic power
harvesting in shoes," in Proc. of the 2nd
IEEE Int. Conf. on Wearable Computing,
Pittsburgh, PA, pp. 132-139, 19-20 Oct.
1998.
[3] S. Shenck and J. Paradiso, “Energy
scavenging with shoe-mounted
piezoelectrics”, IEEE Micro, Vol. 21, pp.
30-42, May-June, 2001.
[4] P. Miao, et al., "Micro-machined
variable capacitors for power
generation," in Proc. Electrostatics,
Edinburgh, UK, 23-27 Mar. 2003.
[5] Mitcheson, P.D.; Green, T.C.;
Yeatman, E.M.; Holmes, A.S.,
"Architectures for vibration-driven
micropower generators," Journal of
Microelectromechanical Systems, vol.13,
no.3, pp. 429-440, June 2004.
[6] Ville Kaajakari, “Practical MEMS”,
Small Gear Publishing, 2009.
[7] M. Duffy and D. Carroll,
"Electromagnetic generators for power
harvesting," in Proc. 35th Annual IEEE Power
Electronics Specialists Conference,
Aachen, Germany, 2004, pp. 2075-2081.
[8]M. El-hami, P. Glynne-Jones, M.
White, M. Hill, S. Beeby, E. James, D.
Brown, and N. Ross, “Design and
fabrication of a new vibration-based
electromechanical power generator,”
Sens. Actuators A, Phys., vol. 92, no. 1–3,
pp. 335–342, Aug. 2001.
[9] M. Miyazaki, H. Tanaka, G. Ono, T.
Nagano, N. Ohkubo, T. Kawahara, and K.
Yano, “Electric-energy generation using
variablecapacitive resonator for power-
free LSI,” in Proc. ISLPED, 2003, pp.
193–198.
[10] C. Keawboonchuay and T. G. Engel,
“Maximum power generation in a
piezoelectric pulse generator,” IEEE
Trans. Plasma Sci., vol. 31, no. 1, pp. 123–
128, Feb. 2003.
[11] J. Yang, Z. Chen, and Y. Hu, “An
exact analysis of a rectangular plate
piezoelectric generator,” IEEE Trans.
Ultrason., Ferroelectr., Freq. Control, vol.
54, no. 1, pp. 190–195, Jan. 2007.
[12] T. Sterken, P. Fiorini, K. Baert, R.
Puers, and G. Borghs, "An electret-based
electrostatic micro-generator," in Proc.
Transducers, 2003, pp. 1291-1294.
[14] V. Sencadas, R. Gregorio Filho, and
S. Lanceros-Mendez, “Processing and
characterization of a novel nonporous
poly(vinilidene fluoride) films in the β
phase,” J. Non-Cryst. Solids, vol. 352, no.
21/22, pp. 2226–2229, Jul. 2006.
[15] S. Lanceros-Mendez, V. Sencadas,
and R. Gregorio Filho, “A new
electroactive beta PVDF and method for
preparing it,” Patent PT103 318, Jul. 19,
2006.
Abstract—This paper describes a non-parametric approach
for spectral analysis using three different window
functions with three power spectrum estimation
techniques. The window functions used are the Hamming,
Blackman, and Modified Bartlett-Hanning windows,
applied to power spectral estimation with the Periodogram,
Welch, and autocorrelation methods. The role of these
different window functions has been analyzed in terms
of spectral leakage and scalloping loss, and the
objective of using three different techniques for power
spectral density estimation is to find the bandwidth (BW)
of the signal. This work has been further extended to
spectral analysis of voice signals to detect the fundamental
frequency of the speaker. Frequency-domain
cepstrum analysis of voiced speech segments is also
used; this is the conventional method of fundamental peak
picking, i.e., of the fundamental frequency or pitch. Voice
segments of different speakers with a minimum 30 dB
SNR threshold have been taken, and the cepstrum has
been analyzed using the different window functions.
Index Terms— Autocorrelation, Cepstrum, MBH,
periodogram, PSD, Pitch, spectrum, welch.
I. INTRODUCTION
Power Spectral Estimation is the method of
determining the power spectral density (PSD) of
a random process, which provides information about
the structure of the spectrum. The purpose of estimating
the spectral density is to detect any periodicities in
the data by observing peaks at the frequencies
corresponding to these periodicities. The two approaches
to spectrum estimation are parametric and
non-parametric methods. In the parametric approach, the
task is to estimate the parameters of a model that
describes the stochastic process, while non-parametric
estimation is based on the assumption that the
observed samples are wide-sense stationary with zero
mean [3]. The spectral analysis of a noise-like
random signal is therefore usually carried out by
non-parametric methods such as the Periodogram and
Welch methods. To analyze these methods, we must first
consider the role of the different window functions.
In spectral analysis, the discontinuity resulting from the
periodic extension of the signal gives rise to leakage
at the end points, and high side-lobe levels result in
false frequency detection; both effects are reduced by the
use of window functions. Bin crossover results in a
signal detection loss (scalloping loss) due to the reduced
signal level at frequencies between the bin centers;
the window function modifies the frequency response
so as to reduce these bin-crossover losses.
Pitch is the fundamental parameter of speech [11].
Pitch detection is one of the important tasks of speech
signal processing [5],[7],[9]. The pitch, i.e., the
fundamental frequency of voice signals, varies from
40 Hz to 600 Hz. Accurate representation of the
voiced/unvoiced character of speech plays an important
role in Voice Activity Detection (VAD), coding, synthesis,
speech training, speech and speaker recognition systems,
and vocoders [6],[8]. To accurately detect and estimate
the fundamental frequency of a speaker, we use
cepstrum [5] analysis, also called the spectrum of the spectrum.
It is used to separate the excitation signal (pitch) and the
transfer function (voice quality). One algorithm that shows
good performance for quasi-periodic signals is the
cepstrum (CEP) algorithm. However, its ability to separate
the source signal (which conveys the pitch information)
from the vocal tract response fails wherever the speech
frame cannot be regarded as just the result of a linear
convolution between the two components, as occurs in
transitions or non-stationary speech segments, or
Spectral and Cepstral Analysis Using the Modified
Bartlett-Hanning Window
Rohit Pandey1, Rohit Kumar Agrawal1, Sneha Shree1
1Department of Electronics & Communication Engineering, Jaypee University of Engineering & Technology, Guna, MP, India
[email protected], [email protected]
when the recorded speech signal includes additive
noise [5],[7]
.
II. WINDOW FUNCTIONS
The following window functions are used for the spectrum and
cepstrum analysis.

The Modified Bartlett-Hanning (MBH) window is extended to the form [1]

w(t, α) = α − (4α − 2)|t| + (1 − α) cos(2πt);  |t| ≤ 0.5, 0.5 ≤ α < 1.88,  (1)

where α is an index parameter.

Blackman window:
w(n) = 0.42 − 0.50 cos(2πn/(M−1)) + 0.08 cos(4πn/(M−1)).  (2)

Hamming window:
w(n) = 0.54 − 0.46 cos(2πn/(M−1)),  (3)

where M is the window length and n = 0, 1, …, M−1 indexes the samples.
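The three windows of equations (1)-(3) can be generated directly; the sketch below assumes the index parameter α = 0.62, which reproduces the standard Bartlett-Hanning coefficients (0.62, 0.48, 0.38):

```python
import numpy as np

def mbh_window(M, alpha=0.62):
    # Modified Bartlett-Hanning window of eq. (1); alpha = 0.62 yields the
    # standard Bartlett-Hanning coefficients (0.62, 0.48, 0.38).
    t = np.linspace(-0.5, 0.5, M)
    return alpha - (4 * alpha - 2) * np.abs(t) + (1 - alpha) * np.cos(2 * np.pi * t)

def blackman_window(M):
    n = np.arange(M)   # eq. (2)
    return 0.42 - 0.50 * np.cos(2 * np.pi * n / (M - 1)) + 0.08 * np.cos(4 * np.pi * n / (M - 1))

def hamming_window(M):
    n = np.arange(M)   # eq. (3)
    return 0.54 - 0.46 * np.cos(2 * np.pi * n / (M - 1))

M = 65
for name, w in [("MBH", mbh_window(M)), ("Blackman", blackman_window(M)),
                ("Hamming", hamming_window(M))]:
    print(name, w[0], w.max())   # endpoint value and peak of each window
```

Note how the Hamming window's nonzero endpoints (0.08) trade higher spectral leakage at the frame edges for a narrower main lobe than the MBH and Blackman windows, whose endpoints fall to zero.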
III. SPECTRAL ESTIMATION TECHNIQUES
In the Periodogram method [3], the sequence x[n] is first made
finite by applying a window function. The windowed sequence is
then autocorrelated, and the periodogram is calculated as

I_N(e^{jω}) = (1/N) |Σ_{n=0}^{N−1} x[n] e^{−jωn}|²,  (4)

where N is the length of the finite sequence.

In the Welch method [2], the data are first sectioned according to
the sequence length. For each section x_r[n] of length m, a modified
periodogram [3] is calculated,

I_r(e^{jω}) = (1/m) |Σ_{n=0}^{m−1} x_r[n] e^{−jωn}|²,  (5)

and the periodograms of all sections are averaged.

The autocorrelation method [3] extracts the similarities between
the signals, given by

φ_xx[m] = Σ_{n=0}^{N−m−1} x[n] x[n+m].  (6)

The sequence x[n] is windowed and autocorrelated, and the PSD is
calculated as

Φ_xx(e^{jω}) = Σ_{m=0}^{N−1} φ_xx[m] e^{−jωm}.  (7)
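Equations (4)-(5) can be realized directly in code. The sketch below is a hypothetical setup using the two-tone test signal of Section VI plus white noise, with a Hamming data window; the window-power normalization is my own choice for a consistent PSD scale:

```python
import numpy as np

fs = 1000.0
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(0)
# Two-tone test signal of Section VI plus white noise.
x = 2.0 * np.sin(2 * np.pi * 150 * t) + 1.5 * np.sin(2 * np.pi * 175 * t)
x = x + 0.3 * rng.standard_normal(len(t))

def periodogram(seg, w, fs):
    # Eq. (4), normalized by the window power for a consistent PSD scale.
    X = np.fft.rfft(seg * w)
    psd = np.abs(X) ** 2 / (fs * np.sum(w ** 2))
    return np.fft.rfftfreq(len(seg), 1 / fs), psd

def welch(x, w, fs):
    # Eq. (5): average the modified periodograms of nonoverlapping sections.
    m = len(w)
    psds = [periodogram(x[s:s + m], w, fs)[1] for s in range(0, len(x) - m + 1, m)]
    return np.fft.rfftfreq(m, 1 / fs), np.mean(psds, axis=0)

f, p = periodogram(x, np.hamming(len(x)), fs)
fw, pw = welch(x, np.hamming(250), fs)
print(f[np.argmax(p)], fw[np.argmax(pw)])   # strongest component, near 150 Hz
```

The full-length periodogram resolves both tones but is noisy; the sectioned Welch average smooths the estimate at the cost of coarser frequency resolution, which is the tradeoff examined in the results below.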
IV. CEPSTRUM ANALYSIS
The cepstrum is a frequency-domain analysis of voiced
speech segments. The real cepstrum is the inverse Fourier
transform of the log magnitude of the Fourier transform of
the signal. The algorithm is: signal → FT → abs() → log → IFT.
It is similar to spectral analysis of the signal, except that
for the cepstrum we take the logarithm of the spectrum [10].
This is because speech signals are quasi-periodic in nature,
and spectrum analysis alone is not very useful for extracting
the characteristic features of voice signals. While
calculating the cepstrum we have taken speech samples of
25 ms with a sampling frequency fs of 8000 Hz.
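The FT → abs() → log → IFT chain is only a few lines of NumPy (a generic sketch; the small eps guarding log(0) at spectral nulls is an added assumption):

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """Real cepstrum: inverse FFT of the log magnitude of the FFT.
    eps guards against log(0) at exact spectral nulls."""
    spectrum = np.fft.fft(x)
    log_mag = np.log(np.abs(spectrum) + eps)
    return np.fft.ifft(log_mag).real
```

For a unit impulse the magnitude spectrum is flat (|X| = 1), so its real cepstrum is essentially zero everywhere.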
V. APPROACH FOR CEPSTRUM ANALYSIS
The processing chain for cepstrum analysis is:

Voice Signal → LPF → windowing → FFT → abs → log → IFFT → windowing* → Normalization → Smooth cepstrum

This procedure of smoothing the composite log spectrum to
obtain the log spectral envelope is referred to as cepstral
smoothing.
VI. SIMULATION AND RESULTS
A sinusoidal signal with two frequency components and two
amplitudes has been taken as the input signal:
A·sin(2πft), with A = [2, 1.5] and f = [150, 175] Hz.
To estimate the spectrum of a noisy signal, a random noise
sequence generated in MATLAB is added to the sinusoidal
signal.
Fig. 1 (Periodogram method)
TABLE-I (bandwidth of periodogram [4])
Fig. 2 (Welch method)
TABLE-II (bandwidth of Welch [4])
Fig. 3 (Autocorrelation method)
TABLE-III (bandwidth of auto-correlation [4])
For cepstrum analysis we have taken voice samples of two
speakers, of duration 25 ms each, and passed them through a
low-pass filter with cut-off frequency 0.15π (normalized
radian frequency). The low-pass filter eliminates the
high-frequency additive noise; the cepstrum of the filtered
voice samples is then analyzed for pitch detection.
Windowing is done after the IFFT to smooth the cepstrum and
detect clear cepstral peaks.
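As a hypothetical end-to-end illustration (not the authors' code), the pitch of a synthetic harmonic signal at fs = 8000 Hz can be read off the dominant cepstral peak; the 80-400 Hz search range is an assumption about typical voice pitch:

```python
import numpy as np

def cepstral_pitch(x, fs=8000, fmin=80, fmax=400):
    """Estimate pitch as the quefrency of the strongest cepstral peak
    inside the assumed [fmin, fmax] pitch range."""
    spectrum = np.abs(np.fft.fft(x * np.hamming(len(x))))
    cep = np.fft.ifft(np.log(spectrum + 1e-12)).real
    lo, hi = int(fs / fmax), int(fs / fmin)   # quefrency range in samples
    peak = lo + int(np.argmax(cep[lo:hi]))
    return fs / peak

# Synthetic "voiced" signal: 100 Hz fundamental with 10 harmonics.
fs, f0 = 8000, 100
n = np.arange(2048)
x = sum(np.sin(2 * np.pi * k * f0 * n / fs) / k for k in range(1, 11))
```

The harmonic comb spaced f0 apart in the log spectrum folds into a cepstral peak near quefrency fs/f0 samples, so `cepstral_pitch(x)` should land close to 100 Hz.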
Fig. 4 (Cepstrum Analysis)
Fig. 5 (Smooth Cepstrum)
VII. CONCLUSION
The aim is to detect and estimate the signal [3]. For the
identification of two different frequency components in the
presence of noise, different threshold levels have been
taken, starting from -3 dB [4]. In the periodogram method
(Fig. 1), thresholds of -3 dB, -6 dB and -15 dB are taken;
it is observed from the results (TABLE-I) that at -3 dB the
two sinusoidal peaks are not detected, and beyond -15 dB
noise is detected. The same holds for the autocorrelation
PSD method: at -3 dB (TABLE-III) no peaks are detected, but
the signals can be detected down to -20 dB, better than the
periodogram method. In the Welch method (TABLE-II, Fig. 2),
however, detection at -3 dB is possible, i.e. -3 dB is the
minimum threshold needed to detect the signal. As the
Fourier transform of a sinusoidal signal is an impulse, in
the Welch method (TABLE-II) the MBH window suppresses the
side-lobe levels more and makes the main lobe narrower
(tending to an impulse, Fig. 2) than the Hamming and
Blackman windows.
This comparison shows that the Welch method gives better
results than the periodogram and autocorrelation methods,
and that using the Welch method with the MBH window gives
more accurate results than with the Hamming and Blackman
windows.
Taking "HELLO" as an iterative voice sample for two
speakers, we have estimated the average pitch. It is
observed that the error in pitch detection is larger in the
cepstrum (Fig. 4) than in the smooth cepstrum (Fig. 5).
Considering 0.4 as the threshold level, the periodicity in
the smooth cepstrum is more distinguishable, and hence the
pitch can easily be detected. This can further be used in
voice recognition systems to minimize the false acceptance
rate (FAR) and false rejection rate (FRR).
ACKNOWLEDGMENT
The authors acknowledge the valuable guidance of
Prof. Rajiv Saxena, which helped to improve the quality of
the paper.
REFERENCES
[1] J. K. Gautam, A. Kumar, and R. Saxena, "On the Modified Bartlett-Hanning Window (Family)," IEEE Transactions on Signal Processing, vol. 44, no. 8, pp. 2098-2102, August 1996.
[2] P. D. Welch, "The Use of FFT for the Estimation of Power Spectra: A Method Based on Time Averaging Over Short, Modified Periodograms," IEEE Transactions on Audio and Electroacoustics, vol. AU-15, no. 2, pp. 70-73, June 1967.
[3] D. J. DeFatta, J. G. Lucas, and W. S. Hodgkiss, Digital Signal Processing.
[4] F. J. Harris, "On the use of windows for harmonic analysis with the discrete Fourier transform," Proc. IEEE, vol. 66, pp. 51-83, Jan. 1978.
[5] D. G. Childers, D. P. Skinner, and R. C. Kemerait, "The Cepstrum: A Guide to Processing," Proceedings of the IEEE, vol. 65, no. 10, pp. 1428-1443, October 1977.
[6] J. W. Picone, "Signal Modeling Techniques in Speech Recognition," Proceedings of the IEEE, vol. 81, no. 9, pp. 1215-1247, September 1993.
[7] A. M. Noll, "Cepstrum pitch determination," J. Acoust. Soc. Am., vol. 41, no. 2, pp. 293-309, Feb. 1967.
[8] "A Tutorial on Text-Independent Speaker Verification," EURASIP Journal on Applied Signal Processing, 2004:4, pp. 430-451.
[9] http://en.wikipedia.org/wiki/Cepstrum
[10] J. L. Flanagan, "Spectrum Analysis in Speech Coding," IEEE Transactions on Audio and Electroacoustics, vol. AU-15, no. 2, pp. 66-69, June 1967.
[11] http://en.wikipedia.org/wiki/Human_voice
A 3D APPROACH TO FACE-EXPRESSION RECOGNITION
Akshay Gupta, Ananya Misra, Hridesh Verma, Garima Chandel (Member, IEEE)
ABES Institute of Technology, Ghaziabad-201009, India.
[email protected] , [email protected] , [email protected] , [email protected]
ABSTRACT: Face recognition has been an active research area for the last couple of decades. With the advancement of 3D imaging technology, 3D face recognition emerges as an alternative to overcome the problems inherent to 2D face recognition, i.e. sensitivity to illumination conditions and positions of a subject. But 3D face recognition still needs to tackle the problem of deformation of facial geometry that results from the expression changes of a subject. To deal with this issue, a 3D face recognition framework is proposed in this paper. It is a combination of three subsystems: an expression recognition system, an expressional face recognition system and a neutral face recognition system. A system for the recognition of faces with one type of expression (smile) and neutral faces was implemented and tested on a database of 30 subjects. The results proved the feasibility of this framework.
Index Terms- face recognition, databases, neutral face, smiling face, image acquisition.
I. INTRODUCTION
Most face recognition attempts made so far use 2D intensity images as the data format for processing. In spite of the success achieved by 2D recognition methods, certain problems still exist. 2D face images depend not only on the face of a subject, but also on imaging factors, such as the environmental illumination and the orientation of the subject. These variable factors can cause the failure of a 2D face recognition system. With the advancement of 3D imaging technology, more attention is given to 3D face recognition, which is robust with respect to illumination variation and pose orientation. In [1], Bowyer et al. provide a survey of 3D face recognition technology. Most 3D face recognition systems treat the 3D face surface as a rigid surface. But actually, the face surface is deformed by the different expressions of the subject, which causes the failure of the systems that treat the face as a rigid surface. The involvement of facial
expression has become a big challenge in 3D face recognition systems. In this paper, we propose an approach to tackle this problem, through the integration of expression recognition and face recognition in a system.
II. EXPRESSION AND FACE RECOGNITION
From the psychological point of view, it is still not known whether facial expression recognition information aids the recognition of faces by human beings. It is found that people are slower in identifying happy and angry faces than they are in identifying faces with a neutral expression. The proposed framework involves an initial assessment of the expression of an unknown face, and uses that assessment to assist the progress of its recognition. The incoming 3D range image is processed by an expression recognition system to find the most appropriate expression label for it. The expression labels include the six prototypical expressions of the faces, which are happiness, sadness, anger, fear, surprise and disgust, plus the neutral expression. According to the different expressions, a matching face recognition system is then applied. If the expression is recognized as neutral, then the incoming 3D range image is directly passed to the neutral expression face recognition system, which uses the features of the probe image to directly match those of the gallery images, which are all neutral, to get the closest match. If the expression found is not neutral, then for each of the six expressions, a separate face recognition subsystem should be used. The system will find the right face through modelling the variations of the face features between the neutral face and the face with expression. Figure 1 shows a simplified version of this framework. This simplified diagram only deals with the smiling expression, which is the one most commonly displayed by people publicly.
III. DATA ACQUISITION AND PROCESSING
To test the approach proposed in this model, a database, which includes 30 subjects, was built. In
this database, we test the different processing of the two most common expressions, i.e., smiling versus neutral. Each subject participated in two sessions of the data acquisition process, which took place in two different days. In each session, two 3D scans were acquired with a Polhemus Fastscan scanner. One was a neutral expression; the other was a happy (smiling) expression. The resulting database contains 60 3D neutral scans and 60 3D smiling scans of 30 subjects.
Figure 1 - Simplified framework of 3D face recognition
The left image in Figure 2 shows an example of the 3D scans obtained using this scanner, the right image is the 2.5D range image used in the algorithm.
Figure 2- 3D surface (left) and a mesh plot of the converted range image (right)
IV. EXPRESSION RECOGNITION
The face expression is a basic mode of nonverbal communication among people. In [5], Ekman and Friesen proposed six primary emotions. Each possesses a distinctive content together with a unique facial expression. These six emotions are happiness, sadness, fear, disgust, surprise and anger. Together with the neutral expression, they also form the seven basic prototypical facial expressions.
In our experiment, we aim to recognize social smiles, which were posed by each subject. Smiling is
generated by contraction of the zygomatic major muscle. This muscle lifts the corner of the mouth obliquely upwards and laterally, producing a characteristic “smiling expression”. So, the most distinctive features associated with the smile are the bulging of the cheek muscle and the uplift of the corner of the mouth, as shown in Figure 3. The following steps are followed to extract six representative features for the smiling expression:-
1. An algorithm is developed to obtain the coordinates of five characteristic points in the face range image as shown in Figure 3. A and D are the extreme points of the base of the nose. B and E are the points defined by the corners of the mouth. C is in the middle of the lower lip.
Figure 3- Illustration of features of a smiling face versus a neutral face
2. The first feature is the width of the mouth, BE, normalized by the length of AD. Obviously, while smiling the mouth becomes wider. The first feature is represented by mw.
3. The second feature is the depth of the mouth (the difference between the Z coordinates of points B and C, and of points E and C), normalized by the height of the nose, to capture the fact that the smiling expression pulls back the mouth. This second feature is represented by md.
4. The third feature is the uplift of the corners of the mouth compared with the middle of the lower lip (d1 and d2, as shown in the figure), normalized by the difference of the Y coordinates of points A and B, and of points D and E, respectively, and represented by lc.
5. The fourth feature is the angle of line AB and line DE with the central vertical profile, represented by ag.
6. The last two features are extracted from the semicircular areas shown, which are defined by using line AB and line DE as diameters. The histograms of the range (Z coordinates) of all the points within these two semicircles are calculated.
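The first feature, mw, reduces to a ratio of two landmark distances. A minimal sketch with made-up coordinates (all names and values below are hypothetical, for illustration only):

```python
import numpy as np

def mouth_width_feature(A, B, D, E):
    """Feature mw: mouth width |BE| normalized by nose-base width |AD|.
    The depth and uplift features additionally use point C (mid lower lip)."""
    A, B, D, E = (np.asarray(p, dtype=float) for p in (A, B, D, E))
    return np.linalg.norm(E - B) / np.linalg.norm(D - A)

# Hypothetical landmark coordinates (x, y, z) in arbitrary units:
A, D = (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)      # extreme points of the nose base
B, E = (-1.5, -1.0, 0.0), (1.5, -1.0, 0.0)    # corners of the mouth
```

Since a smile widens the mouth, mw grows relative to the neutral face of the same subject, which is what makes the ratio discriminative.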
Figure 4 shows the histograms for the smiling and the neutral faces of the subject in Figure 3. The two figures in the first row are the histograms of the range
values for the left cheek and right cheek of the neutral face image; the two figures in the second row are the histograms of the range values for the left cheek and right cheek of the smiling face image.
Figure 4- Histogram of range of cheeks (L &R) for neutral (top row), and smiling (bottom row) face.
From the above figures, we can see that the range histograms of the neutral and smiling expressions are different. The smiling face tends to have large values at the high end of the histogram because of the bulge of the cheek muscle. On the other hand, a neutral face has large values at the low end of the histogram distribution. Therefore two features can be obtained from the histogram.
One is called the ‘histogram ratio’, represented by hr, the other is called the ‘histogram maximum’, represented by hm.
hr = (h6 + h7 + h8 + h9 + h10) / (h1 + h2 + h3 + h4 + h5)
hm = i, where i = arg max_i h(i)
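The two histogram features are then a ratio of the upper five bins to the lower five, and the index of the tallest bin — a small sketch, assuming a 10-bin range histogram as in the figures (bins indexed 1..10 in the text, 0..9 in the array):

```python
import numpy as np

def histogram_features(h):
    """h: 10-bin range histogram (h[0] = h1, ..., h[9] = h10).
    hr = (h6+...+h10)/(h1+...+h5); hm = 1-based index of the maximum bin."""
    h = np.asarray(h, dtype=float)
    hr = h[5:].sum() / h[:5].sum()
    hm = int(np.argmax(h)) + 1
    return hr, hm
```

A smiling cheek bulges, pushing mass to the high bins and raising hr; a neutral cheek concentrates mass in the low bins.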
After the six features have been extracted, this becomes a general classification problem. Two
pattern classification methods are applied to recognize the expression of the incoming faces. The first method used is a linear discriminant (LDA) classifier, which seeks the best set of features to separate the classes. The other method used is a support vector machine (SVM).
V. 3D FACE RECOGNITION
A. Neutral face recognition
In our earlier research work, we have found that the
central vertical profile and the contour are both discriminant features for every person. Therefore, for neutral face recognition, the results of central vertical profile matching and contour matching are combined. The combination of the two classifiers improves the overall performance significantly. The final similarity score for the probe image is the product of ranks for each of the two classifiers (based on the central vertical profile and contour). The image with the smallest score in the gallery will be chosen as the matching face for the probe image.
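The product-of-ranks fusion described above can be sketched generically (an illustration, assuming both matchers return distances where smaller means better):

```python
import numpy as np

def ranks(distances):
    """Rank of each gallery entry (1 = best match, i.e. smallest distance)."""
    order = np.argsort(distances)
    r = np.empty(len(distances), dtype=int)
    r[order] = np.arange(1, len(distances) + 1)
    return r

def fuse_by_rank_product(profile_dist, contour_dist):
    """Final similarity score = product of the two classifiers' ranks;
    the gallery image with the smallest score is chosen as the match."""
    score = ranks(profile_dist) * ranks(contour_dist)
    return int(np.argmin(score)), score
```

Rank-product fusion ignores the raw score scales of the two matchers, so neither classifier can dominate just because its distances are numerically larger.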
B. Smiling face recognition
For the recognition of smiling faces we have adopted the probabilistic subspace method proposed by B. Moghaddam et al. [8,9]. It is an unsupervised technique for visual learning, based on density estimation in high-dimensional spaces using eigen-decomposition. Using the probabilistic subspace method, a multi-class classification problem can be converted into a binary classification problem. In the experiment for smiling face recognition, because of the limited number of subjects (30), the central vertical profile and the contour are not used directly as vectors in a high-dimensional subspace. Instead, they are down-sampled to a dimension of 17. The dimension of the difference-in-feature space is set to 10, which contains approximately 97% of the total variance. The dimension of the difference-from-feature space is 7.
In this case also, the results of central vertical profile matching and contour matching are combined, improving the overall performance. The final similarity score for the probe image is the product of ranks for each of the two classifiers. The image with the smallest score in the gallery will be chosen as the matching face for the probe image.
VI. EXPERIMENTS AND RESULTS
One gallery and three probe databases were used for evaluation. The gallery database has 30 neutral faces, one for each subject, recorded in the first data acquisition session. Three probe sets are formed as follows:
Probe set 1: 30 neutral faces acquired in the second session.
Probe set 2: 30 smiling faces acquired in the second session.
Probe set 3: 60 faces (probe set 1 and probe set 2).
Experiment 1: Testing the expression recognition module

The leave-one-out cross validation method is used to test the expression recognition classifier. Each time, the faces collected from 29 subjects in both data acquisition sessions are used to train the classifier, and the four faces of the remaining subject, collected in both sessions, are used to test it. Two classifiers are used. One is the linear discriminant (LDA) classifier; the other is a support vector machine (SVM) classifier. LDA tries to find the subspace that best discriminates different classes by maximizing the between-class scatter matrix while minimizing the within-class scatter matrix in the projective subspace. The support vector machine is a relatively new technology for classification. It relies on pre-processing the data to represent patterns in a dimension typically much higher than the original feature space. With an appropriate nonlinear mapping to a sufficiently high dimension, data from two categories can always be separated by a hyperplane.

Table 1 - Expression recognition results
Method: LDA | SVM
Expression recognition rate (%): 90.8 | 92.5

Experiment 2: Testing the neutral and smiling recognition modules separately

In the first two sub-experiments, probe faces are directly fed to the neutral face recognition module. In the third sub-experiment, leave-one-out cross validation is used to verify the performance of the smiling face recognition module.
a. Neutral face recognition: probe set 1 (neutral face recognition module used).
b. Neutral face recognition: probe set 2 (neutral face recognition module used).
c. Smiling face recognition: probe set 2 (smiling face recognition module used).

From Figure 5, it can be seen that when the incoming faces are all neutral, the algorithm which treats all the faces as neutral achieves a very high recognition rate.

Figure 5 - Results of Experiment 2 (three sub-experiments; rank 1 and rank 3 recognition rates)

On the other hand, if the incoming faces are smiling, the neutral face recognition algorithm does not perform well; only a 57% rank-one recognition rate is obtained. (Rank one means that only the face which scores highest is selected from the gallery; the rank-one recognition rate is the ratio between the number of faces correctly recognized and the number of probe faces. Rank three means that the three highest-scoring faces are selected instead of one.) In contrast, when the smiling face recognition algorithm is used to deal with smiling faces, the recognition rate can be as high as 80%.

Experiment 3: Testing a practical scenario

These experiments emulate a realistic situation in which a mixture of neutral and smiling faces (probe set 3) must be recognized. Sub-experiment 1 investigates the performance obtained if the expression recognition front end is bypassed and the recognition of all the probe faces is attempted with the neutral face recognition module alone. The last two sub-experiments implement the full framework shown in Figure 1. In 3.2 the expression recognition is performed with the linear discriminant classifier, while in 3.3 it is implemented through the support vector machine approach.
a. Neutral face recognition module used alone: probe set 3 is used.
b. Integrated expression and face recognition: probe set 3 is used (linear discriminant classifier for expression recognition).
c. Integrated expression and face recognition: probe set 3 is used (support vector machine for expression recognition).

Figure 6 - Results of Experiment 3 (three sub-experiments; rank 1 and rank 3 recognition rates)

It can be seen in Figure 6 that if the incoming faces include both neutral faces and smiling faces, the recognition rate can be improved by about 10 percent by using the integrated framework proposed here.

CONCLUSION

The work reported in this paper represents an attempt to acknowledge and account for the presence of expression in 3D face images, towards their improved identification. The method introduced here is computationally efficient. Furthermore, it also yields, as a secondary result, the information of the expression found in the faces. Based on these findings, we believe that the acknowledgement of the impact of expression on 3D face recognition and the development of systems that account for it, such as the framework introduced here, will be key to future enhancements in the field of 3D Automatic Face Recognition.

REFERENCES

[1] K. Bowyer, K. Chang, and P. Flynn, "A Survey of Approaches to 3D and Multi-Modal 3D+2D Face Recognition," IEEE Intl. Conf. on Pattern Recognition, 2004.
[2] R. Chellappa, C. Wilson, and S. Sirohey, "Human and Machine Recognition of Faces: A Survey," Proceedings of the IEEE, 83(5): pp. 705-740, 1995.
[3] www.polhemus.com.
[4] C. Li, A. Barreto, J. Zhai and C. Chin, "Exploring Face Recognition Using 3D Profiles and Contours," IEEE SoutheastCon 2005, Fort Lauderdale.
[5] P. Ekman, W. Friesen, "Constants across cultures in the face and emotion," Journal of Personality and Social Psychology, 1971, 17(2): pp. 124-129.
[6] Y. Hu, D. Jiang, S. Yan, L. Zhang, and H. Zhang, "Automatic 3D Reconstruction for Face Recognition," International Conference on Automatic Face and Gesture Recognition, Seoul, 2004.
[7] "Notre Dame 3D Face Database," http://www.nd.edu/~cvrl/.
[8] B. Moghaddam, A. Pentland, "Probabilistic Visual Learning for Object Detection," International Conference on Computer Vision (ICCV '95), 1995.
[9] B. Moghaddam, A. Pentland, "Probabilistic Visual Learning for Object Representation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 1997, 19(7): pp. 696-710.
Performance Evaluation of Signal Selective DOA Tracking for Wideband Cyclostationary Sources

Sandeep Santosh(1), O.P. Sahu(2), Monika Aggarwal(3)
(1) Asst. Prof., Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, India
(2) Associate Prof., Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, India
(3) Associate Prof., Centre for Applied Research in Electronics (CARE), Indian Institute of Technology, New Delhi, India
[email protected] http://www.nitkkr.ac.in
Abstract
In this paper, we present a new signal-selective direction
of arrival (DOA) tracking algorithm for moving sources
emitting narrowband or wideband cyclostationary signals.
Here, the DOAs of the sources are updated recursively based
on the most current array output, in a way that no data
association is needed. The interference and noise are
suppressed by exploiting cyclostationarity; only the
sources of interest are tracked. The tracking performance
of this algorithm can be improved via the Kalman filter.
Index Terms – Array signal processing,
cyclostationarity, direction of arrival
tracking.
1. Introduction
Direction of arrival (DOA) tracking of multiple moving
sources has been a central research topic in signal
processing for decades, due to its wide applications such
as surveillance in military applications and air traffic
control in civilian applications. One obvious method of DOA
tracking is to first find the DOAs by an existing DOA
estimation algorithm for each time frame, on the assumption
that directions do not change within each time frame, and
then to associate each of the newly estimated DOAs to the
previous estimates in order to keep tracking the DOA
changes and source movement. A major problem of this method
is that data association, or correctly assigning the
estimated DOAs at each time frame to their corresponding
previous estimates to form DOA tracks, requires extensive
computations. Data association involves searching over the
possible combinations between the estimated DOAs and the
targets, where I is the number of DOAs [1]. Some
DOA tracking algorithms which do not
require data association have been proposed
such as [1]-[5].The authors of [1] obtain the
current DOA estimates of the sources by
minimizing the norm of an error matrix
function, based on a covariance matrix
related to an array output at the current time
frame. The authors of [2] track the source
movement by estimating DOA changes for
each time frame, rather than new DOAs
through solving a least squares (LS)
problem. The authors of [3] improve the
performance of [2] by employing the source
movement model and refining the updated
DOAs through a Kalman filter. The authors
of [4] update the DOA estimates of each
time frame by solving a maximum–
likelihood (ML) problem of most current
array output. This approach also employs a
source movement model and refines the
DOA estimates through a Kalman filter as in
[3].The authors of [5] introduce multiple
target states (MTS) to describe the target
motion ,and the DOA tracking is
implemented through updating the MTS by
maximizing the likelihood function of the
array output. Whether by LS or ML method,
whether introducing MTS or other models to
describe the target motion , whether using
Kalman filter or not ,all these algorithms
implement the DOA tracking in a way that
the order of the estimated DOAs for
different times or time frames is maintained
, thus data association is avoided. Therefore,
they are more computationally efficient than
the methods requiring the data association.
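The Kalman refinement used in [3] and [4], and later in this paper, is in its simplest form a constant-velocity filter applied to each DOA track. A generic single-track sketch (the noise levels q and r are illustrative assumptions, not values from the cited papers):

```python
import numpy as np

def kalman_doa_track(measurements, dt=1.0, q=1e-4, r=1.0):
    """Constant-velocity Kalman filter smoothing a sequence of noisy
    DOA estimates (degrees). State = [angle, angular rate]."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity dynamics
    H = np.array([[1.0, 0.0]])              # only the angle is observed
    Q = q * np.eye(2)                       # process noise covariance
    R = np.array([[r]])                     # measurement noise covariance
    x = np.array([measurements[0], 0.0])    # initial state
    P = np.eye(2) * 10.0                    # initial uncertainty
    out = []
    for z in measurements:
        x = F @ x                           # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R                 # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ (np.array([z]) - H @ x) # update with the new DOA estimate
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0])
    return np.array(out)
```

Because the filter carries an angular-rate state, it keeps the per-frame DOA estimates ordered along each track, which is what lets these methods avoid explicit data association.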
All the above methods are applicable to
narrowband signals and they would fail for
wideband signals. Wideband signals are
becoming more and more common
nowadays. Therefore, research work on
developing DOA tracking algorithms that
work for wideband sources has been carried
out[6]-[8].The authors of [6] use focusing
matrices to align steering vectors of different
frequency bins to carrier frequency so that
wideband signals can be treated the same
way as narrowband signals in estimating the
DOAs by multiple signal
classification (MUSIC) [9]. When new data
arrive, [6] first updates the focusing matrices
and then applies MUSIC to obtain new
estimated DOAs. In [7], the authors estimate
the DOAs of each time frame by an ML
approach. For multiple targets both [6] and
[7] require data association. In [7], the data
association is done by Bayes classifier
which is computationally expensive. The
authors of [8] develop two computationally
simple methods for DOA tracking based on
recursive expectation and maximization
(REM) algorithm. These two methods apply
for both narrowband and wideband signals .
From [8], the first method does not work
properly when two DOAs are crossing, and
the second method requires a linear DOA
motion model, restricting DOA tracks to
only straight lines.
Recently, a statistical property,
cyclostationarity, which many types of
man-made signals in communications, such
as BPSK, FSK and AM, exhibit, has been
exploited in DOA estimation [10]-[12]. By
exploiting
cyclostationarity, interference and noise that
do not share the same cycle frequency as the
desired signals or do not exhibit
cyclostationarity can be suppressed ,thus
performance of DOA estimation is improved
when the DOA of interference is close to
DOA of desired signal. The
Cyclostationarity could be exploited to
improve performance of DOA tracking. All
the DOA tracking algorithms discussed
previously [1]-[7] assume that the signals
are stationary but not cyclostationary .Here
,a new signal selective DOA tracking
algorithm for wideband multiple moving
sources by exploiting the cyclostationarity
of the signals is proposed .In this algorithm ,
the signals emitted by moving sources can
be either narrowband or wideband
cyclostationary. Our algorithm assumes that
DOAs in each time frame are fixed and
tracks the DOA changes from frame to
frame by exploiting the difference of
averaged cyclic cross correlation of the
array output. DOA tracking is initiated by
applying once a wideband DOA estimation
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0112-9
method: averaged cyclic MUSIC (ACM)
[12]. Then, the DOA changes for each time
frame are estimated by finding the minimum
of an LS cost function related to the averaged
cyclic cross-correlation of the array output.
Similar to [12], averaging the cyclic
correlation enables wideband application. In
order to avoid inconsistent solutions for the
DOA changes when the DOAs are crossing,
the proposed cost function also includes a
regularization term that reflects the
assumption that the sources are moving at
constant speeds. Similar to [1]-[5], our
signal-selective DOA tracking algorithm
does not require data association. Also, the
incorporation of a Kalman filter into our
signal-selective DOA tracking algorithm is
presented. Via the Kalman filter, the
tracking performance of our algorithm is
improved. The effectiveness of the proposed
algorithm is demonstrated by simulations.
1. Cyclostationarity and Data Model
A. Cyclostationarity
Given a signal s(t), the cyclic correlation is
defined as [15],

r^α_ss(τ) = ‹s(t+τ/2) s*(t-τ/2) e^{-j2παt}›   (1)

where (·)* denotes the complex conjugate and
‹·› denotes the time average. s(t) is said to be
cyclostationary if r^α_ss(τ) is nonzero at some
delay τ and some cycle frequency α. Many
man-made communication signals exhibit
cyclostationarity due to modulation, periodic
gating, etc. They usually have cycle
frequencies at twice the carrier frequency, at
multiples of the baud rate, or at combinations
of these. For a given signal vector z(t), we
can calculate the cyclic correlation matrix as
[10],

R^α_zz(τ) = ‹z(t+τ/2) z^H(t-τ/2) e^{-j2παt}›   (2)

where (·)^H denotes the Hermitian transpose.
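As a concrete illustration of definition (1), the sketch below (our own illustrative Python, not from the paper; the symmetric shift is approximated by a one-sided lag, and all numeric values are assumed) estimates the cyclic correlation of a pure tone, which is cyclostationary with cycle frequency equal to twice its tone frequency: the estimate is large at α = 2f0 and near zero at an unrelated α.

```python
import numpy as np

def cyclic_correlation(s, alpha, tau, fs):
    """Discrete estimate of r_ss^alpha(tau) from Eq. (1); the symmetric shift
    is replaced by a one-sided lag of tau samples, with the time variable
    taken at the lag midpoint."""
    n = np.arange(len(s) - tau)
    prod = s[n + tau] * np.conj(s[n])
    t = (n + tau / 2) / fs                      # midpoint time in seconds
    return np.mean(prod * np.exp(-2j * np.pi * alpha * t))

fs, f0 = 1000.0, 50.0                           # assumed sample rate and tone frequency (Hz)
t = np.arange(0, 10, 1 / fs)
s = np.cos(2 * np.pi * f0 * t)                  # cyclostationary with cycle frequency 2*f0
on_cycle = abs(cyclic_correlation(s, alpha=2 * f0, tau=0, fs=fs))   # large (~0.25)
off_cycle = abs(cyclic_correlation(s, alpha=37.0, tau=0, fs=fs))    # near zero
```

A real cyclostationary SOI such as BPSK would show the same contrast at its cycle frequencies, which is what the signal-selective suppression of interference relies on.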
B. Data Model
Consider the tracking problem for a uniform
linear array with N identical elements. I
moving sources are assumed to generate I
signals with cycle frequency α impinging on
the array. These signals are considered the
signals of interest (SOI). Other signals from
other moving sources that do not exhibit
cyclostationarity or have different cycle
frequencies are considered interference.
Taking the first antenna as the reference, the
signal received by the nth antenna in the
array is

z_n(t) = Σ_{i=1}^{I} s_i(t + (n-1)Δ_i(t)) e^{j2πf_o(n-1)Δ_i(t)} + η_n(t)   (3)

where s_i(t) is the complex baseband signal
of the ith signal of interest (SOI) induced at
the first antenna, f_o is the carrier frequency,
and Δ_i(t) = d sinθ_i(t)/c is the time delay
between two adjacent antennas. Here, θ_i(t) is
the impinging direction of the ith SOI at
time t, d is the intersensor spacing of the
uniform linear array, and c is the propagation
speed. Note that η_n(t) has two components:
the interference and the noise induced at the
nth antenna. The interference and noise are
assumed to be cyclically uncorrelated with
the SOI. Therefore, η_n(t) is neglected.
Now, assume that the DOAs of the sources
change little during a time frame of length
T, i.e., θ_i(t) or Δ_i(t) is constant during the
kth time frame [(k-1)T, kT], where k = 1, ..., K.
The total tracking time is assumed to be KT
seconds. We have Δ_i(k) = d sinθ_i(k)/c for the
kth time frame. Our tracking algorithm will
deal with the data samples collected during a
time frame.
The cyclic cross-correlation of the source
signals estimated over the kth time frame is

r^α_{s_i s_j}(τ, k) = ∫_k s_i(t + τ/2) s_j*(t - τ/2) e^{-j2παt} dt   (4)
Now let us define the following vectors and
matrices:

s(t) = [s_1(t), ..., s_I(t)]^T   (5)

z(t) = [z_1(t), ..., z_N(t)]^T   (6)

A(f, k) = [a_1(f, k), ..., a_I(f, k)]   (7)

a_i(f, k) = [1, e^{j2πfΔ_i(k)}, ..., e^{j2πf(N-1)Δ_i(k)}]^T   (8)

where (·)^T denotes the matrix transpose, s(t) is
the source signal vector, z(t) is the received
signal vector, A(f, k) is the steering matrix
evaluated at frequency f for the kth time
frame, and a_i(f, k) is the steering vector for the
ith SOI evaluated at f for the kth time frame.
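The steering vector of Eq. (8) is straightforward to compute; the sketch below (our own illustrative code, with assumed numeric values) builds a_i(f, k) for a uniform linear array:

```python
import numpy as np

def steering_vector(f, theta, N, d, c=3e8):
    """a_i(f,k) of Eq. (8): element n carries phase 2*pi*f*n*Delta, with
    Delta = d*sin(theta)/c the delay between adjacent antennas."""
    delta = d * np.sin(theta) / c
    return np.exp(2j * np.pi * f * delta * np.arange(N))

# example values (assumed): 7 elements, 1.36 m spacing, 100 MHz, DOA 30 degrees
a = steering_vector(f=1e8, theta=np.deg2rad(30), N=7, d=1.36)
# the first (reference) element is 1 and every entry has unit magnitude
```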
2. LS Tracking Algorithm
We first evaluate the averaged cross-cyclic
correlations between the signals received
at the first antenna and the other antennas
during the kth time frame. These
correlations are then simplified as functions
of the signal directions in this time frame.
Based on these functions, an LS method for
tracking the directions of the sources is
discussed.
A. Averaged Cross-Cyclic Correlation
and Initial DOA Estimation
For the kth time frame, calculate the
cross-cyclic correlation of z_1(t) and z_n(t),
where z_n(t) is the signal received at the nth
antenna, for n = 2, ..., N. Using (3) and (4),
we can obtain N-1 cross-cyclic correlations
estimated at the kth time frame,
r^α_{z_1 z_n}(τ, k) = ∫_k z_1(t+τ/2) z_n*(t-τ/2) e^{-j2παt} dt
= Σ_{i=1}^{I} [Σ_{p=1}^{I} r^α_{s_p s_i}(τ - (n-1)Δ_i(k), k)] e^{-j2π(f_o - α/2)(n-1)Δ_i(k)}   (9)
Since the evaluation of the cyclic correlation
retains only the SOI, the interference and
noise are ignored in (9). To eliminate the
dependence of r^α_{s_p s_i}(τ, k) on Δ_i(k) or on τ,
we further evaluate r^α_{z_1 z_n}(τ, k) at different
time delays τ and average them to obtain an
averaged cross-cyclic correlation between
z_1(t) and z_n(t) at the kth time frame as

‹r^α_{z_1 z_n}(k)›_τ = Σ_{τ=τ_1}^{τ_2} r^α_{z_1 z_n}(τ, k)
= Σ_{i=1}^{I} [Σ_{p=1}^{I} ‹r^α_{s_p s_i}(k)›_τ] e^{-j2π(f_o - α/2)(n-1)Δ_i(k)}   (10)

‹r^α_{s_p s_i}(k)›_τ = Σ_{τ=τ_1}^{τ_2} r^α_{s_p s_i}(τ - (n-1)Δ_i(k), k)   (11)
Thus, for source signals s_i(t), i = 1, ..., I,
which normally have certain time-invariant
characteristics, if the duration of a time
frame is long enough, then ‹r^α_{s_p s_i}(k)›_τ
can be assumed to be independent of k. In
our simulations, a 0.5 s time frame, or 3200
snapshots of data samples, gives good
results. We drop k and define

E_i = Σ_{p=1}^{I} ‹r^α_{s_p s_i}›_τ   (12)

In addition, define

g_n(θ) = e^{-j2π(f_o - α/2)(n-1) d sinθ/c}   (13)

Then, (10) can be written as

‹r^α_{z_1 z_n}(k)›_τ = Σ_{i=1}^{I} E_i g_n(θ_i(k))   (14)
To derive our algorithm we need to know E_i.
First, we apply the signal-selective DOA
estimation algorithm ACM [12] to estimate
the initial DOAs. The number of sources
emitting SOI is assumed to be known or
estimated by the minimum description
length (MDL) criterion. ACM works for both
narrowband and wideband signals. A
summary of this algorithm is given below:
1. Estimate the cyclic correlation matrix
R^α_zz(τ, 1) during the first time frame.
2. Average R^α_zz(τ, 1) over τ to obtain
‹R^α_zz(1)›_τ.
3. Apply the singular value decomposition
(SVD) to ‹R^α_zz(1)›_τ to estimate all DOAs
of the SOI for the first time frame, i.e.,
θ_i(1), where i = 1, ..., I.
4. Obtain Δ_i(1) = d sinθ_i(1)/c. We have

‹R^α_zz(1)›_τ = A(f_o+α/2, 1) ‹R^α_ss(1)›_τ A^H(f_o-α/2, 1)   (15)

‹R^α_ss(1)›_τ = A^†(f_o+α/2, 1) ‹R^α_zz(1)›_τ [A^H(f_o-α/2, 1)]^†   (16)
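Steps 2-3 of the summary can be sketched as follows. For simplicity we build a rank-one "averaged" correlation matrix for a single source directly from its steering vector, evaluated at a single frequency (the full ACM method uses the two steering matrices A(f_o+α/2) and A(f_o-α/2)); the SVD then yields a noise subspace whose orthogonality to the steering vector locates the DOA. All numbers are assumed for illustration.

```python
import numpy as np

N, d, c, f = 7, 1.36, 3e8, 1.1e8                  # array size, spacing (m), speed, frequency (assumed)
theta_true = np.deg2rad(20.0)

def a(theta):                                      # steering vector, cf. Eq. (8)
    return np.exp(2j * np.pi * f * d * np.sin(theta) / c * np.arange(N))

R = np.outer(a(theta_true), a(theta_true).conj())  # rank-1 stand-in for the averaged matrix
U, s, Vh = np.linalg.svd(R)
En = U[:, 1:]                                      # noise subspace for I = 1 source

grid = np.deg2rad(np.linspace(-90, 90, 3601))      # 0.05-degree search grid
spectrum = [1.0 / (np.linalg.norm(En.conj().T @ a(th)) ** 2 + 1e-18) for th in grid]
theta_hat = float(np.rad2deg(grid[int(np.argmax(spectrum))]))
```

The pseudospectrum peaks where the steering vector is orthogonal to the noise subspace, recovering theta_hat near 20 degrees.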
B. Recursive Direction Updating
The tracking algorithm can be developed as
follows. Write the DOA of the ith source at
the kth time frame as

θ_i(k) = θ_i(k-1) + θ_i~(k)   (17)

where θ_i~(k) is the DOA change. A
first-order expansion of g_n about θ_i(k-1)
gives

g_n(θ_i(k)) ≈ g_n(θ_i(k-1)) + ∂g_n(θ)/∂θ|_{θ=θ_i(k-1)} θ_i~(k)   (18)

∂g_n(θ)/∂θ|_{θ=θ_i(k-1)} = [-j2π(f_o-α/2)(n-1) d cosθ_i(k-1)/c] g_n(θ_i(k-1))   (19)

Substituting into (14),

‹r^α_{z_1 z_n}(k)›_τ ≈ ‹r^α_{z_1 z_n}(k-1)›_τ + Σ_{i=1}^{I} c_{n,i}(k-1) θ_i~(k)   (20)

c_{n,i}(k-1) = E_i ∂g_n(θ)/∂θ|_{θ=θ_i(k-1)}   (21)

Define the difference

r_n(k) = ‹r^α_{z_1 z_n}(k)›_τ - ‹r^α_{z_1 z_n}(k-1)›_τ   (22)

From (20), r_n(k) can be written as

r_n(k) = [c_{n,1}(k-1), ..., c_{n,I}(k-1)] Θ~(k)   (23)

Θ~(k) = [θ_1~(k), ..., θ_I~(k)]^T   (24)

Stacking r_n(k) for n = 2, ..., N, we obtain

r(k) = [r_2(k), ..., r_N(k)]^T = C(k-1) Θ~(k)   (25)

The DOA changes Θ~(k) can be estimated by
solving the LS problem of

r(k) = C^(k-1) Θ~(k)   (26)

and the DOA estimates are then updated by

Θ^(k) = Θ^(k-1) + Θ~(k)   (27)

Under the assumption that the sources move
at nearly constant speeds, the DOA changes
vary little between adjacent frames, i.e.,

Θ~(k) ≈ Θ~(k-1)   (28)

Now, define a revised LS cost function

f(Θ~(k)) = [C^(k-1)Θ~(k) - r(k)]^H [C^(k-1)Θ~(k) - r(k)]
+ [Θ~(k) - Θ~(k-1)]^H Λ(k) [Θ~(k) - Θ~(k-1)]   (29)

whose minimizer is

Θ~(k) = [C^H(k-1)C^(k-1) + Λ(k)]^{-1} [C^H(k-1) r(k) + Λ(k) Θ~(k-1)]   (30)

The computational complexities of the steps
of the LS tracking algorithm are O(NI),
O(N·Ns·Na), O(I²N), and O(I).
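A single regularized LS step of Eq. (30) can be sketched as below (toy numbers, not from the paper): when the difference vector r(k) is consistent with the model of Eq. (25) and the previous change equals the true one, the update recovers the DOA changes exactly.

```python
import numpy as np

def ls_doa_step(C, r, theta_prev, lam):
    """Minimizer of the regularized cost (29), i.e. Eq. (30):
    theta = (C^H C + Lam)^{-1} (C^H r + Lam theta_prev), Lam = lam * I."""
    Lam = lam * np.eye(C.shape[1])
    lhs = C.conj().T @ C + Lam
    rhs = C.conj().T @ r + Lam @ theta_prev
    return np.linalg.solve(lhs, rhs)

C = np.array([[1.0, 0.2], [0.1, 0.9], [0.3, 0.4]])   # hypothetical c_{n,i}(k-1), N-1 = 3 rows
true_change = np.array([0.01, -0.02])                 # assumed DOA changes (radians)
r = C @ true_change                                   # difference vector of Eq. (25)
step = ls_doa_step(C, r, theta_prev=true_change, lam=0.5)
```

The regularization pulls the solution toward the previous frame's change, which is what resolves the ambiguity when two DOA tracks cross.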
3. Kalman Filter
In this section, we introduce a source
movement model and apply a Kalman filter
to track the DOAs. The DOAs estimated by
the LS method are viewed as measurements
of the DOAs in the Kalman filter model. The
current DOAs of the sources are first
predicted from the previous DOAs using the
source movement model. Then, the
predicted DOAs are refined by the Kalman
filter. Our simulations show that the Kalman
filter refinement further improves the DOA
tracking accuracy and reduces the burden of
selecting an optimum Λ(k) in (30).
Define the state of the ith (i = 1, ..., I) source
at the kth time frame as

x_i(k) = [θ_i(k), θ_i˙(k), θ_i˙˙(k)]^T   (31)

The state and measurement equations are

x_i(k) = F x_i(k-1) + w_i(k)   (32)

y_i(k) = H x_i(k) + v_i(k)   (33)

F = [1  T  T²/2]
    [0  1  T   ]
    [0  0  1   ]   (34)

E[w_i(j) w_i^H(k)] = Q_i(k) if j = k, and 0 if j ≠ k, for i = 1, ..., I   (35)

H = [1 0 0]   (36)

The state-prediction and measurement
residuals are

e_i(k) = x_i^(k|k) - F x_i^(k-1|k-1)   (37)

ε_i(k) = θ_i^(k) - H x_i^(k|k-1)   (38)

Since both the process noise and the
measurement noise are assumed to be zero
mean, their variances can be estimated by

Q_i^(k) = (1/L) Σ_{j=k-L+1}^{k} e_i(j) e_i^H(j)   (39)

σ²_{yi}(k) = (1/L) Σ_{j=k-L+1}^{k} ε_i(j) ε_i*(j)   (40)
The steps to estimate the DOAs for the kth
time frame are as follows:
1. Obtain the predicted state by
x_i^(k|k-1) = F x_i^(k-1|k-1).
2. Obtain θ_i^(k) by the LS tracking method,
using θ_i^(k-1|k-1) in place of θ_i^(k-1).
3. Obtain Q_i^(k-1) and σ²_{yi}(k) from (39)
and (40). Use Q_i^(k-1) as an approximation
of Q_i^(k).
4. Calculate P_i^(k|k-1) = F P_i^(k-1|k-1) F^H + Q_i^(k).
5. Calculate the Kalman filter gain
G(k) = P_i^(k|k-1) H^H / R(k), where
R(k) = H P_i^(k|k-1) H^H + σ²_{yi}(k).
6. Update the state for the kth time frame by
x_i^(k|k) = x_i^(k|k-1) + G(k)(θ_i^(k) - H x_i^(k|k-1)).
7. Take the first element of x_i^(k|k) as the
refined DOA estimate for the kth time
frame, θ_i^(k|k).
8. Prepare the next recursion by calculating
P_i^(k|k) = P_i^(k|k-1) - G(k) H P_i^(k|k-1).
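Steps 4-8 above, with F and H from Eqs. (34) and (36), can be sketched as one scalar-measurement Kalman recursion (illustrative code; the noise levels and state values are assumed, not taken from the paper):

```python
import numpy as np

def kalman_step(x, P, theta_meas, T, Q, r_meas):
    """One recursion of steps 1 and 4-8 for a single source: constant-acceleration
    state [theta, theta_dot, theta_ddot], scalar DOA measurement."""
    F = np.array([[1.0, T, T * T / 2], [0.0, 1.0, T], [0.0, 0.0, 1.0]])  # Eq. (34)
    H = np.array([[1.0, 0.0, 0.0]])                                      # Eq. (36)
    x_pred = F @ x                               # step 1: predicted state
    P_pred = F @ P @ F.T + Q                     # step 4: predicted covariance
    R = (H @ P_pred @ H.T).item() + r_meas       # step 5: innovation variance
    G = P_pred @ H.T / R                         # step 5: Kalman gain
    x_new = x_pred + (G * (theta_meas - (H @ x_pred).item())).ravel()    # step 6
    P_new = P_pred - G @ H @ P_pred              # step 8
    return x_new, P_new

x0 = np.array([0.0, 1.0, 0.0])                   # theta = 0 rad, rate 1 rad per frame unit
x1, P1 = kalman_step(x0, np.eye(3), theta_meas=0.6, T=0.5,
                     Q=1e-4 * np.eye(3), r_meas=0.01)
# the refined DOA x1[0] lies between the prediction (0.5) and the measurement (0.6)
```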
4. Simulations
Tracking performance versus SNR
In this simulation, three sources are assumed
to emit three wideband BPSK signals with
raised-cosine pulse shaping. Two of them
are SOI with the same baud rate of 20 MHz
and the same carrier frequency of 100 MHz.
The other is an interferer with a baud rate of
6 MHz and a carrier frequency of 80 MHz.
The cycle frequency of the SOI is 20 MHz,
which is assumed to be known. The two SOI
are coherent. A ULA with 7 antennas with
an equal spacing of c/(2f_o+α) = 1.36 m is
used. The subarray size is 6 for spatial
smoothing (SS) during initialization. The
duration of each time frame is 0.5 s, during
which 3200 snapshots of data samples are
obtained. The SNR of one SOI is 1 dB lower
than that of the other. The SNR of the
interference is 5 dB lower than that of the
higher-powered SOI. To see how the
performance of the LS method and the
Kalman filter method changes with SNR, we
vary the SNR of the higher-powered SOI
from -5 dB to 15 dB.
Generally, source crossing poses difficulty
for a tracking algorithm. The tracking
algorithm fails if the estimation error is so
large that the tracks of the two crossing
sources are switched and lost, as shown in
Fig. 1. We define the failure rate as the ratio
of the number of failed trials to the total
number of trials, which is 40 in our
simulation. Fig. 2 shows the failure rates of
the LS algorithm and the Kalman filter
algorithm with respect to SNR. We can see
that with the use of a Kalman filter, the
failure rate is lower than that of the LS
method, and at and above 5 dB SNR the
Kalman filter method does not fail at all.
In this simulation, we also plot the rms error
of the estimated DOAs in Fig. 3. Consider a
specific value of SNR; we can calculate the
mean squared error (mse) of the estimated
DOAs for each trial of the LS algorithm or
the Kalman filter algorithm. Then, the root
of the mean of the mse obtained over all 40
trials is what we call the rms error of the
estimated DOAs at this SNR. We should
note that if the algorithm fails to track the
sources in a trial, the mse for that trial will be
large, and it is excluded from the calculation
of the final rms error. By ignoring the failed
trials, the final rms error tends to be smaller
than the true value and does not reflect the
tracking failures, which are instead captured
by the failure rate. From Fig. 3 we see that
the Kalman filter method performs better
than the LS method.
Comparison of the estimated tracks with
the real tracks of the sources
In this simulation, we examine how well the
LS method and the Kalman filter method
track the targets. The signals and settings
are the same as in the first simulation,
except that the SNRs of both SOI are the
same and there is one more interferer, with a
baud rate of 6 MHz and a carrier frequency
of 100 MHz, whose SNR is also 5 dB lower
than that of the SOI.
We first assume that the SNR of the SOI is 5
dB and run both the LS method and the
Kalman filter method 40 times. We then
assume that the SNR of the SOI is 15 dB and
run these two tracking methods 40 times
again. We plot the ensemble averages of the
DOAs estimated by the LS method when the
SNR is 5 dB in Fig. 4. The three other plots,
for the mean of the DOAs estimated by the
LS method when the SNR is 15 dB and by
the Kalman filter method when the SNR is 5
dB and 15 dB, are similar and hence omitted.
The comparison of the rms errors of the
DOAs estimated by our two algorithms is
illustrated in Figs. 5 and 6 for one SOI. It
can be seen from these plots that both
methods track the DOAs of the SOI well,
with the Kalman filter method outperforming
the LS method in accuracy.
5. References
[1] C. R. Sastry and E. W. Kamen, "An efficient algorithm for tracking the angles of arrival of moving targets," IEEE Trans. Signal Processing, vol. 39, no. 1, pp. 242-246, Jan. 1991.
[2] C. K. Sword, M. Simaan, and E. W. Kamen, "Multiple target angle tracking using sensor array outputs," IEEE Trans. Aerospace Electronic Systems, vol. 26, no. 2, pp. 367-373, March 1990.
[3] S. B. Park, C. S. Ryu, and K. K. Lee, "Multiple target angle tracking algorithm using predicted angles," IEEE Trans. Aerospace Electronic Systems, vol. 30, no. 2, pp. 643-648, April 1994.
[4] C. R. Rao, C. R. Sastry, and B. Zhou, "Tracking the direction of arrival of multiple moving targets," IEEE Trans. Signal Processing, vol. 42, no. 5, pp. 1133-1144, May 1994.
[5] Y. Zhou, P. C. Yip, and H. Leung, "Tracking the direction of arrival of multiple moving targets by passive arrays: algorithms," IEEE Trans. Signal Processing, vol. 47, no. 10, pp. 2655-2666, Oct. 1999.
[6] M. Cho and J. Chun, "Updating the focusing matrix for direction of arrival estimation of moving sources," in Proc. Nat. Aerospace Electronics Conf., Oct. 2000, pp. 723-727.
[7] A. Sathish and R. L. Kashyap, "Wideband multiple target tracking," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, April 2004, vol. 4, pp. 517-520.
[8] P. J. Chung, J. F. Bohme, and A. O. Hero, "Tracking of multiple moving sources using recursive EM algorithms," EURASIP J. Applied Signal Processing, vol. 2005, no. 1, pp. 50-60, 2005.
[9] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Trans. Antennas Propagation, vol. AP-34, no. 3, pp. 276-280, March 1986.
[10] W. A. Gardner, "Simplification of MUSIC and ESPRIT by exploitation of cyclostationarity," Proc. IEEE, vol. 76, no. 7, pp. 845-847, July 1988.
Bartlett Windowed Fast Computation of Discrete Trigonometric Transforms for Real-Time Data Processing
Abhijit Khare, Shubham Varshney, Vikram Karwal
{khareabhijit14, shubham7502909dece}@gmail.com, [email protected]
Department of Electronics and Communication
Jaypee Institute of Information Technology, Noida, India
Abstract- Discrete trigonometric transforms (DTT),
namely the discrete cosine transform (DCT) and the
discrete sine transform (DST), are widely used
transforms in image compression applications.
Numerous fast algorithms for rapid processing of
real-time data exist in theory. Windowing is a
technique where a portion of the signal is extracted
and its transform is computed. These algorithms form
a class of fast update transforms that use less
computation than computing the transform using the
conventional definition. Different windows such as
rectangular, split-triangular, and sinusoidal windows
have been used in theory to sample the real-time
sequence and their performance compared. In this
research, fast update algorithms are analytically
derived that are capable of windowing the real-time
data in the presence of a Bartlett window. Initially,
simultaneous update algorithms are analytically
derived, and thereafter algorithms capable of
independently updating the DCT and DST are
derived, i.e., while computing the updated DCT
coefficients no DST coefficients are required, and
vice-versa. The analytically derived algorithms are
implemented in the C language to test their
correctness.
Keywords— Discrete trigonometric transform, window, fast update
I. INTRODUCTION
In the area of signal processing, transform
coding [8] provides an efficient way of
transmitting and storing data. The input data
sequence is divided into suitably sized blocks, and
thereafter reversible linear transforms are
performed. The transformed sequence has a much
lower degree of redundancy than the original
signal. The Karhunen-Loève Transform (KLT) [3]
has emerged as a benchmark for Markov-1 type
signals. The Discrete Cosine Transform (DCT)
[4,7] and the Discrete Sine Transform (DST)
perform quite closely to the ideal KLT and have
emerged as practical alternatives to it.
The DCT and DST have wide applications in
signal and image processing for the purposes of
pattern recognition, data compression,
communication, and several other areas [5]. Due
to their powerful bandwidth reduction capability,
the DCT and DST algorithms are widely used for
data compression. The DCT transforms a signal or
image from the spatial domain to the frequency
domain, where, as with the Discrete Fourier
Transform (DFT), much of the energy lies in the
lower-frequency coefficients. The main advantage
of the DCT over the DFT is that the DCT involves
only real multiplications. The DCT also does a
better job of concentrating energy into lower-order
coefficients than the DFT for image data. The
DCT is adopted as a standard technique for image
compression in the JPEG and MPEG standards
because of its energy compaction property.
A portion of the input signal is extracted using
windowing [6] and the transform of the windowed
contents is computed. These classes of algorithms
already exist in theory and are known as fast
update algorithms [2]. Different windows such as
rectangular, split-triangular, Hamming, Hanning,
and Blackman windows have been used earlier to
sample the real-time data and their performance
compared [6]. In this paper we develop update
algorithms in the presence of a Bartlett window.
Initially, the algorithms are derived for the
simultaneous update of DCT/DST coefficients,
i.e., we require both the DCT and the DST
coefficients to find the updated DCT/DST
coefficients. Thereafter, algorithms are derived
that establish independence [1] between the DCT
and DST coefficients. These algorithms lead to an
easier implementation of the update transform, as
we do not need to compute both sets of
coefficients simultaneously.
Section I gives the introduction to discrete
trigonometric transforms, windowed update
algorithms, and their advantages. Section II lists
the Bartlett window and DTT definitions;
simultaneous Bartlett windowed update algorithms
are also derived in Section II. In Section III,
independent update algorithms are derived.
Section IV includes the complexity calculations of
the derived algorithms, and Section V concludes
the paper.
II. DCT/DST TYPE-II WINDOWED SIMULTANEOUS UPDATE ALGORITHMS USING BARTLETT WINDOW
A. Basic algorithms for DCT and DST
The DCT of a signal f(x) of length N is defined
by

C(k) = √(2/N) P_k Σ_{x=0}^{N-1} f(x) cos[(2x+1)kπ/(2N)]   (1)

for k = 0, 1, ..., N-1, where

P_k = 1/√2 if k mod N = 0, and 1 otherwise.

The DST of the same signal can be written as:

S(k) = √(2/N) P_k Σ_{x=0}^{N-1} f(x) sin[(2x+1)kπ/(2N)]   (2)

for k = 1, 2, ..., N.
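Definitions (1) and (2) can be computed directly in O(N²); the sketch below is a straightforward transcription in Python rather than the C implementation the paper uses:

```python
import math

def dct2(f):
    """Direct DCT-II of Eq. (1), k = 0..N-1, with P_k = 1/sqrt(2) when k mod N = 0."""
    N = len(f)
    return [math.sqrt(2.0 / N) * (1 / math.sqrt(2) if k % N == 0 else 1.0) *
            sum(f[x] * math.cos((2 * x + 1) * k * math.pi / (2 * N)) for x in range(N))
            for k in range(N)]

def dst2(f):
    """Direct DST-II of Eq. (2), k = 1..N."""
    N = len(f)
    return [math.sqrt(2.0 / N) * (1 / math.sqrt(2) if k % N == 0 else 1.0) *
            sum(f[x] * math.sin((2 * x + 1) * k * math.pi / (2 * N)) for x in range(N))
            for k in range(1, N + 1)]

C = dct2([1.0] * 8)   # constant signal: all energy falls into C(0) = sqrt(8)
```

This normalization is orthonormal, so a constant input concentrates all its energy in the k = 0 coefficient.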
B. Simultaneous Update Algorithm
This section lists the update equations for the
Bartlett windowed update DCT/DST. The
windowed update algorithms are derived for
DCT/DST type-II [9], as it is the most often used
transform. For the input signal f(x), x = 0, 1, ..., N-1,
and the Bartlett window w(x) of length N with
tail-length N/2 given by equation (4), the windowed
data is given by

f_w(x) = f(x) w(x)   (3)

The Bartlett (or triangular) window of length N is
defined by:

w(x) = 2x/N for x = 0, 1, ..., N/2
w(x) = w(N-x) for x = N/2 + 1, ..., N-1   (4)
When the new data point f(N) is available, f(0)
is shifted out and the f(N) data point is shifted in.
The updated sequence is represented by f(x+1),
and the shifted windowed data is given by:

f_w(new)(x) = f(x+1) w(x)   (5)

which can be rewritten as:

f_w(new)(x) = f(x+1)[w(x+1) + w(x) - w(x+1)]

f_w(new)(x) = f(x+1) w(x+1) + f(x+1)[w(x) - w(x+1)]
Fig. 1 Bartlett Window w(x)
Defining m(x) = w(x) - w(x+1), the above
equation can be written as:

f_w(new)(x) = f(x+1) w(x+1) + f(x+1) m(x)

Therefore,

f_w(new)(x) = f_w(x+1) + f_m(new)(x)   (6)

for x = 0, ..., N-1, where

f_m(new)(x) = f(x+1) m(x)

and

f_w(x+1) = f(x+1) w(x+1)

f_m(new)(x) can be rewritten as:

f_m(new)(x) = f(x+1)[m(x+1) - m(x+1) + m(x)]

i.e.,

f_m(new)(x) = f(x+1) m(x+1) + f(x+1)[m(x) - m(x+1)]   (7)

Now,

m(x) - m(x+1) = -4/N if x = N/2 - 1; 4/N if x = N-1; 0 for all other x in 0, ..., N-1

m(x) - m(x+1) = -(4/N) δ_{x,N/2-1} + (4/N) δ_{x,N-1}   (8)

Substituting the value of m(x) - m(x+1) from
equation (8) into equation (7), we get

f_m(new)(x) = f_m(x+1) + f(x+1)[-(4/N) δ_{x,N/2-1} + (4/N) δ_{x,N-1}]

f_m(new)(x) = f_m(x+1) + (4/N)[-f(N/2) δ_{x,N/2-1} + f(N) δ_{x,N-1}]   (9)
The windowed update versions of f_w(x) and
f_m(x) for the moving DCT/DST with the Bartlett
window are represented by equations (6) and (9),
respectively. In equation (6), f_w(x+1) represents
the non-windowed update of f_w(x), and the second
term f_m(new)(x) is a correction factor that converts
this non-windowed update of f_w(x) into an update
in the presence of the window. Similarly, in
equation (9), f_m(x+1) represents the non-windowed
update of f_m(x) and the second term converts this
into the update in the presence of the window.
Taking the DCT-II of equations (6) and (9)
yields:

C_w(new)(x) = C_w(x+1) + C_m(new)(x)   (10)

C_m(new) = C_m(x+1) + √(2/N) P_k Σ_{x=0}^{N-1} (4/N)[-f(N/2) δ_{x,N/2-1} + f(N) δ_{x,N-1}] cos[(2x+1)kπ/(2N)]

Solving the above equation yields:

C_m(new) = C_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) cos((2(N/2-1)+1)kπ/(2N)) + f(N) cos((2(N-1)+1)kπ/(2N))]

C_m(new) = C_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) cos((N-1)kπ/(2N)) + f(N) cos((2(N-1)+1)kπ/(2N))]

Therefore,

C_m(new) = C_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) cos((N-1)kπ/(2N)) + f(N)(-1)^k cos(kπ/(2N))]   (11)

for k = 0, ..., N-1.

Equations (10) and (11) can be used to
calculate the simultaneous update of the moving
DCT for the Bartlett window. C_w(x+1) is the
non-windowed DCT update of f_w(x), calculated
using the DCT simultaneous update equation for
the rectangular window, which is listed below [2],
and C_m(x+1) is the non-windowed DCT update of
f_m(x), calculated using the same equation. Clearly,
it can be seen that while performing the windowed
DCT update, both the DCT and DST coefficients
are required.

C+(k) = cos(rkπ/N) C(k) + sin(rkπ/N) S(k)
+ √(2/N) P_k Σ_{x=0}^{r-1} [(-1)^k f(N+r-1-x) - f(r-1-x)] cos[(2x+1)kπ/(2N)]

for k = 0, ..., N-1, where C+(k) represents the
updated DCT coefficients.
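The correction term in Eq. (11) can be verified numerically on toy data: the DCT-II of f(x+1)m(x) must equal the DCT-II of f(x+1)m(x+1) plus the two-sample correction, where the window is extended periodically so that m(N) is defined. All data values below are arbitrary.

```python
import math

N = 8
f = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0]       # f(0..N); f(N) = 5.0 is the new sample

def w(x):                                                 # Bartlett window of Eq. (4), periodic
    x %= N
    return 2.0 * x / N if x <= N // 2 else 2.0 * (N - x) / N

def m(x):                                                 # m(x) = w(x) - w(x+1)
    return w(x) - w(x + 1)

def dct2(g):                                              # direct DCT-II of Eq. (1)
    n = len(g)
    return [math.sqrt(2.0 / n) * (1 / math.sqrt(2) if k == 0 else 1.0) *
            sum(g[x] * math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n))
            for k in range(n)]

Cm_new = dct2([f[x + 1] * m(x) for x in range(N)])        # DCT of f(x+1) m(x)
Cm_shift = dct2([f[x + 1] * m(x + 1) for x in range(N)])  # non-windowed update C_m(x+1)
ok = True
for k in range(N):
    Pk = 1 / math.sqrt(2) if k == 0 else 1.0
    corr = math.sqrt(2.0 / N) * Pk * (4.0 / N) * (
        -f[N // 2] * math.cos((N - 1) * k * math.pi / (2 * N))
        + f[N] * (-1) ** k * math.cos(k * math.pi / (2 * N)))
    ok = ok and abs(Cm_new[k] - (Cm_shift[k] + corr)) < 1e-9
```

Equality holds for every k, confirming that the windowed update needs only the shifted coefficients plus a two-term correction.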
Similarly, the DST update equations may be
derived and are:

S_w(new)(x) = S_w(x+1) + S_m(new)(x)   (12)

S_m(new) = S_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) sin((N-1)kπ/(2N)) - f(N)(-1)^k sin(kπ/(2N))]   (13)

for k = 1, ..., N.

Equations (12) and (13) can be used to
calculate the simultaneous update of the moving
DST for the Bartlett window. S_w(x+1) is the
non-windowed DST update of f_w(x), calculated
using the DST update equation for the rectangular
window, which is listed below [2], and S_m(x+1) is
the non-windowed DST update of f_m(x), calculated
using the same equation. Clearly, it can be seen
that while performing the windowed DST update,
both the DST and DCT coefficients are required.

S+(k) = cos(rkπ/N) S(k) - sin(rkπ/N) C(k)
+ √(2/N) P_k Σ_{x=0}^{r-1} [(-1)^k f(N+r-1-x) - f(r-1-x)] sin[(2x+1)kπ/(2N)]

where S+(k) represents the updated DST
coefficients.
III. DCT/DST TYPE-II WINDOWED INDEPENDENT UPDATE ALGORITHMS USING BARTLETT WINDOW
A. Independent Update Algorithm
The above-mentioned equations (10) and (11)
can be used to calculate the independent update of
the moving DCT-II for the Bartlett window.
C_w(x+1) is the non-windowed DCT-II update of
f_w(x), calculated using the DCT independent
update equation for the rectangular window, which
is listed below [2], and C_m(x+1) is the
non-windowed DCT-II update of f_m(x), also
calculated using the same equation.
C_w(n+r, k) = 2cos(rkπ/N) C(n, k) - C(n-r, k)
+ √(2/N) P_k sin(rkπ/N) Σ_{x=0}^{r-1} [f(n-N-x-1) - (-1)^k f(n-x-1)] sin[(2x+1)kπ/(2N)]
+ √(2/N) P_k sin(rkπ/N) Σ_{x=0}^{r-1} [(-1)^k f(n+r-x-1) - f(n+r-N-x-1)] cos[(2x+1)kπ/(2N)]
- √(2/N) P_k cos(rkπ/N) Σ_{x=0}^{r-1} [(-1)^k f(n-x-1) - f(n-N-x-1)] cos[(2x+1)kπ/(2N)]

for k = 0, 1, ..., N-1.
When using the above equation to calculate
the non-windowed update, we need the current
value C(n,k) and the previous value C(n-1,k). The
current and previous values in the case of C_w are
C[f(x)w(x)] and C[f(x-1)w(x-1)], respectively.
Since the value of C[f(x-1)w(x-1)] is not yet
available, we need to derive it from C[f(x-1)w(x)],
which is available from the previous step.
Similarly, for C_m, we need to calculate the
correction factor to compute C[f(x-1)m(x-1)]
from C[f(x-1)m(x)].
Similarly, the analogous formulae for the
DST-II are obtained by taking the DST-II of
equations (6) and (9):

S_w(new)(x) = S_w(x+1) + S_m(new)(x)   (14)

S_m(new) = S_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) sin((N-1)kπ/(2N)) - f(N)(-1)^k sin(kπ/(2N))]   (15)

for k = 1, ..., N.

Equations (14) and (15) can be used to
calculate the independent update of the moving
DST-II for the Bartlett window. S_w(x+1) is the
non-windowed DST-II update of f_w(x), calculated
using the DST independent update equation for the
rectangular window, which is listed below [2], and
S_m(x+1) is the non-windowed DST-II update of
f_m(x), also calculated using the same equation.
S_w(n+r, k) = 2cos(rkπ/N) S(n, k) - S(n-r, k)
+ √(2/N) P_k sin(rkπ/N) Σ_{x=0}^{r-1} [f(n-N-x-1) - (-1)^k f(n-x-1)] cos[(2x+1)kπ/(2N)]
+ √(2/N) P_k sin(rkπ/N) Σ_{x=0}^{r-1} [(-1)^k f(n+r-x-1) - f(n+r-N-x-1)] sin[(2x+1)kπ/(2N)]
- √(2/N) P_k cos(rkπ/N) Σ_{x=0}^{r-1} [f(n-N-x-1) - (-1)^k f(n-x-1)] sin[(2x+1)kπ/(2N)]

for k = 1, ..., N.
When using the above equation to calculate
the non-windowed update, we need the current
value S(n,k) and the previous value S(n-1,k). The
current and previous values in the case of S_w are
S[f(x)w(x)] and S[f(x-1)w(x-1)], respectively.
Since the value of S[f(x-1)w(x-1)] is not yet
available, we need to derive it from S[f(x-1)w(x)],
which is available from the previous step.
Similarly, for S_m, we need to calculate the
correction factor to compute S[f(x-1)m(x-1)]
from S[f(x-1)m(x)].
B. Computation for the oldest time-step
The correction factor to calculate the correct
value C[f(x-1)w(x-1)] from C[f(x-1)w(x)] for the
DCT update algorithm, and the correct value
S[f(x-1)w(x-1)] from S[f(x-1)w(x)] for the DST-II
update algorithm, are derived here.

f(x-1)w(x) = f(x-1)[w(x) + w(x-1) - w(x-1)]
= f(x-1)w(x-1) - f(x-1)[w(x-1) - w(x)]
= f(x-1)w(x-1) - f(x-1)m(x-1)

Therefore,

f(x-1)w(x-1) = f(x-1)w(x) + f(x-1)m(x-1)   (16)

Calculating the correction factor to convert
f(x-1)m(x) into the correct value f(x-1)m(x-1):

f(x-1)m(x) = f(x-1)[m(x) + m(x-1) - m(x-1)]
= f(x-1)m(x-1) - f(x-1)[m(x-1) - m(x)]
= f(x-1)m(x-1) - f(x-1)m_p(x-1)

Therefore,

f(x-1)m(x-1) = f(x-1)m(x) + f(x-1)m_p(x-1)   (17)

where m_p(x-1) = m(x-1) - m(x), with

m(x-1) - m(x) = -4/N if x = N/2; 4/N if x = 0; 0 for all other x in 0, ..., N-1

so that

f(x-1)m(x-1) = f(x-1)m(x) + f(x-1)[-(4/N) δ_{x,N/2} + (4/N) δ_{x,0}]

i.e.,

f_m(x-1) = f(x-1)m(x) + (4/N)[-f(N/2 - 1) δ_{x,N/2} + f(-1) δ_{x,0}]   (18)
Taking the DCT-II of equation (18):

C_m_old(k) = C[f(x-1)m(x-1)] = C[f(x-1)m(x)]
+ √(2/N) P_k Σ_{x=0}^{N-1} (4/N)[-f(N/2 - 1) δ_{x,N/2} + f(-1) δ_{x,0}] cos[(2x+1)kπ/(2N)]

for k = 0, 1, ..., N-1. Therefore,

C_m_old(k) = C[f(x-1)m(x-1)] = C[f(x-1)m(x)]
+ √(2/N) P_k (4/N)[-f(N/2 - 1) cos((N+1)kπ/(2N)) + f(-1) cos(kπ/(2N))]   (19)

Taking the DCT-II of equation (16):

C[f(x-1)w(x-1)] = C[f(x-1)w(x)] + C[f(x-1)m(x-1)]   (20)

Equations (19) and (20) together can be used to
calculate the windowed DCT-II values for the older
time sequence.
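The correction in Eq. (19) can likewise be checked numerically on toy data: the DCT-II of f(x-1)m(x-1) must equal the DCT-II of f(x-1)m(x) plus the two-sample correction at x = 0 and x = N/2, with the window extended periodically so that f(-1) and m(-1) are defined. Data values are arbitrary.

```python
import math

N = 8
f = {x: float((3 * x + 5) % 11) for x in range(-1, N)}    # f(-1..N-1); f(-1) is the oldest sample

def w(x):                                                  # Bartlett window of Eq. (4), periodic
    x %= N
    return 2.0 * x / N if x <= N // 2 else 2.0 * (N - x) / N

def m(x):                                                  # m(x) = w(x) - w(x+1)
    return w(x) - w(x + 1)

def dct2(g):                                               # direct DCT-II of Eq. (1)
    n = len(g)
    return [math.sqrt(2.0 / n) * (1 / math.sqrt(2) if k == 0 else 1.0) *
            sum(g[x] * math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n))
            for k in range(n)]

old = dct2([f[x - 1] * m(x - 1) for x in range(N)])        # C[f(x-1) m(x-1)]
cur = dct2([f[x - 1] * m(x) for x in range(N)])            # C[f(x-1) m(x)]
ok = True
for k in range(N):
    Pk = 1 / math.sqrt(2) if k == 0 else 1.0
    corr = math.sqrt(2.0 / N) * Pk * (4.0 / N) * (
        -f[N // 2 - 1] * math.cos((N + 1) * k * math.pi / (2 * N))
        + f[-1] * math.cos(k * math.pi / (2 * N)))
    ok = ok and abs(old[k] - (cur[k] + corr)) < 1e-9
```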
Taking the DST-II of equation (18):

S_m_old(k) = S[f(x-1)m(x-1)] = S[f(x-1)m(x)]
+ √(2/N) P_k Σ_{x=0}^{N-1} (4/N)[-f(N/2 - 1) δ_{x,N/2} + f(-1) δ_{x,0}] sin[(2x+1)kπ/(2N)]

Therefore,

S_m_old(k) = S[f(x-1)m(x-1)] = S[f(x-1)m(x)]
+ √(2/N) P_k (4/N)[-f(N/2 - 1) sin((N+1)kπ/(2N)) + f(-1) sin(kπ/(2N))]   (21)

Taking the DST-II of equation (16):

S[f(x-1)w(x-1)] = S[f(x-1)w(x)] + S[f(x-1)m(x-1)]   (22)

Equations (21) and (22) together can be used
to calculate the windowed DST-II values for the
older time sequence.
IV. COMPUTATIONAL COMPLEXITY
The algorithms developed are of computational
order N, whereas calculating the transform via fast
DCT/DST algorithms is of order N log₂ N.
V. CONCLUSION
New fast, efficient algorithms that are capable
of updating the Bartlett windowed DCT and DST
for a real-time input data sequence are listed. The
windowed update algorithms aim at reducing the
complexity of recalculating the DCT every time a
new value is introduced into the input. Initially,
simultaneous Bartlett windowed update algorithms
for DCT/DST-II are developed, and thereafter
independence is established between the updates
of the DCT and DST. The analytically derived
algorithms are verified using the C language.
REFERENCES
[1] V. Karwal, B. G. Sherlock, and Y. P. Kakad, "Windowed DST-independent discrete cosine transform for shifting data," Proceedings of the 20th International Conference on Systems Engineering, Coventry, U.K., Sept. 2009, pp. 252-257.
[2] Vikram Karwal, "Discrete cosine transform-only and discrete sine transform-only windowed update algorithms for shifting data with hardware implementation," Ph.D. Dissertation, University of North Carolina at Charlotte, 2009, ISBN: 9781109343267.
[3] W. D. Ray and R. M. Driver, "Further decomposition of the Karhunen-Loève series representation of a stationary random process," IEEE Trans., 1970, IT-16, pp. 12-13.
[4] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Trans. Comput., vol. C-23, pp. 90-94, Jan. 1974.
[5] W. K. Pratt, "Generalized Wiener filtering computation techniques," IEEE Trans. Comput., vol. C-21, pp. 636-641, July 1972.
[6] Fredric J. Harris, "On the use of windows for harmonic analysis with the discrete Fourier transform," Proceedings of the IEEE, vol. 66, no. 1, January 1978.
[7] P. Yip and K. R. Rao, "On the shift properties of DCT's and DST's," IEEE Trans. Signal Processing, vol. 35, pp. 404-406, March 1987.
[8] B. G. Sherlock and Y. P. Kakad, "Transform domain technique for windowing the DCT and DST," Journal of the Franklin Institute, vol. 339, issue 1, pp. 111-120, April 2002.
[9] Jiantao Xi and J. F. Chicharo, "Computing running DCT's and DST's based on their second-order shift properties," IEEE Trans. Circuits and Systems-I, vol. 47, no. 5, 2000, pp. 779-783.
[10] B. G. Sherlock and Y. P. Kakad, "Windowed discrete cosine and sine transforms for shifting data," Signal Processing, Elsevier, vol. 81, pp. 1465-1478.
[11] B. G. Sherlock, Y. P. Kakad, and A. Shukla, "Rapid update of odd DCT and DST for real-time signal processing," Proc. of SPIE, vol. 5809, pp. 464-471, Orlando, Florida, March 2005.
Lossless compression scheme based on Bayer color filter array prediction

Anita U. Patil (1), Dr. Sudhirkumar D. Sawarkar (2), Nareshkumar Harale (3)
(1) Navi Mumbai (Thane), Maharashtra  (2) Gurgaon  (3) MGM's College of Engineering and Technology, Kamothe, Navi Mumbai, [email protected]

Abstract— In most digital cameras, Bayer color filter array images are captured and demosaicing is generally carried out before compression. Recently it was shown that a compression-first scheme outperforms the conventional demosaicing-first schemes in terms of output image quality. An efficient prediction-based lossless compression scheme for Bayer color filter array images is proposed.

Index Terms— Bayer color filter array, lossless compression, green prediction, non-green prediction, adaptive color difference.

I. INTRODUCTION

BAYER COLOR FILTER ARRAY

A Bayer color filter array is usually coated over the sensors in these cameras to record only one of the three color components at each pixel location. The resultant image is referred to as a CFA image.

Fig 1: Bayer pattern, with a red sample in the center.

The figure shows the Bayer pattern, with a red sample in the center; the CFA image is compressed for storage. Demosaicing before compression is inefficient in the sense that the demosaicing process always introduces some redundancy which must eventually be removed in the following compression step. If we do the compression before demosaicing, digital cameras can have a simpler design and lower power consumption, since a computationally heavy process like demosaicing can be carried out on an offline, powerful personal computer. This motivates the demand for CFA image compression schemes.

Fig 2: Single-sensor camera imaging chain: (a) demosaicing first, (b) compression first.

II. PRESENT SCHEMES USED

There are different schemes present in the market, such as:
• Lossy compression schemes
• JPEG2000

So now we have to look at the drawbacks of the present methods:
• Lossy schemes compress a CFA image by discarding its visually redundant information.
• Such a scheme yields a higher compression ratio as compared with the lossless schemes.
• JPEG-2000 can be used to encode a CFA image, but only a fair performance can be attained.
• JPEG-2000 is a very expensive method to compress the images.

III. PROPOSED SCHEME

A prediction-based lossless CFA compression scheme is proposed. It divides a CFA image into two sub-images:
(a) a green sub-image, which contains all green samples of the CFA image;
(b) a non-green sub-image, which contains the red and blue samples of the CFA image.

This system mainly consists of two parts:
• Encoder
• Decoder
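The sub-image split just described can be sketched in a few lines. This is only an illustration: it assumes an RGGB Bayer layout with green samples at the positions where row + column is odd, a convention not fixed by the text above.

```python
import numpy as np

def split_cfa(cfa):
    """Split a Bayer CFA image into green and non-green sub-images,
    assuming an RGGB layout (green where row + column is odd)."""
    h, w = cfa.shape
    rows, cols = np.indices((h, w))
    green_mask = (rows + cols) % 2 == 1
    # every row of the mosaic holds w/2 green and w/2 non-green samples
    green = cfa[green_mask].reshape(h, w // 2)
    non_green = cfa[~green_mask].reshape(h, w // 2)
    return green, non_green

cfa = np.arange(16, dtype=np.uint8).reshape(4, 4)   # toy 4x4 mosaic
g_sub, ng_sub = split_cfa(cfa)
```

Each sub-image has half the pixels of the mosaic, so the encoder can process the two halves with different predictors, as the scheme requires.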
Fig 3: Structure of the proposed scheme (encoder).

The green sub-image is coded first, and the non-green sub-image follows with the green sub-image as a reference. To reduce the spectral redundancy, the non-green sub-image is processed in the color difference domain, whereas the green sub-image is processed in the intensity domain as a reference for the color difference content of the non-green sub-image. Both sub-images are processed in raster scan sequence with a context-matching-based prediction technique to remove the spatial dependency. The prediction residues of the two sub-image planes are then entropy encoded sequentially with our proposed realization scheme of the adaptive Rice code.

IV. WORKING OF THE SCHEME

The proposed scheme mainly works through prediction on the green plane and prediction on the non-green plane.

Prediction on the green plane

The green plane is raster scanned during the prediction and all prediction errors are recorded. When processing a particular green sample, the four nearest processed neighboring samples of g(i,j) form a candidate set; to find the directions associated with the green pixels we need some processing.

Fig 4: Four possible directions associated with a green pixel.

Let g(mk, nk) ∈ Φg(i,j) for k = 1, 2, 3, 4 be the four ranked candidates of sample g(i,j), such that D(Sg(i,j), Sg(mu,nu)) <= D(Sg(i,j), Sg(mv,nv)) for 1 <= u <= v <= 4.

If the direction of g(i,j) is identical to the directions of all green samples in Sg(i,j), pixel (i,j) is considered to lie in a homogeneous region and the prediction of g(i,j) uses {w1, w2, w3, w4} = {1, 0, 0, 0}; else g(i,j) lies in a heterogeneous region and the predicted value of g(i,j) uses {w1, w2, w3, w4} = {5/8, 2/8, 1/8, 0}.

FLOW CHART FOR PREDICTION ON THE GREEN PLANE

[Flow chart figure not recoverable from this copy.]

Adaptive color difference estimation for the non-green plane

When compressing the non-green color plane, color difference information is exploited to remove the color spectral dependency. Let c(m,n) be the intensity value at a non-green sampling position (m,n). The Green-Red (Green-Blue) color difference of pixel (m,n) is

d(m,n) = g'(m,n) - c(m,n),

where g'(m,n) is the estimated green component intensity value, obtained from

GH = (g(m,n-1) + g(m,n+1))/2 and GV = (g(m-1,n) + g(m+1,n))/2.

Prediction on the non-green plane

The color difference prediction of a non-green sample c(i,j) with color difference value d(i,j) uses {w1, w2, w3, w4} = {4/8, 2/8, 1/8, 1/8}, where d(mk,nk) is the k-th ranked candidate in Φc(i,j) and wk is the corresponding predictor coefficient.
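The weighted candidate averaging described above, for both planes, can be sketched as follows. The context-matching step that ranks the four candidates and decides whether the region is homogeneous is abstracted into the function's inputs, so only the weighting itself is shown; only the weight sets come from the text.

```python
# Weight sets from the scheme: green plane (homogeneous / heterogeneous)
# and non-green plane.
GREEN_HOMOGENEOUS = (1, 0, 0, 0)
GREEN_HETEROGENEOUS = (5/8, 2/8, 1/8, 0)
NON_GREEN = (4/8, 2/8, 1/8, 1/8)

def weighted_prediction(ranked_candidates, weights):
    """Predict a sample as a weighted sum of its four ranked
    candidates (best context match first)."""
    return sum(w * c for w, c in zip(weights, ranked_candidates))

weighted_prediction([100, 96, 90, 80], GREEN_HOMOGENEOUS)    # -> 100
weighted_prediction([100, 96, 90, 80], GREEN_HETEROGENEOUS)  # -> 97.75
```

In a homogeneous region the prediction collapses to the best-matching neighbor; otherwise the lower-ranked neighbors contribute with decreasing weight.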
Compression scheme

The prediction error e(i,j) of pixel (i,j) in the CFA image is the difference between the actual value and its prediction, where g(i,j) and d(i,j) are respectively the real green sample value and the color difference value of pixel (i,j). The error residue e(i,j) is then mapped to a nonnegative integer E(i,j) so as to reshape its value distribution from a Laplacian to an exponential one.

The E(i,j)'s from the green sub-image are raster scanned and coded with the Rice code first. The Rice code is employed to code E(i,j) because of its simplicity and high efficiency in handling exponentially distributed sources. When the Rice code is used, each mapped residue E(i,j) is split into a quotient Q and a remainder, where parameter k is a nonnegative integer; the quotient and remainder are then saved for storage and transmission. The length of the code word used to represent E(i,j) is k-dependent, so parameter k is critical to the compression performance, as it determines the code length of E(i,j). The optimal parameter k is derived from the source statistics (the derivation involves the golden ratio φ): for a geometric source with distribution parameter ρ, as long as the mean μ is known, the optimal coding parameter k for the whole source can be determined easily. μ is estimated adaptively in the course of encoding, with separate estimates maintained while coding E(i,j) of the green plane and of the non-green plane. (The original formulas for the quotient, remainder, code length and optimal k are not recoverable from this copy.)

Decoding process

The decoding process is just the reverse of encoding. The green sub-image is decoded first, and then the non-green sub-image is decoded with the decoded green sub-image as a reference. The original CFA image is then reconstructed by combining the two sub-images.

Fig 5: Structure of the decoder.

BITRATE ANALYSIS

From the figure above, α = 1 can provide a good compression performance. We assume the prediction residue is a local variable and estimate the mean of its value distribution adaptively; the divisor used to generate the Rice code is then adjusted accordingly so as to improve the efficiency of the Rice code.

V. COMPRESSION PERFORMANCE

Simulations were carried out to evaluate the performance of the proposed compression scheme. 24-bit color images of size 512*768 were sub-sampled according to the Bayer pattern to form 8-bit testing CFA images. These images were directly coded by the proposed compression scheme for evaluation. Some representative lossless compression schemes, such as JPEG-LS, JPEG 2000 (lossless mode) and LCMI, were used for comparison of results.

VI. EXPERIMENTAL RESULTS

Table I: Comparison of compression results

S No.   | JPEG-LS | JPEG 2000 | Proposed
Image 1 |  5.467  |   5.039   |  4.803
Image 2 |  6.188  |   5.218   |  4.847
Image 3 |  6.828  |   4.525   |  3.847

If we alter the values of the weighting factors, we get improved results in terms of compression ratio and also reduce the bit rates of the CFA.

Table II

        | Overall CFA bit rate (bpp) | Compression ratio
α = 0   |          4.9496            |      1.6163
α = 0.6 |          4.8486            |      1.6496
α = 0.8 |          4.8437            |      1.6516
α = 1   |          4.8366            |      1.6537

ADVANTAGES OF THE PROPOSED METHOD

We can reduce the spectral redundancy and at the same time obtain a high quality image. The number of sensors in a digital camera is reduced from 3 to 1. Design complexity is low. Compared with JPEG2000, it gives better performance.
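The residue mapping and Rice coding step described in Section IV can be sketched as below. The folding used (0, -1, 1, -2, ... → 0, 1, 2, 3, ...) is the standard Laplacian-to-exponential mapping and is an assumption here, since the paper's exact formula is lost in this copy; the adaptive estimation of k is omitted.

```python
def fold(e):
    """Map a signed prediction residue to a nonnegative integer."""
    return 2 * e if e >= 0 else -2 * e - 1

def rice_encode(E, k):
    """Rice code of E with parameter k: the quotient E >> k in unary
    (terminated by '0'), followed by the k-bit remainder."""
    q, r = E >> k, E & ((1 << k) - 1)
    bits = "1" * q + "0"                      # unary-coded quotient
    if k:
        bits += format(r, "0{}b".format(k))   # k remainder bits
    return bits

rice_encode(fold(-3), 2)   # fold(-3) = 5 -> '1001'
```

Small residues, which dominate after good prediction, get short code words; the parameter k trades unary length against remainder length.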
VII. CONCLUSION
The CFA image is encoded by coding its sub-images separately with predictive coding. Lossless prediction is carried out in the intensity domain for the green sub-image, while it is carried out in the color difference domain for the non-green sub-image.
VIII.ACKNOWLEDGMENT
The first author expresses his gratitude to the remaining two authors for their contribution towards the completion of this project.
IX REFERENCES
[1] S. Banks, Signal Processing, Image Processing and Pattern Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1990.
[2] S. P. Lloyd, “Least squares quantization in PCM,” IEEE Trans. Inf. Theory, vol. IT-28, no. 2, pp. 129–136, Mar. 1982.
[3] P. Berkhin, “Survey of clustering data mining techniques,” Accrue Software, San Jose, CA, 2002.
[4] J. Besag, “On the statistical analysis of dirty pictures,” J. Roy. Statist. Soc. B, vol. 48, pp. 259–302, 1986.
[5] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, May 2002.
[6] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, Aug. 2000.
[7] P. Felzenszwalb and D. Huttenlocher, “Efficient graph-based image segmentation,” Int. J. Comput. Vis., vol. 59, pp. 167–181, 2004.
[8] S. Zhu and A. Yuille, “Region competition: Unifying snakes, region growing, and Bayes/MDL for multiband image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 18, no. 9, pp. 884–900, Sep.1996.
[9] M. Mignotte, C. Collet, P. Pérez, and P. Bouthemy, “Sonar image segmentation using a hierarchical MRF model,” IEEE Trans. Image Process., vol. 9, no. 7, pp. 1216–1231, Jul. 2000.
[10] M. Mignotte, C. Collet, P. Pérez, and P. Bouthemy, “Three-class Markovian segmentation of high resolution sonar images,” Comput. Vis. Image Understand., vol. 76, no. 3, pp. 191–204, 1999.
[11] F. Destrempes, J.-F. Angers, and M. Mignotte, “Fusion of hidden Markov random field models and its Bayesian estimation,” IEEE Trans. Image Process., vol. 15, no. 10, pp. 2920–2935, Oct. 2006.
[12] Z. Kato, T. C. Pong, and G. Q. Song, “Unsupervised segmentation of color textured images using a multi-layer MRF model,” in Proc. Int. Conf. Image Processing, Barcelona, Spain, Sep. 2003, pp. 961–964.
[13] P. Pérez, C. Hue, J. Vermaak, and M. Gangnet, “Color-based probabilistic tracking,” in Proc. Eur. Conf. Computer Vision, Copenhagen, Denmark, Jun. 2002, pp. 661–675.
[14] J. B. Martinkauppi, M. N. Soriano, and M. H. Laaksonen, “Behavior of skin color under varying illumination seen by different cameras at different color spaces,” in Proc. SPIE, Machine Vision Applications in
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM
(SPRTOS)” MARCH 26-27 2011
SIP0201-1
OPTIMAL RECEIVER FILTER DESIGN
Vivek Kumar, Deptt. of Electronics Engg., IITM Kanpur, 5/414, Avas Vikas, Farrukhabad; [email protected]
Dr. K. Raj, Deptt. of Electronics Engg., Harcourt Butler Technological Institute, Kanpur – 208002, India; [email protected]
Abstract
In wireless communication systems, pulse shaping filters are often used to represent message symbols for transmission through the channel, with a matched filter at the receiver end. This paper deals with the design and comparison of optimal receiver filters that maximize the signal to interference plus noise ratio of the received signal. The first approach is based on optimizing an optimal matched filter criterion, and the second approach is based on optimizing an MMSE criterion, which provides a closed form analytic solution for the filter coefficients. In 3G and beyond-3G systems, a higher SIR of the received signal is required so that higher order modulation schemes can be applied to achieve high data transmission throughput, together with short tap length receiver filters in order to reduce the power consumption at the mobile units. Simulations demonstrated that a receiver filter designed using the MMSE criterion can significantly improve the system performance by reducing inter symbol interference in comparison to the optimal matched filter.
Key words: MMSE, 3G, AWGN,
QAM, PSK, SIR, SINR
1. Introduction
The fundamental operation of wireless communication systems is to encode, modulate, upsample and then transmit digital information symbols in the form of an analog waveform through the wireless channel. These analog waveforms are the output of the transmit filters, which include the pulse shaping filter, phase equalizers and R.F. filters.
On the receiver side, the received waveforms are filtered by a receiver filter which is normally matched to
the transmitter filter. The output of the receiver filter is downsampled, then sent to the demodulator and decoded to recover the transmitted information. Figure 1 shows a simple communication link system.
[Figure: s(n) → 4× upsampler → transmit filter g(i) / Tx RF → channel → Rx RF → receiver filter f(i) → 4× downsampler → ŝ(n)]
Figure 1. A communication
link system
Nyquist filters are commonly used in data transmission systems for pulse shaping. They have the property that their impulse responses have zero-crossings uniformly spaced in time. If the channel is an AWGN channel, the Nyquist filter is an ideal pulse shaping filter, although it has an infinitely long impulse response. The most well known approach is to use the root Nyquist filter and its matched filter, which introduces no inter symbol interference. A practical approximation of the Nyquist filter is the raised cosine filter. Several
methods [5,11,7,17,4] were proposed to design general FIR transmitter pulse shaping filters and their matched filters that have the orthogonality property of the root Nyquist filter. These methods cannot be applied to design optimal receiver filters, because they are application specific and require changing the transmitter pulse shaping filter to eliminate the interference, which is not easy.
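Since the raised cosine filter mentioned above is the standard practical approximation, a small sketch of its impulse response may help; the closed form below is the usual textbook one, and the roll-off value 0.35 is an arbitrary choice, not a value from this paper. The zero crossings at nonzero multiples of the symbol period T are what eliminate ISI at the sampling instants.

```python
import numpy as np

def raised_cosine(t, T, alpha):
    """Raised-cosine impulse response, symbol period T, roll-off alpha.
    Zero crossings fall at every nonzero integer multiple of T."""
    x = t / T
    denom = 1.0 - (2.0 * alpha * x) ** 2
    singular = np.isclose(denom, 0.0)        # removable singularity at |t| = T/(2*alpha)
    safe = np.where(singular, 1.0, denom)    # avoid dividing by zero there
    h = np.sinc(x) * np.cos(np.pi * alpha * x) / safe
    return np.where(singular, (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha)), h)

T = 1.0
t = np.arange(-4, 5) * T                     # sampling instants -4T ... 4T
h = raised_cosine(t, T, alpha=0.35)          # h = 1 at t = 0, h = 0 at other multiples of T
```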
In 3G and beyond-3G systems, higher order modulation schemes such as 8-PSK and 16-QAM are used to increase the data transmission throughput. These schemes require a higher SIR of the received signal so that the transmission is reliable. The receiver filter must provide a higher SIR of the received signal and must have a short tap length, so that we have a large noise margin and low power consumption at the mobile units. So the main targets of the receiver filter design are the following:
(i) maximize the SINR of the received signal, to allow higher order modulation schemes;
(ii) keep the receiver filter tap length short, for low power consumption at the mobile units.
To design the optimal receiver filter, we use two approaches. The first approach is based on the optimal matched filter criterion; the filter is designed using the optimal filter design method and has the same properties as the transmitter filter, such as band-edge frequencies, pass band and stop band ripples, etc. The second approach is based on the MMSE criterion, which minimizes the error between the transmitted signal and the received signal. This approach leads to a closed form analytic solution for the receiver filter coefficients and can be extended to design adaptive receiver filters. Here optimality is in the sense of maximizing the signal to interference plus noise ratio of the received signal. Simulations demonstrated that the receiver filter designed using the MMSE approach can significantly improve the signal to interference plus noise ratio (SINR) of the received signal as compared to the receiver filter designed using the optimal matched filter.
2. Requirements for receiver
filter design
We consider the following requirements for the design of the optimal receiver filter.

[R1] We want to design a receiver filter with impulse response f(i) and filter length L (i.e., i = 0, ..., L − 1) such that the received signal ŝ(i) has the highest SINR, given an FIR transmitter filter with impulse response ĝ(i), where i = 0, ..., N − 1.

[R2] Since the transmitted signal is bandwidth limited, the side lobe of the receiver baseband filter in the stop band, i.e. for frequencies greater than 740 kHz (f ≥ 740 kHz), should be below −40 dB with a sharp cut-off.

[R3] The wireless channel is frequency non-selective and has only one path.

We add the requirement [R2] as a constraint on the adjacent channel interference power.
3. Optimal receiver filter design
In this section, we present two approaches to design optimal receiver filters that maximize the signal to interference plus noise ratio of the received signal. With the receiver filter being the optimal matched filter, the maximal signal to noise ratio of the received signal can be achieved, but the signal to interference ratio is not very good.
3.1 Optimal matched filter
approach
In the optimal design method, the weighted approximation error between the actual frequency response and the desired filter response is spread across the pass band and stop band, and the maximum error is minimized. This design method results in ripples in the pass band and stop band, so the frequency response of the filter must satisfy, in the pass band and the stop band respectively,

1 - δp ≤ |H(e^jω)| ≤ 1 + δp,   |ω| ≤ ωp,
|H(e^jω)| ≤ δs,   |ω| ≥ ωs,
where δp is the pass band ripple and δs the maximum allowed stop band magnitude. The weighted approximation error is defined as

E(ω) = W(ω)[Hd(ω) - H(e^jω)],

where Hd(ω) is the desired frequency response and H(e^jω) the actual frequency response.
The filter parameters are determined such that the maximum absolute value of E(ω) is minimized. Using the Remez exchange algorithm, we can design a filter with an optimal set of filter coefficients such that the receiver filter is matched [f(i) = g(N - i)] to the transmitter filter.
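The matched relationship f(i) = g(N - i) can be illustrated numerically; the pulse below is an arbitrary stand-in, not the paper's actual transmit filter.

```python
import numpy as np

# Hypothetical symmetric transmit pulse (illustration only)
g = np.array([0.1, 0.4, 0.8, 1.0, 0.8, 0.4, 0.1])
f = g[::-1]                   # matched filter: time-reversed transmit pulse

# The cascade g * f is the pulse autocorrelation; its peak equals the
# pulse energy, which is what maximizes SNR at the sampling instant.
h = np.convolve(g, f)
peak = int(h.argmax())        # peak sits at lag 0, i.e. index len(g) - 1
```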
3.2 MMSE approach
In this approach, we derive the signal model of the received signal and assume that the information symbol sequence is a white noise random process. In the communication link system shown in Fig. 1, we assume that the channel impulse response h(t) has been estimated using a pilot signal or blind channel identification algorithms.
Let the combined impulse response of the transmitter filter and the channel be g, represented by the convolution of ĝ(t) and h(t), i.e. g = ĝ ∗ h, where g has finite support on [0, N − 1]. The combined impulse response of the transmitter, channel and receiver baseband filtering is then denoted by the convolution g ∗ f. The impulse response of g ∗ f is
ĥ(k) = Σ_{i=0}^{N-1} g(i) f(k - i),   k = 0, 1, ..., N + L - 1,   (1)
We require that the sum of the transmitter filter length and the receiver filter length, N + L, be an even number and that the filter have the linear phase property. So we can let

h(k) = ĥ(k + (N + L - 2)/2).

Then h(k) is represented in matrix form as

[ĥ(0), ĥ(1), ..., ĥ(N+L-2)]^T = G F,   i.e.   h(k) = GF,   (2)

where G is the (N+L-1) × L Toeplitz (convolution) matrix whose first column is [g(0), g(1), ..., g(N-1), 0, ..., 0]^T and F = [f(0), f(1), ..., f(L-1)]^T is the vector of receiver filter coefficients.
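The Toeplitz structure of G in equation (2) can be sketched as follows, with short toy filters (the lengths and values are illustrative only); the product GF reproduces the linear convolution of g and f.

```python
import numpy as np

# Hypothetical short filters, purely to show the construction
g = np.array([1.0, 2.0, 3.0])            # transmitter*channel response, N = 3
F = np.array([0.5, 0.25, 0.125])         # receiver filter coefficients, L = 3
N, L = len(g), len(F)

# (N+L-1) x L Toeplitz matrix: column j is g shifted down by j samples
G = np.zeros((N + L - 1, L))
for j in range(L):
    G[j:j + N, j] = g

h = G @ F                                # same result as np.convolve(g, F)
```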
After 4× down sampling at the receiver, the received signal can be represented as

ŝ(k) = h(0)s(k) + Σ_{i=-[(N+L-2)/8], i≠0}^{[(N+L-2)/8]} h(4i)s(k - i) + n0(k) + nACI(k),   (3)

where the first term on the right hand side of equation (3) is the desired signal, the second term represents the ISI, the third term represents the noise present in the received signal and the fourth term represents the adjacent channel interference. The transmitted signal is s(k) and the received signal is ŝ(k). The mean square error between the transmitted and the received signal is given by

Minimize: MSE = E[(s(k) - ŝ(k))^2]   (4)
By equation (2), we have h(i) = Gi F, where Gi is the i-th row of the matrix G. We define a matrix Ĝ made of the rows G_{4i}, where -[(N + L - 2)/8] ≤ i ≤ [(N + L - 2)/8] and i ≠ 0.
Let the frequency response of the receiver filter at frequency fi be represented as ₣iF, where ₣i is a row of the complex Fourier transform matrix ₣ corresponding to the frequency fi, i.e.,

₣i = [1  exp(j2πfi/Fs)  ...  exp(j2πfi(L-1)/Fs)],

where Fs is the sampling frequency. The MSE in (4) is approximately

MSE ≈ ||ĜF - δ||^2 + N0||F||^2 + λ||₣^F||^2   (5)

where N0 is the power spectral density of the channel noise, λ is the flat power spectral density in the adjacent frequency band,

δ = [0 ... 1 ... 0]^T,
₣^ = [₣j^T  ₣j+1^T  ...  ₣M^T]^T,

fj ranges over 740 Hz to 3.4 kHz (voice frequency) and M is the number of frequency sampling points.
Thus, the minimum mean square error problem (4) becomes

Minimize: ||ĜF - δ||^2 + N0||F||^2 + λ||₣^F||^2   (6)
The receiver filter which minimizes the mean square estimation error in (6) is

F = (Ĝ^T Ĝ + N0 I + λ (₣^)^H ₣^)^(-1) Ĝ^T δ   (7)
where I is an identity matrix. This analytic solution can be applied to design an adaptive receiver filter with the channel being estimated. In designing the receiver filter, the ACI power and the channel noise are not known in advance. The parameters λ and N0 can be adjusted to meet the side lobe requirement and to optimize the transition band, but the adjustment of these parameters is neither easy nor straightforward.
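The closed form in (7) can be prototyped as below. This is a sketch under assumptions, not the paper's simulation setup: the combined response g is a random stand-in, the ACI term is dropped (λ = 0), the normal-equations form ĜᵀĜ is used, and the center row (h(0), target 1) is kept in the symbol-spaced matrix, a slight simplification of the Ĝ defined in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 16, 16                       # toy filter lengths, N + L even
g = rng.standard_normal(N)          # stand-in for transmitter*channel response
n0 = 1e-4                           # channel noise PSD; ACI term dropped

# Full (N+L-1) x L convolution (Toeplitz) matrix of g
G = np.zeros((N + L - 1, L))
for j in range(L):
    G[j:j + N, j] = g

# Symbol-spaced rows G_{4i} around the center tap
m = (N + L - 2) // 8
center = (N + L - 2) // 2
Gs = G[[center + 4 * i for i in range(-m, m + 1)]]
delta = np.zeros(2 * m + 1)
delta[m] = 1.0                      # desired: h(0) = 1, zero ISI taps

# Closed-form MMSE solution (normal-equations form of (7), lambda = 0)
F = np.linalg.solve(Gs.T @ Gs + n0 * np.eye(L), Gs.T @ delta)
```

For small N0 the combined response at the symbol-spaced taps approaches the ideal delta, i.e. the ISI terms in (3) are driven toward zero.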
4. Comparison between the designed filters
The frequency responses of the optimal receiver filters designed using the MMSE approach and the optimal matched approach are shown in Figure 2. We observe that the MMSE receiver filter has a flatter frequency response than the matched filter in the passband. This frequency response is close to that of the root Nyquist filter, which has a flat frequency response in the passband. We also observe that the receiver filter designed using the MMSE approach has a steeper skirt than the filter designed using the matched filter approach.
Now we compare the receiver filters designed by both approaches using a simple wireless communication system. We assume that the wireless channel is frequency non-selective and has only one path. In 3G systems, we need data rates as high as possible; to increase the data transmission throughput, we have to use spectrally efficient modulation schemes such as 16-QAM. With the increased data throughput, there is a severe ISI problem.
[Figure: normalized magnitude response (dB) versus frequency (MHz) of the matched filter and the MMSE filter]
Figure 2. Frequency response of
receiver pulse shaping filters
We know that the eye diagram provides a great deal of useful information about the performance of a data transmission system. The eye diagram obtained using the 48-tap optimal matched receiver filter shows that the eyes are very small due to high ISI.
[Figure: eye diagrams, amplitude versus time, for the in-phase and quadrature signals]
Figure 3. Eye diagram of
received signal using optimal
matched filter
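An eye diagram such as those in Figures 3 and 4 is produced by overlaying waveform segments one to two symbol periods long; a minimal sketch of the segmentation is below (the sinusoid is only a stand-in for a filtered signal, and the oversampling factor is an arbitrary choice).

```python
import numpy as np

def eye_traces(x, samples_per_symbol, span=2):
    """Cut a waveform into overlapping traces of `span` symbol
    periods each; plotting them on top of one another gives
    the eye diagram."""
    seg = span * samples_per_symbol
    n = (len(x) - seg) // samples_per_symbol
    return np.array([x[i * samples_per_symbol: i * samples_per_symbol + seg]
                     for i in range(n)])

x = np.sin(2 * np.pi * np.arange(400) / 40)   # stand-in for a filtered signal
traces = eye_traces(x, samples_per_symbol=4, span=2)
```

Wide-open traces at the symbol-spaced sampling instants indicate low ISI and a large noise margin; collapsed traces indicate the opposite.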
Figure 4 shows the eye diagram for the receiver pulse shaping filter designed using the MMSE approach. From this eye diagram, we can see that a 16-QAM modulated signal can be received reliably using this 48-tap receiver filter. We can also see that the receiver filter designed using the MMSE approach is more tolerant of sampling time errors in the received signal and provides a larger noise margin to the system than the receiver filter designed using the optimal matched approach.
[Figure: eye diagrams, amplitude versus time, for the in-phase and quadrature signals]
Figure 4. Eye diagram of
received signal using receiver
MMSE filter
We compare the optimal matched receiver filter and the optimal MMSE receiver filter for tap lengths L = 36, 48 and 64 respectively. From Table 1, we observe that the MMSE receiver filter can provide higher SINR, but the cost we bear for the significant SINR gain is a slight degradation of SNR. We could achieve higher SINR using the matched filter with a larger number of taps, but this increases the power consumption at the mobile units. We can increase the SNR of the received signal by increasing the power spectral density of the channel noise in the MMSE filter; when the channel noise = ∞, the MMSE filter is the same as the optimal matched filter.
Table 1. Comparison between optimal matched filter and MMSE filter with different numbers of taps

Number of taps         |   36    |   48    |   64
SINR/SINRmatched (dB)  |  5.364  |  9.865  | 20.4056
SNR/SNRmatched (dB)    | -0.5171 | -0.2250 | -0.3534

5. Conclusion

In 3G and beyond-3G systems, a higher SIR of the received signal is required so that high order modulation schemes such as 8-PSK and 16-QAM can be applied, from which we can achieve high data
transmission throughput. By designing the receiver filter using the approaches presented in Section 3, we can achieve high data transmission throughput. The first design method is based on the optimal matched criterion, and the second approach is based on optimizing an MMSE criterion, which provides an analytic solution for the filter coefficients. By simulations, we demonstrated that there is a significant improvement in performance when using the optimized MMSE receiver pulse shaping filter over the optimal matched filter.
References
[1] N.C. Beaulieu, C.C. Tan, and M.O. Damen. A “better than” Nyquist pulse. IEEE Communications Letters, 5(9):367
[2] T. Berger and D.W. Tufts.
Optimum pulse amplitude
modulation part I: Transmitter-
receiver design and bounds from
information theory. IEEE Trans.
Information Theory, 13(2):196 –208, April 1967.
[3] J.O. Coleman and D.W. Lytle.
Linear programming techniques for
the control of intersymbol
interference with hybrid FIR/analog
pulse shaping. In IEEE Int. Conf.
Commun., Chicago, IL,
June 1992.
[4] T.N. Davidson, Z.Q. Luo, and
K.M. Wong. Design of orthogonal
pulse shapes for
communications via semidefinite
programming. IEEE Trans. Signal
Processing, 48(5):1433–
1445, May 2000.
[5] H. Samueli. On the design of
FIR digital data transmission filters
with arbitrary magnitude
specifications. IEEE Trans.
Circuits Syst., 38(12):1563 –1567,
December 1991.
[6] L. Tong, G. Xu, and T.
Kailath. Blind identification and
equalization based on second-order
statistics: A time domain approach.
IEEE Trans. Information Theory,
40(2):340–349,
March 1994.
[7] J. Tuqan. On the design of
optimal orthogonal finite order
transmitter and receiver filters for
data transmission over noisy
channels. In Proc. of the 34th
Asilomar Conf. on Signals,
Systems and Computers, volume 2,
pages 1303 – 1307, October 2000.
[10] Haykin Simon. “Adaptive
filter theory” fourth edition ,
Pearson Education , Delhi, pp.
436-460.
Signal Acquisition and Analysis System Using LabVIEW

Subhransu Padhee, Yaduvir Singh
Department of Electrical and Instrumentation Engineering
Thapar University, Patiala, 147004, Punjab [email protected], [email protected]
Abstract- In the present era, virtual instrumentation is considered a separate discipline of engineering education. It has replaced conventional techniques of measurement and data acquisition and taken instrumentation experiments to a new level. With easy-to-use, graphical-programming-enabled software, supported by dedicated, easy-to-use hardware, virtual instrumentation has transformed the notion of engineering education and simulation based experiments.
This paper gives a brief idea of the need for, and advantages of, virtual instrumentation in engineering education and discusses the need for distant laboratories in engineering education. It also develops a simple application for signal acquisition, analysis and storage.
Keywords- LabVIEW, virtual instrumentation
I. INTRODUCTION
Acquiring multiple data (analog or discrete in nature) from the field or process at high speed using a multi-channel data acquisition system, processing the data with the help of a data processing algorithm and a computing device, and displaying the data for the user is the elementary need of any industrial automation system [1,2,3,4].
Modern day process plants, construction sites,
agricultural industry [11], petroleum, wireless
sensor network [16], power distribution network
[17], refinery industry, renewable energy system
[10,28] and every other industry where data is of
prime importance, wireless data acquisition, data processing and data logging equipment is used. Acquiring data from the field with the help of different sensors is always challenging. Different kinds of noise are superimposed on the data which comes from the field through transducers and the data acquisition system. After acquiring data from the field, the signal processing operation is performed. In the signal processing operation, the different noises superimposed on the original process signal are removed and the signal is amplified, so that the signal keeps its original traits and the data carried by the signal remains intact. After the signal processing part, the data is given to a data processing algorithm which processes the data and stores it in a memory unit.
With the advancement of technology, personal computers with PCI, PXI/CompactPCI, PCMCIA, USB, IEEE 1394, ISA, VXI, serial and parallel ports are used for data acquisition, test and measurement, and automation. Personal computers are linked to the real world process with the help of OPC and DDE protocols, and application software is used to form a closed loop interaction between the real world process, the application software and the personal computing unit. Many of the networking
technologies that have already been available for a
long time in industrial automation (e.g., standard
and/or proprietary field and control level buses),
besides having undertaken great improvements in
the last few years, have also been progressively
integrated by newly introduced connectivity
solutions (Industrial Ethernet, Wireless LAN, etc.).
They have greatly contributed to the technological
renewal of a large number of automation solutions
in already existing plants. Obviously, even the
software technologies involved in the
corresponding data exchange processes have been
greatly improved; as an example, today it is
possible to use a common personal computer in
order to implement even complex remote
supervisory tasks of simple as well as highly
sophisticated industrial plants.
This paper gives an overview of the modern industrial automation system, comprising data acquisition systems and data loggers. It develops a secured data acquisition and analysis module using the virtual instrumentation concept. With the help of this system, the operator can securely log in and perform the desired signal acquisition and analysis operations. The system also stores the relevant data for future reference and record keeping purposes.
II. INDUSTRIAL AUTOMATION SYSTEM
Most measurements begin with a transducer, a
device that converts a measurable physical quantity,
such as temperature, strain, or acceleration, to an
equivalent electrical signal. Transducers are
available for a wide range of measurements, and
come in a variety of shapes, sizes, and
specifications. Signal conditioning can include
amplification, filtering, differential applications,
isolation, simultaneous sample and hold (SS&H),
current-to-voltage conversion, voltage-to-frequency
conversion, linearization and more.
Figure 1: Block diagram of data acquisition and
logging
Figure 1 shows the schematic diagram of
data acquisition system. Sensor is used to sense the
physical parameters from the real world. The output
of the sensor is provided to the signal conditioning
element. The main purpose of signal conditioning
element is to remove the noise of the signal and
amplify the signal. The output of the signal
conditioning system is provided to ADC. The ADC
converts the analog signal to the equivalent digital
data. The equivalent digital data is then fed to the
computer, which acts both as a controller and
display element.
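As an illustration of the chain just described, the following Python sketch models a hypothetical sensor signal passed through signal conditioning (amplification and a simple moving-average filter) and an ideal 12-bit ADC. All numerical values (gain, voltage range, resolution, sampling rate) are illustrative assumptions, not specifications from the paper.

```python
import numpy as np

# Hypothetical model of the acquisition chain described above: a sensor
# voltage is amplified and low-pass filtered (signal conditioning), then
# quantized by an ideal 12-bit ADC before reaching the computer.
def condition(signal, gain=10.0):
    """Amplify, then smooth with a 5-point moving average (noise removal)."""
    kernel = np.ones(5) / 5.0
    return np.convolve(gain * signal, kernel, mode="same")

def adc(signal, v_range=(-10.0, 10.0), bits=12):
    """Map a voltage to the equivalent digital code of an ideal ADC."""
    lo, hi = v_range
    clipped = np.clip(signal, lo, hi)
    levels = 2 ** bits - 1
    return np.round((clipped - lo) / (hi - lo) * levels).astype(int)

# Simulated sensor output: a 50 Hz component plus noise, sampled at 1 kHz.
rng = np.random.default_rng(0)
t = np.arange(0, 0.1, 1e-3)
sensor = 0.5 * np.sin(2 * np.pi * 50 * t) + 0.05 * rng.standard_normal(t.size)
codes = adc(condition(sensor))
```

The digital codes would then be handed to the computer, which acts as controller and display element.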
Once data has been acquired, there is a need
to store it for current and future reference. Today,
alternative methods of data storage embrace both
digital computer memory and the old traditional
standby, paper. There are two principal areas where
recorders or data loggers are used. Recorders and
data loggers are used in measurements of process
variables such as temperature, pressure, flow, pH,
humidity; and also used for scientific and
engineering applications such as high-speed testing
(e.g., stress/strain), statistical analyses, and other
laboratory or off-line uses where a graphic or
digital record of selected variables is desired.
Digital computer systems have the ability to
provide useful trend curves on CRT displays that
could be analyzed.
III. VIRTUAL INSTRUMENTATION IN DISTANT LAB
Virtual instrumentation is used to improve the
learning methodology in different disciplines of
engineering. The technique is easy to use, easy to
understand and cost effective. Its main feature is
that various simulations can be performed with the
help of programming, simulations which would be very
difficult to perform in hardware. State-of-the-art
virtual instrumentation systems which enhance the
learning experience of students have been reported
in the literature for several disciplines, including
mechanical engineering [6], power plant training [8],
electronics [9], control systems [12], chemical
engineering [14], ultrasonic range measurement
[20], biomedical engineering [21, 22], power systems [23, 24],
electrical machines [25], and intelligent control [31].
Laboratories in engineering and applied science
have important effects on student learning. Most
educational institutions construct their own
laboratories individually. Alternatively, some
institutions establish laboratories, which can be
conducted remotely via the internet. Different
researchers have proposed the concept of the distant
laboratory [7, 18, 19] using the internet [27] and the
intranet [26], and have proposed different
hardware and software architectures for remote
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0202-3
laboratories. The general structure of a remote
laboratory is almost the same in every academic
study: remote clients, a server computer
equipped with an I/O module, and a remote
experimental setup connected to the server.
Figure 2: Architecture of remote laboratory
As Figure 2 illustrates, the architecture of the
remote laboratory consists of a server computer
with an industrial network card. Since the network
card is plugged into a PCI slot, it is called a PCI card.
It provides the required protocol operations for the
controller area network.
IV. CASE STUDY
This section develops a signal processing
application using LabVIEW. The application can
be used to teach students the basics of virtual
instrumentation and signal processing: with it,
a student can gain basic knowledge of signal
processing and perform different application-oriented
experiments using LabVIEW without needing a
CRO or DSO. Figure 3 shows
the front panel of the application.
Figure 3: Front panel of the system
This operator console consists of four buttons, and
for security reasons the operator has to log in using an
authenticated username and password to access all
other functionality of the system. Figure 4 shows
the login screen. This screen appears when the login
button is pressed.
Figure 4: Front panel of login screen for operator
Figure 5 shows the data acquisition module of the
system where there is control to set the desired
amplitude and frequency of the signal. Noise of
certain amplitude can be added with the signal. This
module shows both the time domain and frequency
domain representation of the noisy signal.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0202-4
Figure 5: Front panel for time domain analysis of
the acquired noisy signal
Figure 5 shows the time domain
representation of the noisy signal, whereas Figure 6
shows the frequency domain representation of the
signal. The frequency domain representation involves
Fourier analysis of the signal.
Figure 6: Front panel for frequency domain analysis
of the acquired signal
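The behaviour of the acquisition module can be sketched in a few lines of Python: a sine wave of operator-chosen amplitude and frequency is corrupted with noise, and its frequency-domain representation is obtained through an FFT, as in the module's Fourier analysis. The sampling rate, tone and noise level are illustrative assumptions.

```python
import numpy as np

fs = 1000.0                       # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1.0 / fs)   # one-second record
amp, freq = 2.0, 50.0             # operator-selected amplitude and frequency
rng = np.random.default_rng(0)
noisy = amp * np.sin(2 * np.pi * freq * t) + 0.3 * rng.standard_normal(t.size)

# Frequency-domain representation via Fourier analysis of the noisy signal.
spectrum = np.abs(np.fft.rfft(noisy)) / t.size
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
peak = freqs[np.argmax(spectrum[1:]) + 1]   # dominant non-DC component
```

Despite the added noise, the spectral peak sits at the operator-selected frequency, which is what the frequency-domain view of the module makes visible.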
The third module of the system is the analysis
module, in which the operator can
select a certain portion of the signal using the
available pointer. The selected portion is
displayed in the subplot, and its DC value, RMS value,
average value and mean value are
displayed. Figure 7 shows the front panel
for waveform analysis.
Figure 7: Front panel for analysis of the subset of
signal
These results can be analyzed and logged to a file
for record keeping and further analysis.
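The analysis module's computations can be sketched as follows. The cursor selection is modelled as a simple index range, and the mapping of "DC value" to the mean is an assumption made for illustration; the log-file path and format are likewise hypothetical.

```python
import json
import os
import tempfile

import numpy as np

# The cursor selection is modelled as an index range; DC (taken here as the
# mean), RMS and average rectified value of the portion are computed and
# appended to a log file for record keeping.
def analyze(portion):
    portion = np.asarray(portion, dtype=float)
    return {
        "dc": float(np.mean(portion)),                # DC value (mean)
        "rms": float(np.sqrt(np.mean(portion ** 2))), # root mean square
        "avg": float(np.mean(np.abs(portion))),       # average rectified value
    }

t = np.arange(0, 1.0, 1e-3)
signal = np.sin(2 * np.pi * 5 * t)
selection = signal[200:400]                  # hypothetical cursor selection
results = analyze(selection)

log_path = os.path.join(tempfile.gettempdir(), "analysis_log.jsonl")
with open(log_path, "a") as fh:              # append, keeping older records
    fh.write(json.dumps(results) + "\n")
```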
V. CONCLUSIONS
This paper emphasizes the data acquisition,
supervisory control and data logging aspects of an
industrial process. These areas are of prime
importance for computer control of an industrial
process. The signal is acquired from the field, and
different signal processing and analysis functions are
performed on a selected portion of the acquired
signal. The selected portions of the
signal, along with their computed values, are stored
in a log file for record keeping and future reference
and analysis.
As future scope of this paper, a wireless
web-based data acquisition, data logging and
supervisory control system can be implemented.
The main advantage of a wireless web-based data
acquisition system is that any authorized person
anywhere in the world can access the real-time
process data with the help of the internet. The main
areas of concern for a web-based data logging and
supervisory control system are the security of the data
and the authentication of the user. To address these
security needs, a firewall can be implemented.
References
[1] Joseph Luongo, “A Multichannel Digital
Data Logging System,” IRE Transactions on
Instrumentation, Jun 1958, pp. 103-106.
[2] Rik Pintelon, Yves Rolain, M. Vanden
Bossche and J. Schoukens, “Towards an
Ideal Data Acquisition Channel,” IEEE
Transactions on Instrumentation and
Measurement, vol. 39, no. 1, Feb 1990, pp.
116-120.
[3] Deichert, R.L., Burris, D.P., Luckemeyer, J.,
“Development of a High Speed Data
Acquisition System Based on LabVIEW
and VXI,” in Proceedings of IEEE
Autotestcon, Sep 1997, pp. 302-307.
[4] F. Figueroa, S. Griffin, L. Roemer and J.
Schmalzel, “A Look into the Future of Data
Acquisition”, IEEE Instrumentation and
Measurement Magazine, vol. 2, issue 4,
Dec1999, pp. 23–34.
[5] A. Ferrero, L. Cristaldi and V. Piuri,
“Programmable Instruments, Virtual
Instruments, and Distributed Measurement
Systems: what is Really Useful, Innovative,
and Technically Sound”, IEEE
Instrumentation and Measurement
Magazine, vol. 2, issue 3, Sep 1999, pp. 20–
27.
[6] P. Strachan, A. Oldroyd, M. Stickland,
“Introducing Instrumentation and Data
Acquisition to Mechanical Engineers Using
LabVIEW,” International Journal of
Engineering Education, vol. 16, no. 4, Jan
2000, pp. 315-326
[7] K K Tan, T H Lee, F M Leu, “Development
of a Distant Laboratory Using LabVIEW,”
International Journal of Engineering
Education, vol. 16, no. 3, 2000, pp. 273-282
[8] Amit Chaudhuri, Amitava Akuli and Abhijit
Auddy, “Virtual Instrumentation Systems-
Some Developments in Power Plant
Training and Education,” IEEE ACE, Dec
2002
[9] Melanie L Higa, Dalia M Tawy and Susan
M Lord, “An Introduction to LabVIEW
Exercise for an Electronic Class,” 32nd
ASEE/IEEE Frontiers in Education
Conference, Nov 2002, T1D-13-T1D-16
[10] Recayi Pecen, M.D Salim, Ayhan Zora, “A
LabVIEW Based Instrumentation System
for a Wind-Solar Hybrid Power Station,”
Journal of Industrial Technology, vol. 20,
no. 3, Jun-Aug 2004.
[11] Sarang Bhutada, Siddarth Shetty, Rohan
Malye, Vivek Sharma, Shilpa Menon,
Radhika Ramamoorthy, “Implementation of
a Fully Automated Greenhouse using
SCADA Tool like LabVIEW,” in
Proceedings of the 2005 IEEE/ASME
International Conference on Advanced
Intelligent Mechatronics, Jul 2005, pp. 741-
746.
[12] Samuel Daniels, Dave Harding, Mike
Collura, “Introducing Feedback Control to
First Year Engineering Students Using
LabVIEW,” in Proceedings of 2005
American Society for Engineering
Education Annual Conference &
Exposition, 2005, pp. 1-12
[13] Mihaela Lascu and Dan Lascu, “Feature
Extraction in Digital Mammography Using
LabVIEW,” 2005 WSEAS International
Conference on Dynamical Systems and
Control, Nov 2005, pp. 427-432
[14] V M Cristea, A Imre-Lucaci, Z K Nagy and
S P Agachi, “E-Tools for Education and
Research in Chemical Engineering,”
Chemical Bulletin, vol. 50, issue 64, 2005,
pp. 14-17
[15] Ziad Salem, Ismail Al Kamal, Alaa Al
Bashar, “A Novel Design of an Industrial
Data Acquisition System,” in Proceedings
of International Conference on Information
and Communication Techniques, Apr 2006,
pp. 2589-2594.
[16] Aditya N. Das, Frank L. Lewis, Dan O.
Popa, “Data-logging and Supervisory
Control in Wireless Sensor Networks,” in
Proceedings of the 7th ACIS International
Conference on Software Engineering,
Artificial Intelligence, Networking and
Parallel/Distributed Computing (SNPD’06),
2006, pp. 1-12
[17] K. S Swarup and P. Uma Mahesh,
“Computerized Data Acquisition for Power
System Automation,” in Proceedings of
Power India Conference, Jun 2006, pp. 1-7.
[18] Francesco Adamo, Filippo Attivissimo,
Giuseppe Cavone, Nicola Giaquinto,
“SCADA/HMI Systems in Advanced
Educational Courses,” IEEE Transactions
on Instrumentation and Measurement, vol.
56, no. 1, Feb 2007, pp. 4-10.
[19] Vu Van Tan, Dae-Seung Yoo, Myeong-Jae
Yi, “A Novel Framework for Building
Distributed Data Acquisition and
Monitoring System,” Journal of Software,
vol.2, no.4, Oct 2007, pp. 70-79
[20] A Hammad, A Hafez, M T Elewa, “A
LabVIEW Based Experimental Platform for
Ultrasonic Range Measurements,” DSP
Journal, vol. 6, issue 2, Feb 2007, pp. 1-8
[21] Shekhar Sharad, “A Biomedical
Engineering Start Up Kit for LabVIEW,”
American Society for Engineering Education,
2008
[22] Steve Warren and James DeVault, “A Bio
Signal Acquisition and Conditioning Board
as a Cross-Course Senior Design Project,”
in Proceedings of the 38th ASEE/IEEE Frontiers
in Education Conference, 2008, pp. S3C1-
S3C6
[23] S K Bath, Sanjay Kumra, “Simulation and
Measurement of Power Waveform
Distortion Using LabVIEW,” IEEE
International Power Modulators and High
Voltage Conference, May 2008, pp. 427-
434
[24] Nikunja K Swain, James A Anderson and
Raghu B. Korrapati, “Study of Electrical
Power Systems using LabVIEW VI
Modules,” in Proceedings of 2008 IAJC-
IJME International Conference, 2008
[25] M. Usama Sadar, “Synchronous Generator
Simulation Using LabVIEW,” World
Academy of Science, Engineering and
Technology, 39, 2008, pp. 392-400
[26] Muhammad Noman Ashraf, Syed Annus
Bin Khalid, Muhammad Shahrukh Ahmed,
Ahmed Munir, “Implementation of Intranet-
SCADA using LabVIEW based Data
Acquisition and Management,” in
Proceedings of International Conference on
Computing, Engineering and Information,
2009, pp. 244-249.
[27] Zafer Aydogmus, Omur Aydogmus, “A
Web-Based Remote Access Laboratory
Using SCADA,” IEEE Transactions on
Education, vol. 52, no. 1, Feb 2009.
[28] Li Nailu, Lv Yuegang, Xi Peiyu, “A Real
Time Simulation System of Wind Power
Based on LabVIEW DSC Module and
Matlab/Simulink,” in Proceedings of The
Ninth International Conference on
Electronic Measurement & Instruments,
Aug 2009, pp. 1-547-1-552.
[29] Hiram E Ponce, Dejanira Araiza and Pedro
Ponce, “A Neuro-Fuzzy Controller for
Collaborative Applications in Robotics
Using LabVIEW,” Applied Computational
Intelligence and Soft Computing, Hindawi
Publishing Corporation, vol. 2009, 2009, pp.
1-9
[30] Akif Kutlu, Kubilay Tasdelen, “Remote
Electronic Experiments Using LabVIEW
Over Controller Area Network,” Scientific
Research and Essays, vol. 5(13), Jul 2010,
pp. 1754-1758
[31] Pedro Ponce Cruz, Arturo Molina Gutierrez,
“LabVIEW for Intelligent Control Research
and Education,” 4th IEEE International
Conference on E-Learning in Industrial
Electronics, Nov 2010, pp. 47-54
[32] David McDonald, “Work In Progress
Introductory LabVIEW Real Time Data
Acquisition Laboratory Activities,” ASEE
North Central Sectional Conference, Mar
2010, pp. 1B-1-1B-6
Abstract- Orthogonal Frequency Division
Multiplexing (OFDM) is a scheme which divides the
available spectrum into subchannels. The
subchannels are narrowband, which makes
equalization very simple. Intercarrier
interference (ICI) between the subcarriers occurs due to
frequency offset: OFDM is sensitive to any
frequency offset between the transmitted and
received carrier frequencies. This results, first,
in a reduction of the signal amplitude at the output
of the filters matched to each of the carriers
and, second, in the introduction of ICI from the
other carriers. Two methods are
investigated for combating the effects of ICI:
ICI Self-Cancellation (SC) and the Extended
Kalman Filter (EKF) method. The methods are
compared in terms of bandwidth efficiency and
bit error rate. The EKF method performs better
than the SC method, as shown by
simulations (up to 256-QAM).
Keywords- Orthogonal Frequency Division
Multiplexing (OFDM); Inter-Carrier
Interference (ICI); Carrier-to-Interference Power
Ratio (CIR); Self-Cancellation (SC); Carrier
Frequency Offset (CFO); Extended Kalman
Filtering (EKF).
I. INTRODUCTION
The basic principle of OFDM is to split a high-rate
data stream into a number of lower-rate streams
that are transmitted simultaneously over a number
of subcarriers [1].
One limitation of OFDM in many applications is
that it is very sensitive to frequency errors caused
by frequency differences between the local
oscillators in the transmitter and the receiver [3]–
[5]. Frequency offset causes rotation and
attenuation of each of the subcarriers, and
intercarrier interference (ICI) between subcarriers
[4]. Many methods have been developed to reduce
this sensitivity to frequency offset, including
windowing of the transmitted signal [6], [7] and the
use of self-ICI-cancellation schemes [8]. Here in
this paper, the effects of ICI have been analysed
this paper, the effects of ICI have been analysed
and two solutions to combat ICI have been
presented. The first method is a self-cancellation
scheme[1], in which redundant data is transmitted
onto adjacent sub-carriers such that the ICI
between adjacent sub-carriers cancels out at the
receiver. The second method, the extended
Kalman filter (EKF), statistically estimates the
frequency offset and corrects it [7], using
the estimated value at the receiver. The work
presented in this paper concentrates on a
quantitative ICI power analysis of the ICI
cancellation scheme, which has not been studied
previously. The average carrier-to-interference
power ratio (CIR) is used as the ICI level
indicator, and a theoretical CIR expression is
derived for the proposed scheme.
METHODS OF INTERCARRIER INTERFERENCE
CANCELLATION FOR ORTHOGONAL FREQUENCY
DIVISION MULTIPLEXING
Dr. R. L. Yadav, Prof., ECE Dept., Galgotia College of Engg. & Tech., Greater Noida, email: [email protected]
Mrs. Dipti Sharma, Sr. Lecturer, Apex Institute of Tech., Rampur, email: [email protected]

II. OFDM SYSTEM DESCRIPTION
The OFDM system multiplexes the input bit stream
into N symbol streams, each with symbol period Ts,
and each symbol stream is used to modulate one of N
parallel, synchronous sub-carriers [10]. The
sub-carriers are spaced by 1/Ts in
frequency; thus they are orthogonal over the
interval (0,Ts). Then, the N symbols are mapped
to bins of an inverse fast Fourier transform
(IFFT). The IFFT bins correspond to the
orthogonal sub-carriers in the OFDM symbol.
Thus, the OFDM symbol is expressed as
x(n) = (1/N) Σ_{m=0}^{N-1} X_m e^{j2πmn/N},  0 ≤ n ≤ N-1   (1)
where the X_m's are the baseband symbols on each
sub-carrier. The analog time-domain signal is
obtained using digital to analog(D/A) converter.
This discrete signal is demodulated using an N-
point Fast Fourier Transform (FFT) operation at
the receiver. The demodulated symbol is
Y(m) = Σ_{l=0}^{N-1} X(l) S(l-m) + W(m)   (2)
where S(l-m) are the ICI coefficients arising from
the carrier frequency offset and W(m) corresponds to
the FFT of the samples of w(n), which is the Additive
White Gaussian Noise (AWGN) in the channel. Then, the
signal is down-converted and transformed to a
digital sequence after passing through an Analog-to-
Digital Converter (ADC). The following step is
to pass the resulting time-domain samples through a
serial-to-parallel converter and to compute the N-
point FFT. The resulting Yi complex points are
the complex baseband representation of the N
modulated sub-carriers, as the broadband channel
has been decomposed into N parallel sub-channels.
Each sub-channel then needs an equalizer; these blocks
are called Frequency Domain Equalizers
(FEQ). The bits from the transmitter are thus received at
high data rates at the receiver.
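The modulation/demodulation path described above can be sketched as an IFFT/FFT round trip. The QPSK mapping and the ideal (noiseless, offset-free) channel below are illustrative simplifications, not part of the paper's setup.

```python
import numpy as np

# Minimal sketch of the OFDM path described above: N baseband symbols X_m
# are mapped to IFFT bins (orthogonal sub-carriers) and recovered at the
# receiver with an N-point FFT. QPSK symbols, ideal channel assumed.
N = 64
rng = np.random.default_rng(1)
bits = rng.integers(0, 2, size=2 * N)
qpsk = (1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])  # map bit pairs

tx_time = np.fft.ifft(qpsk)        # OFDM symbol x(n)
rx_symbols = np.fft.fft(tx_time)   # demodulation at the receiver

recovered_ok = np.allclose(rx_symbols, qpsk)
```

With no channel distortion the FFT exactly inverts the IFFT, which is what the orthogonality of the sub-carriers guarantees; the ICI discussed in the next section appears only once a frequency offset is introduced.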
III. ICI SELF CANCELLATION SCHEME
A. Self-Cancellation
ICI self-cancellation is a scheme that was
introduced by Zhao and Sven-Gustav Häggman [1]
in order to combat and suppress ICI in OFDM.
The input data is modulated onto groups of
subcarriers with coefficients such that the ICI
signals generated within each group cancel each
other; hence the name self-cancellation method.
1) Cancellation Method
The data pair (X, -X) is modulated onto two
adjacent subcarriers (l, l+1). The ICI signals
generated by subcarrier l are then cancelled out
significantly by the ICI generated by
subcarrier l+1. For a further reduction
of ICI, the ICI cancellation demodulation scheme
is used. In this scheme, the signal at the (k+1)th
subcarrier is multiplied by -1 and then added to
the one at the kth subcarrier. The resulting data
sequence is then used for making the symbol decision.
2). ICI Cancelling Modulation
The ICI self-cancellation scheme requires that the
transmitted signals be constrained such that
X(1) = -X(0), X(3) = -X(2), ..., X(N-1) = -X(N-2).
Using this assignment of transmitted symbols
allows the received signal on subcarriers k and
k+1 to be written as
Y'(k) = Σ_{l even} X(l) [S(l-k) - S(l+1-k)] + W(k)   (4)
and the ICI coefficient S'(l-k) is referred to as
S'(l-k) = S(l-k) - S(l+1-k)   (5)
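The cancellation effect of (5) can be checked numerically. The closed form used for S(l-k) below is the standard expression from the self-cancellation literature and is an assumption here, since the paper gives its equations only by number.

```python
import numpy as np

# Standard ICI coefficient for a normalized frequency offset eps (assumed
# closed form, from the self-cancellation literature):
#   S(d) = sin(pi(d+eps)) / (N sin(pi(d+eps)/N)) * exp(j pi (1-1/N)(d+eps))
def S(d, eps, N):
    x = np.asarray(d, dtype=float) + eps
    return (np.sin(np.pi * x) / (N * np.sin(np.pi * x / N))
            * np.exp(1j * np.pi * (1.0 - 1.0 / N) * x))

N, eps = 128, 0.4
d = np.arange(2, 20)                               # interfering distances l-k
S_plain = np.abs(S(d, eps, N))                     # standard OFDM
S_prime = np.abs(S(d, eps, N) - S(d + 1, eps, N))  # eq. (5), self-cancelled

# The self-cancelled coefficients are markedly smaller at every distance.
reduction_db = 20 * np.log10(S_plain / S_prime)
```

The adjacent coefficients are nearly aligned in phase, so their difference in (5) is much smaller than either term, which is the mechanism behind the CIR improvement reported below.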
Fig.1 Comparison of |S(l-k)|, |S'(l-k)|, and |S''(l-k)| for N = 128 and ε = 0.4
3) ICI Canceling Demodulation
ICI-cancelling modulation introduces redundancy in the
received signal, since each pair of subcarriers
transmits only one data symbol. This
redundancy can be exploited to improve the
system's power performance, although it surely
decreases the bandwidth efficiency. To take
advantage of this redundancy, the received
signal at the (k+1)th subcarrier, where k is
even, is subtracted from the kth subcarrier.
Fig.2 An example of S(l-k) for N = 16, l = 0. (a) Amplitude of S(l-k). (b) Real part of S(l-k). (c) Imaginary part of S(l-k).
This is expressed mathematically as
Y''(k) = Y'(k) - Y'(k+1)   (6)
Subsequently, the ICI coefficient for this
received signal becomes
S''(l-k) = -S(l-k-1) + 2S(l-k) - S(l-k+1)   (7)
When compared to the ICI coefficients S(l-k)
for the standard OFDM system
and S'(l-k) for the ICI cancelling modulation,
S''(l-k) has the smallest ICI coefficients for the
majority of l-k values, followed by S'(l-k) and
S(l-k). The combined modulation and demodulation
method is called the ICI self-cancellation scheme.
The theoretical CIR can be derived as
CIR = |-S(-1) + 2S(0) - S(1)|^2 / Σ_{l=2,4,6,...}^{N-2} |-S(l-1) + 2S(l) - S(l+1)|^2   (8)
As mentioned above, the redundancy in this
scheme reduces the bandwidth efficiency by half.
This could be compensated for by transmitting signals
of a larger alphabet size. Fig. 3 shows the
model of the proposed method.
Fig.3 OFDM Simulation Model
The ICI self-cancellation scheme can be combined
with error-correction coding. The proposed
scheme provides significant CIR improvement,
which has been studied theoretically and by
simulation. Fig. 4 compares the
theoretical CIR curve of the ICI self-cancellation
scheme with the calculated CIR of a standard
OFDM system. As expected, the CIR
is greatly improved using the ICI self-cancellation
scheme; the improvement can be greater than
15 dB for 0 < ε < 0.5.
Fig.4 CIR versus ε for a standard OFDM system
IV. EXTENDED KALMAN FILTERING
A. Problem Formulation
A state-space model of the discrete Kalman filter
is defined as
z (n) = a (n)d(n) + v(n) (9)
For this model, z(n) has a linear relationship with
the desired value d(n). By using the discrete
Kalman filter, d(n) can be recursively estimated
based on the observation of z(n), and the updated
estimate in each recursion is optimum in the
minimum mean square sense.
The received symbols are
y(n) = x(n) e^{j2πnε/N} + w(n),  n = 0, 1, ..., N-1   (10)
where ε is the normalized frequency offset and w(n)
is the AWGN. In order to estimate ε efficiently in
computation, we build an approximate linear
relationship at the receiver using the first-order
Taylor expansion about the previous estimate ε^(n-1):
y(n) ≈ x(n) e^{j2πnε^(n-1)/N} [1 + j(2πn/N)(ε - ε^(n-1))] + w(n)
which has the same form as (9), i.e., z(n) is linearly
related to d(n). As a linear approximation is
involved in the derivation, the filter is called the
extended Kalman filter (EKF).
B. ICI Cancellation
There are two stages in the EKF method to reduce
intercarrier interference.
1). Offset Estimation Scheme
For estimating the quantity ε(n) using an EKF in
each OFDM frame, the state equation is built as
ε(n)=ε(n-1) (18)
i.e., in this case we are estimating an unknown
constant ε. This constant is distorted by a non-
stationary process x(n), an observation of which is
the preamble symbols preceding the data symbols
in the frame. The observation equation is
y(n) = x(n) e^{j2πnε(n)/N} + w(n)   (19)
where y(n) denotes the received preamble
symbols distorted in the channel, w(n) the
AWGN, and x(n) the IFFT of the preambles X(k)
that are transmitted, which are known at the
receiver. Assume the Np preambles
preceding the data symbols in each frame are used
as a training sequence, and that the variance σ² of the
AWGN w(n) is stationary. The computation
procedure is described as follows.
1. Initialize the estimate ε^(0) and the corresponding state
error P(0).
2. Compute H(n), the derivative of y(n) with
respect to ε(n) at ε^(n-1), the estimate obtained in the
previous iteration.
3. Compute the time-varying Kalman gain K(n)
using the error variance P(n-1), H(n), and σ².
4. Compute the prediction y^(n) using x(n) and ε^(n-1),
i.e., based on the observations up to time n-1,
and compute the error between the true observation
y(n) and y^(n).
5. Update the estimate ε^(n) by adding the K(n)-
weighted error between the observation y(n) and
y^(n) to the previous estimate ε^(n-1).
6. Compute the state error P(n) with the Kalman
gain K(n), H(n), and the previous error P(n-1).
7. If n is less than Np, increment n by 1 and go to
step 2; otherwise stop.
It is observed that the actual error of the
estimate ε^(n) from the ideal value ε(n) is
computed in each step and is used to adjust
the estimate in the next step.
The pseudo-code of the computation can be summarized
as: initialize P(0) and ε^(0); for n = 1, 2, ..., Np, compute steps 2-6 above.
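The seven steps above can be sketched as a scalar EKF in Python. All numerical values (N, Np, noise variance, true offset) are hypothetical, and the complex-gain bookkeeping is one possible realization of steps 3-6, not the paper's exact formulation.

```python
import numpy as np

# State: the constant offset eps. Observation: preamble samples
# y(n) = x(n) exp(j 2 pi n eps / N) + w(n). Values are hypothetical.
rng = np.random.default_rng(2)
N, Np = 128, 64
eps_true, sigma2 = 0.15, 1e-4

x = np.exp(1j * 2 * np.pi * rng.random(Np))   # known unit-modulus preamble
n = np.arange(Np)
w = np.sqrt(sigma2 / 2) * (rng.standard_normal(Np) + 1j * rng.standard_normal(Np))
y = x * np.exp(1j * 2 * np.pi * n * eps_true / N) + w

eps_hat, P = 0.0, 1.0                                 # step 1: initialization
for k in range(Np):
    rot = np.exp(1j * 2 * np.pi * k * eps_hat / N)
    H = 1j * (2 * np.pi * k / N) * x[k] * rot         # step 2: dy/d(eps)
    K = P * np.conj(H) / (P * abs(H) ** 2 + sigma2)   # step 3: Kalman gain
    y_hat = x[k] * rot                                # step 4: prediction
    eps_hat += np.real(K * (y[k] - y_hat))            # step 5: state update
    P *= 1.0 - np.real(K * H)                         # step 6: error update
```

Early preamble samples have small phase error, so the linearization is accurate where it matters most; as the estimate converges, later samples with larger n refine it further while the state error P shrinks.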
2). Offset Correction Scheme
The ICI distortion in the data symbols x(n) that
follow the training sequence can then be mitigated
by multiplying the received data symbols y(n)
by the complex conjugate of the estimated
frequency-offset term and applying the FFT, i.e.
x^(n) = y(n) e^{-j2πnε^/N}
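The correction step can be sketched as follows. A perfect offset estimate is assumed in order to isolate the de-rotation itself, and BPSK data symbols are an illustrative choice.

```python
import numpy as np

N = 64
rng = np.random.default_rng(3)
X = 1.0 - 2.0 * rng.integers(0, 2, size=N)   # BPSK data symbols (assumed)
x = np.fft.ifft(X)

eps = 0.3                                    # carrier frequency offset
n = np.arange(N)
y = x * np.exp(1j * 2 * np.pi * n * eps / N) # received, offset-distorted

eps_hat = eps                                # assume the estimate is exact
x_corr = y * np.exp(-1j * 2 * np.pi * n * eps_hat / N)
X_rec = np.fft.fft(x_corr)                   # offset removed before the FFT
X_raw = np.fft.fft(y)                        # without correction: heavy ICI
```

De-rotating before the FFT restores the transmitted symbols exactly, whereas taking the FFT of the uncorrected samples spreads each symbol across neighbouring bins.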
V. SIMULATED RESULT ANALYSIS
A. Performance
For the simulations in this paper, MATLAB was
employed with its Communications Toolbox and
Communications Blockset for all data
runs. To compare the two schemes, the BER
performance curves are used. The OFDM transceiver
system was implemented as specified by Fig. 3.
Quadrature amplitude modulation (64-, 128- and
256-QAM) is used.
PARAMETERS            VALUES
Number of carriers    768
Modulation            QAM
Frequency offset      [0, 0.15, 0.30]
No. of OFDM symbols   100
Bits per OFDM symbol  N*log2(M)
Eb/No                 1:15
IFFT size             1024
Fig.5 BER Performance with ICI Cancellation, ε=0.05 for 64-QAM
Fig.6 BER Performance with ICI Cancellation, ε=0.15 and ε=0.3, for 128-QAM
Fig.7 BER Performance with ICI Cancellation, ε=0.15 and ε=0.30, for 256-QAM
S.No.  Method  ε = 0.05  ε = 0.15  ε = 0.30
1      SC      13 dB     12 dB     11 dB
2      EKF     12 dB     13 dB     14 dB
Required SNR for a BER of 10^-6 for QAM
VI. CONCLUSION
In this paper, the performance of OFDM systems
in the presence of frequency offset between the
transmitter and the receiver has been studied in
terms of the Carrier-to-Interference ratio (CIR)
and the bit error rate (BER) performance. Inter-
carrier interference (ICI), which results from the
frequency offset, degrades the performance of the
OFDM system. Two methods were explored in
this paper for the mitigation of ICI: the ICI self-
cancellation (SC) scheme and the extended
Kalman filter (EKF) method for estimation and
cancellation of the frequency offset, and a
comparison has been made between the two
techniques. The choice of
which method to employ depends on the specific
application. For example, self-cancellation does
not require very complex hardware or software for
implementation; however, it is not bandwidth
efficient, as there is a redundancy of 2 for each
carrier. On the other hand, the EKF
method does not reduce bandwidth efficiency, as
the frequency offset can be estimated from the
preamble of the data sequence in each OFDM
frame.
However, its implementation is the more complex
of the two methods. In addition, this method
requires a training sequence to be sent before the
data symbols for estimation of the frequency
offset. It can be adopted for the receiver design for
IEEE 802.11a because this standard specifies
preambles for every OFDM frame. This model
can be easily adapted to a flat-fading channel with
perfect channel estimation. Further work can be
done by performing simulations to investigate the
performance of these ICI cancellation schemes in
multipath fading channels without perfect channel
information at the receiver. In this case, the
multipath fading may hamper the performance of
these ICI cancellation schemes.
REFERENCES
[1] P. Tan, N. C. Beaulieu, “Reduced ICI in
OFDM systems using the better than raised cosine
pulse,” IEEE Commun. Lett., vol. 8, no. 3, pp.
135-137, Mar. 2004.
[2] H. M. Mourad, “Reducing ICI in OFDM
systems using a proposed pulse shape,” Wireless
Person. Commun., vol. 40, pp. 41-48, 2006.
[3] V. Kumbasar and O. Kucur, “ICI reduction
in OFDM systems by using improved sinc power
pulse,” Digital Signal Processing, vol. 17, issue
6, pp. 997-1006, Nov. 2007.
[4] Tiejun (Ronald) Wang, John G. Proakis, and
James R. Zeidler, “Techniques for suppression of
intercarrier interference in OFDM systems,”
Wireless Communications and Networking
Conference, IEEE, vol. 1, pp. 39-44, 2005.
[5]P. H. Moose, “A Technique for Orthogonal
Frequency Division Multiplexing Frequency
Offset Correction,” IEEE Transactions on
Communications, vol. 42, no. 10, 1994
[6]Y. Zhao and S. Häggman, “Inter carrier
interference self-cancellation scheme for OFDM
mobile communication systems,”IEEE
Transactions on Communications, vol. 49, no. 7,
2001
[7] R. E. Ziemer, R. L. Peterson, “Introduction to
Digital Communications”, 2nd edition, Prentice
Hall, 2002.
[8] J. Armstrong, “Analysis of new and existing
methods of reducing inter carrier interference due
to carrier frequency offset in OFDM,” IEEE
Transactions on Communications, vol. 47, no. 3,
pp. 365 – 369., 1999
[9] N. Al-Dhahir and J. M. Cioffi, “Optimum
finite-length equalization for multicarrier
transceivers,” IEEE Transactions on
Communications, vol. 44, no. 1, pp. 56-64,
1996.
[10] “…Systems”, (IJCSIS) International Journal of
Computer Science and Information Security,
vol. 6, no. 3, 2009.
OBJECT DETECTION BASED ON CROSS-CORRELATION
USING PARTICLE SWARM OPTIMIZATION
Sudhakar Singh Yaduvir Singh
Department of Electrical and Instrumentation Engineering
Thapar University, Patiala, Punjab
Abstract- In this paper a novel method for
object detection in images is proposed. The
method is based on image template
matching. The conventional template matching
algorithm based on cross-correlation
requires complex calculation and a large amount of
time for object detection, which makes it difficult to
use in real-time applications. In the
proposed work, an algorithm based on particle swarm
optimization and its variants is used for
the detection of objects in images. This algorithm
reduces the time required
for object detection compared with the conventional
template matching algorithm: it can
detect an object in a smaller number of iterations,
and hence with less time and energy than
conventional template matching. This feature
makes the method suitable for real-time
implementation.
Keywords: object detection, object tracking
and image matching.
I. INTRODUCTION
It is easy for a human, even for a child, to detect
the position of letters, objects and numbers in an
image, but for a computer, solving these types of
problems quickly is a very challenging
task. In the last decades the computer's ability to
perform huge amounts of calculation, and to handle
information flows we never thought possible ten
years ago, has emerged. Despite this, a computer
can extract only a little information from an
image in comparison to a human being. Object
detection is a fundamental component of
artificial intelligence and computer vision.
Interest in pattern recognition is growing fast in
order to deal with the prohibitive amount of
information we encounter in our daily life.
Automation is desperately needed to handle this
information explosion. The way the human brain
filters out useful information is not fully known,
and this skill has not been merged into computer
vision science. This paper proposes to
implement a system that is able to detect objects
in an image faster. Artificial
intelligence is an important topic of current
computer science research. In order to act
intelligently, a machine should be aware of its
environment. Visual information is essential
for humans; therefore, among many different
possible sensors, cameras seem very
important. Automatically analyzing images and
image sequences is the area of research usually
called "computer vision". Image matching is the key
to object detection. Image matching has
a large number of applications, including
navigation, guidance, automatic surveillance,
robot vision, and the mapping sciences. Cross-
correlation and related techniques are
dominantly used in image matching
applications. It is difficult to use the
conventional template matching algorithm
based on cross-correlation in real-time
applications due to the complex
calculation and large time required for object
detection. The shortcomings of this class of
image matching methods have caused a slow-
down in the development of operational
automated correlation systems. In this paper, we
propose a method for object detection. It
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0204-2
consists of three stages: (i) image matching using templates; (ii) object detection; (iii) application of the PSO technique. The proposed PSO-based algorithm gives better results than the conventional algorithm.
II. LITERATURE REVIEW
F. Ackermann [1] proposed an image matching algorithm based on least-squares window matching. Several common object detection and tracking methods, such as point detectors and background subtraction [7], are surveyed in [2]. Color is one of the most widely used features to represent object appearance for detection and tracking [5], and most object detection and tracking methods use pre-specified models for object representation. W. Forstner [3] proposed a feature-based correspondence algorithm for image matching. A. W. Gruen [4] showed that adaptive least-squares correlation is a very potent and flexible technique for all kinds of data matching problems. J. Bala et al. [5] addressed the problem of crafting visual routines for detection tasks. C. F. Olson [6] applied maximum-likelihood template matching in applications such as tracking and stereo matching. Kwan-Ho Lin and Kin-Man Lam [8] presented a new method for locating objects based on valley-field detection and the measurement of fractal dimensions. Yacov Hel-Or [10] proposed a novel approach to pattern matching in which time complexity is reduced by two orders of magnitude compared to traditional approaches. Kun Peng and Liming Chen [9] presented a robust eye detection algorithm for gray intensity images.
III. OBJECT DETECTION
Object detection attempts to determine the existence of a specific object in an image and, if the object is present, determines its location, size, and shape. In computer vision, object detection and tracking is an active research area that has attracted extensive attention from multiple disciplines, with wide applications in fields such as service robots, surveillance systems, public security systems, and virtual reality interfaces. Detection and tracking of moving
objects such as cars and people is of particular concern, especially flexible and robust tracking algorithms for dynamic environments, where lighting conditions may change and occlusions may occur. The general process of object detection consists of two steps. The first step is building a model: according to prior knowledge of the objects of interest, a feature model is built to describe the target object and to separate it from other objects and from the background; since most images are noisy, statistical information is usually adopted to quantify the features. The second step is to find a particular region in the image, called the area of interest (AOI), which either best fits the object model or has the highest similarity with
the model. Many algorithms developed recently
in this area relate to human face detection and
recognition due to its potential applications in
security and surveillance. Yet, generic, reliable,
and fast human face detection was, until very
recently, impossible to achieve in real-time. The
concepts of object detection, object recognition, and object tracking often overlap. Object tracking, in particular, dynamically locates objects by determining their position in each frame. Object detection and recognition have made significant progress in the last few years.
IV. TEMPLATE MATCHING BASED ON CROSS-CORRELATION
Template matching is a popular method for pattern recognition. It is defined as follows.
Definition: Let I be an image of dimension m×n and T be another image of dimension p×q, with p < m and q < n. Template matching is a search method which finds the portion of I of size p×q with which T has the maximum cross-correlation coefficient. The normalized cross-correlation coefficient is defined as:
Y(x, y) = [ Σ_{s,t} P_I(x+s, y+t) · P_T(s, t) ] / √[ Σ_{s,t} P_I(x+s, y+t)² · Σ_{s,t} P_T(s, t)² ]    (1)

where

P_I(x+s, y+t) = I(x+s, y+t) − Ī(x, y)
P_T(s, t) = T(s, t) − T̄

with s ∈ {1, 2, 3, ..., p}, t ∈ {1, 2, 3, ..., q}, x ∈ {1, 2, 3, ..., m−p+1}, and y ∈ {1, 2, 3, ..., n−q+1}. Also

Ī(x, y) = (1/pq) Σ_{s,t} I(x+s, y+t)    (2)

T̄ = (1/pq) Σ_{s,t} T(s, t)    (3)
The value of the cross-correlation coefficient Y ranges over [-1, +1]. A value of +1 indicates that T is completely matched with I at (x, y), and -1 indicates complete disagreement. For template matching, the template T slides over I, and Y is calculated at each coordinate (x, y). After this calculation, the point exhibiting the maximum Y is taken as the match point.
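The search just described can be sketched in code. The following is a minimal brute-force NumPy implementation of Eqs. (1)-(3), given for illustration only (it is not the authors' implementation):

```python
import numpy as np

def ncc_match(image, template):
    """Exhaustive normalized cross-correlation template search.

    Returns the (x, y) offset maximizing Eq. (1) and the peak coefficient.
    """
    m, n = image.shape
    p, q = template.shape
    t_zero = template - template.mean()        # P_T = T - T_bar
    t_norm = np.sqrt((t_zero ** 2).sum())
    best, best_xy = -2.0, (0, 0)
    for x in range(m - p + 1):                 # slide T over every position of I
        for y in range(n - q + 1):
            win = image[x:x + p, y:y + q]
            w_zero = win - win.mean()          # P_I = I - I_bar(x, y)
            denom = np.sqrt((w_zero ** 2).sum()) * t_norm
            if denom == 0:
                continue                       # flat window: Y undefined
            y_coef = (w_zero * t_zero).sum() / denom   # Eq. (1)
            if y_coef > best:
                best, best_xy = y_coef, (x, y)
    return best_xy, best
```

The nested loops make the cost O((m−p)(n−q)·pq) per template, which is exactly the "large time taken" that motivates the PSO-based search later in the paper.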
V. PARTICLE SWARM OPTIMIZATION
The Particle Swarm Optimization (PSO) algorithm is an evolutionary computation technique developed by Kennedy and Eberhart in 1995 [5]. Like other evolutionary techniques, PSO uses a population of potential solutions to explore the search space. In PSO, the population dynamics resemble the movement of a flock of birds searching for food: social sharing of information takes place, and individuals gain from the discoveries and previous experience of all other companions. Each companion (called a particle) in the population (called a swarm) is assumed to "fly" over the search space in order to find promising regions of the landscape. Let particle i of the swarm be represented by the d-dimensional vector xi = (xi1, xi2, ..., xid), and let the best particle of the swarm be denoted by the index g. The best previous position of particle i is recorded and represented as pi = (pi1, pi2, ..., pid), and the position change (velocity) of particle i is Vi = (Vi1, Vi2, ..., Vid). Particles update their velocities and positions by tracking two kinds of "best" value. One is the personal best (pbest), the location of the particle's own highest fitness; the other is the global best (gbest), the location of the best value obtained so far by any particle in the population. Particles update their positions and velocities according to the following equations:
vj(i) = w·vj(i−1) + r1·c1·(pbest(j) − xj(i)) + r2·c2·(gbest − xj(i))    (4)

pj(i) = pj(i−1) + vj(i)    (5)
where vj(i) is the velocity of the jth particle at the ith iteration, pj(i) is the corresponding position, pbest and gbest are the personal best and global best respectively, w is the inertia weight, c1 and c2 are the acceleration parameters, and r1 and r2 are random numbers in [0, 1]. A number of scientists have
created computer simulations of various
interpretations of the movement of organisms in
a bird flock or fish school. Notably, Reynolds, and Heppner and Grenander, presented simulations of bird flocking. It became obvious during the development of the particle swarm concept that the behavior of the population of agents is more like a swarm than a flock. The term swarm has a basis in the literature; in particular, the authors use the term in accordance with a paper by Millonas, who developed his models for applications in artificial life and articulated five basic
principles of swarm intelligence. First is the
proximity principle: the population should be
able to carry out simple space and time
computations. Second is the quality principle:
the population should be able to respond to
quality factors in the environment. Third is the
principle of diverse response: the population
should not commit its activities along excessively narrow channels. Fourth is the principle of stability: the population should not change its mode of behaviour every time the environment changes. Fifth is the principle of adaptability: the population must be able to change its behaviour mode when it is worth the computational price. Note that principles four and five are opposite sides of the same coin. The particle swarm optimization concept and paradigm presented here adhere to all five principles. Basic to the paradigm are n-dimensional space calculations carried out over a series of time steps, with the population responding to the quality factors pbest and gbest. Further, Reeves discusses particle systems consisting of clouds of primitive particles as models of diffuse objects such as clouds, fire, and smoke. Thus the label the authors have chosen for the optimization concept is particle swarm.
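The update rules of Eqs. (4) and (5) can be sketched as a generic minimizer. The parameter values below (w = 0.7, c1 = c2 = 1.5, 30 particles) are common defaults, not values reported in the paper:

```python
import numpy as np

def pso(fitness, dim, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimal PSO following the velocity/position updates of Eqs. (4)-(5);
    `fitness` is minimized."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))              # particle velocities
    pbest = x.copy()                              # personal best positions
    pbest_f = np.array([fitness(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()            # global best position
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # Eq. (4)
        x = x + v                                               # Eq. (5)
        f = np.array([fitness(p) for p in x])
        improved = f < pbest_f                    # update personal bests
        pbest[improved] = x[improved]
        pbest_f[improved] = f[improved]
        g = pbest[pbest_f.argmin()].copy()        # update global best
    return g, float(pbest_f.min())

# Toy usage: minimize the 2-D sphere function; the swarm converges to the origin.
best, val = pso(lambda p: float((p ** 2).sum()), dim=2)
```

For the object detection problem of this paper, the fitness of a particle would instead be the negative of the cross-correlation coefficient Y of Eq. (1) at the particle's (x, y) position, so that maximizing the match minimizes the fitness.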
Figure 1: Flow chart of PSO
VI. EXPERIMENTAL RESULTS AND DISCUSSION
The particle swarm optimization algorithm is applied to the test images and different templates to solve the object detection problem. Each image is tested more than 15 times.
Figure 2: Test image-1
Figure (2.1) (2.2) (2.3)
(2.1) Left eye taken as a template (T1.1)
(2.2) Right eye taken as template (T1.2)
(2.3) Nose taken as template (T1.3)
Figure 3: Test image-2
Figure (3.1) (3.2) (3.3)
(3.1) Right eye taken as a template (T2.1)
(3.2) Left eye taken as template (T2.2)
(3.3) Nose taken as template (T2.3)
Figure 4: Test image-3
Figure (4.1) (4.2) (4.3)
(4.1) Right eye taken as a template (T3.1)
(4.2) Left eye taken as template (T3.2)
(4.3) Nose taken as template (T3.3)
Table 1 below shows a comparison of the PSO-based and conventional algorithms.

Image      Template        Iterations   Conventional time (s)   PSO time (s)   Time reduced by PSO (%)
Image (1)  Template 1.1    100          57.38                   3.90           93.21
           Template 1.2    100          110.14                  4.14           96.24
           Template 1.3    100          130.56                  4.36           96.66
Image (2)  Template 2.1    100          111.70                  4.46           96.01
           Template 2.2    100          120.60                  4.53           96.26
           Template 2.3    100          143.34                  4.65           96.75
Image (3)  Template 3.1    100          59.47                   3.72           93.75
           Template 3.2    100          52.43                   3.67           93.01
           Template 3.3    100          74.37                   3.89           94.77

Table 1: Comparison of conventional and PSO-based algorithms
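The last column of Table 1 follows directly from the two timing columns. A quick sanity check for Template 1.1 of Image (1) (the small difference from the published 93.21% presumably comes from rounding of the reported times):

```python
# Percentage time reduction, as reported in Table 1:
#   reduction = (t_conventional - t_pso) / t_conventional * 100
t_conv, t_pso = 57.38, 3.90           # Image (1), Template 1.1
reduction = (t_conv - t_pso) / t_conv * 100
print(round(reduction, 2))            # ~93.2, matching Table 1's 93.21%
```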
With the conventional algorithm, detecting the object in test image 1 by matching template 1.1 takes 57.38 s, while the proposed algorithm takes 3.90 s, a time reduction of 93.21%. For template 1.2 the conventional algorithm takes 110.14 s against 4.14 s for the proposed algorithm, a reduction of 96.24%; for template 1.3, 130.56 s against 4.36 s, a reduction of 96.66%. In test image 2, template 2.1 takes 111.70 s conventionally against 4.46 s (a 96.01% reduction), template 2.2 takes 120.60 s against 4.53 s (96.26%), and template 2.3 takes 143.34 s against 4.65 s (96.75%). In test image 3, template 3.1 takes 59.47 s against 3.72 s (93.75%), template 3.2 takes 52.43 s against 3.67 s (93.01%), and template 3.3 takes 74.37 s against 3.89 s (94.77%). Thus PSO is successfully employed to solve the object detection problem. The results show that the proposed method obtains a high-quality solution efficiently; here the time taken is the efficiency measure. It is clear from the results that the proposed PSO-based method avoids the main shortcoming (the large time taken) of the old template matching algorithm and provides a higher quality solution with better computational efficiency.
VII. CONCLUSIONS
When the sample test images are processed by the PSO-based algorithm to detect the position of an object, it is found that the algorithm detects the object's position in far less time than the conventional template matching algorithm. The PSO-based algorithm has superior features, including high-quality solutions, stable convergence characteristics, and good computational efficiency. Future work on the proposed approach is to detect the exact location of the object by segmentation, finding the area and perimeter of the object.
REFERENCES
1. Ackermann, F. 1984. “Digital image
correlation: Performance and potential
application in photogrammetry”.
Photogrammetric Record 11
2. T.Peli, “An algorithm for recognition and
localization of rotated and scaled objects”,
Proceedings of the IEEE 69, 1981 483–485.
3. Foerstner,W.,“Quality assessment of object
location and point transfer using digital
image correlation techniques. International
Archives of Photogrammetry and Remote
Sensing” vol. XXV, A3a, Commission III,
Rio de Janeiro, 1984.
4. A. W. Gruen, "Adaptive least squares correlation: A powerful image matching technique," South African Journal of Photogrammetry, Remote Sensing & Cartography, 14(3):175-187, 1985.
5. J. Bala, K. DeJong, J. Huang, H. Vafaie, H.
Wechsler, “Visual routine for eye detection
using hybrid genetic architectures,”
International Conference on Pattern
Recognition, vol. 3, pp. 606-610, 1996.
6. C.F. Olson, “Maximum-likelihood template
matching,” IEEE Conference on Computer
Vision and Pattern Recognition, vol. 2, pp.
52-57, 2000.
7. Feng Zhao, Qingming Huang, Wen Gao,
“Methods of image matching by normalized
cross-correlation”.
8. Kwan-Ho Lin, Kin-Man Lam and Wan-Chi
Siu, "Locating the Eye in Human Face
Images Using Fractal Dimensions," IEE
Proceedings - Vision, Image and Signal
Processing, vol. 148, no. 6, pp. 413-421,
2001.
9. Kun Peng, Liming Chen, Su Ruan, Georgy
Kukharev, “A Robust and Efficient
Algorithm for Eye Detection on Gray
Intensity Face,” Lecture Notes in Computer
Science – Pattern Recognition and Image
Analysis, pp. 302-308, 2005.
10. Yacov Hel-Or, Hagit Hel-Or, "Real-Time Pattern Matching Using Projection Kernels," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 9, September 2005.
11. Zeng Yan et al., "A New Background Subtraction Method for On-road Traffic," Journal of Image and Graphics, vol. 13, no. 3, pp. 593-599, March 2008.
12. Wei-feng Liu et al., "A Target Detection Algorithm Based on Histogram Feature and Particle Swarm Optimization," 2009 Fifth ICONC.
"Multisegmentation through wavelets: Comparing the efficacy of Daubechies vs. Coiflets"
Madhur Srivastava,Member, IEEE, Yashwant Yashu, Member, IETE, Satish K. Singh,Member, IEEE,
Prasanta K. Panigrahi
Abstract--- In this paper, we carry out a comparative study of the efficacy of wavelets belonging to the Daubechies and Coiflet families in achieving image segmentation through a fast statistical algorithm. The fact that wavelets of the Daubechies family optimally capture polynomial trends, while those of the Coiflet family satisfy the mini-max condition, makes this comparison interesting. In the context of the present algorithm, it is found that the performance of Coiflet wavelets is better compared to Daubechies wavelets.
Keywords: Peak Signal to Noise Ratio,
Segmentation, Standard deviation, Thresholding,
Weighted mean.
Madhur Srivastava is a final year B.Tech student in the Department of Electronics and Communication Engineering at Jaypee University of Engineering and Technology, Guna, India; e-mail: [email protected]
Yashwant Yashu is a final year B.Tech student in the Department of Electronics and Communication Engineering at Jaypee University of Engineering and Technology, Guna, India; e-mail: [email protected]
Satish K. Singh is Assistant Professor in the Department of Electronics and Communication Engineering at Jaypee University of Engineering and Technology, Guna, India; e-mail: [email protected]
Prasanta K. Panigrahi is Professor in the Department of Physics at Indian Institute of Science Education and Research, Kolkata, India (Phone No. +91-9748918201); e-mail:
I. INTRODUCTION
Thresholding of an image is done to reduce storage space, increase processing speed, and simplify manipulation, since fewer levels are involved compared to the 256 levels of a normal image. Thresholding is primarily of two types - bi-level and multi-level [1].
Bi-level thresholding produces two output values - one below the threshold and another above it - while in multi-level thresholding, different values are assigned to different ranges between threshold levels. Thresholding techniques have been categorized on the basis of histogram shape, clustering, entropy, and object attributes [2].
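A small illustration of the two schemes (the threshold values and output levels below are arbitrary, not taken from the paper):

```python
import numpy as np

# Hypothetical 8-bit pixel values.
pixels = np.array([12, 80, 130, 200, 250])

# Bi-level: one threshold, two output values.
bi = np.where(pixels < 128, 0, 255)

# Multi-level: each range between thresholds maps to one representative level.
thresholds = [64, 128, 192]              # 3 thresholds -> 4 output ranges
levels = np.array([32, 96, 160, 224])    # value assigned to each range
multi = levels[np.digitize(pixels, thresholds)]
```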
The wavelet transform is a very significant tool in the field of image processing. The wavelet transform of an image comprises four components: approximation, horizontal, vertical, and diagonal. The transform is applied recursively to the approximation component for further decomposition of the image, until only one coefficient is left in the approximation part [3-5].
As is well known, wavelets of the Daubechies family are useful in extracting polynomial trends, their coefficients satisfying the vanishing-moment condition

∫ x^n ψ_{j,k}(x) dx = 0    (1)

for n ≤ N, where the wavelets are given by

ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k)    (2)

and N depends on the particular Daubechies basis. This property makes them well suited for isolating smooth polynomial features in a given image. The Coiflet coefficients, on the other hand, satisfy the mini-max condition, i.e., the maximum error in extracting local features is minimized in this basis set. Hence, it is worth comparing the behavior of the low-pass coefficients of the corresponding wavelets from the perspective of the proposed algorithm.

Fig. 1. Block diagram of the approach used.
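The vanishing-moment condition has a discrete counterpart: the high-pass filter of a Daubechies wavelet with N vanishing moments annihilates sampled polynomials of degree below N. A quick numerical check using the closed-form db2 (4-tap) filter, for which N = 2:

```python
import numpy as np

# Closed-form Daubechies db2 low-pass filter h and the derived
# quadrature-mirror high-pass filter g.
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
g = np.array([h[3], -h[2], h[1], -h[0]])       # g[k] = (-1)^k h[3-k]

# Discrete moments sum_k k^n g[k] vanish for n = 0 and n = 1 (N = 2).
k = np.arange(4)
moments = [float(abs((k ** n * g).sum())) for n in range(2)]
print(moments)   # both ~0 (machine precision)
```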
II. APPROACH
The thresholding applied in the wavelet domain takes into account that the majority of coefficients lie near zero, while the few coefficients representing large differences sit at the extreme ends of the histogram of each horizontal, vertical, and diagonal component. The coefficients with large differences carry the most significant information of the image. Hence, the procedure provides variable-size segmentation, with a bigger block size around the mean and smaller blocks at the ends of the histogram [6]. The methodology, shown in Fig. 1, is as follows:
1. Segregate the color image into its Red, Green
and Blue components.
2. Take the 2-D wavelet transform of each component at any level. For each horizontal, vertical, and diagonal part of every red, green, and blue component, threshold the coefficients using the weighted mean and variance of each sub-band of the coefficient histogram, with a broader block size around the mean and a finer block size at the ends of the histogram.
3. Take inverse wavelet transform for each
component.
4. Reconstruct the image by concatenating Red,
Green and Blue components.
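As a concrete illustration of steps 2-3, the sketch below uses a one-level 2-D Haar transform as a simple stand-in for the paper's wavelet, and quantization bins at mean ± 0.5σ and ± 1.5σ as a simplified stand-in for the authors' weighted-mean/variance thresholding (both choices are illustrative assumptions):

```python
import numpy as np

def haar2(x):
    """One-level 2-D Haar transform: approximation plus H, V, D details."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return ((a + b + c + d) / 2, (a - b + c - d) / 2,   # approx, horizontal
            (a + b - c - d) / 2, (a - b - c + d) / 2)   # vertical, diagonal

def ihaar2(ll, lh, hl, hh):
    """Inverse of haar2 (perfect reconstruction)."""
    a = (ll + lh + hl + hh) / 2
    b = (ll - lh + hl - hh) / 2
    c = (ll + lh - hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    out[0::2, 0::2], out[0::2, 1::2] = a, b
    out[1::2, 0::2], out[1::2, 1::2] = c, d
    return out

def quantize(band):
    """Coarse bins near the mean, finer bins toward the histogram tails."""
    mu, sd = band.mean(), band.std()
    edges = np.array([mu - 1.5 * sd, mu - 0.5 * sd, mu + 0.5 * sd, mu + 1.5 * sd])
    centers = np.array([mu - 2 * sd, mu - sd, mu, mu + sd, mu + 2 * sd])
    return centers[np.digitize(band, edges)]

def segment_channel(ch):
    """Steps 2-3 for one color channel: transform, threshold details, invert."""
    ll, lh, hl, hh = haar2(ch)
    return ihaar2(ll, quantize(lh), quantize(hl), quantize(hh))
```

Running `segment_channel` on each of the R, G, and B components and stacking the results corresponds to steps 1 and 4.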
III. RESULTS AND OBSERVATIONS
The proposed algorithm is tested on a variety of standard images using Daubechies and Coiflet wavelets. The PSNR and size of the reconstructed image at different threshold levels are shown in Table 1; the numbers of threshold levels taken are 3, 5, and 7. Figure 2 shows the plot of PSNR versus threshold level for the Lenna image.
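The PSNR figures in Table 1 follow the usual definition for 8-bit images (a sketch, not the authors' code):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB: 10*log10(peak^2 / MSE) between original and reconstruction."""
    diff = np.asarray(original, float) - np.asarray(reconstructed, float)
    mse = np.mean(diff ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(peak ** 2 / mse)
```

Higher PSNR at a comparable reconstructed-image size is the criterion by which the Coiflet results below are judged better than the Daubechies ones.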
Table 1. PSNR and size of reconstructed images using different Daubechies and Coiflet wavelets.
Image
Name
Threshold
Level
Wavelet Name
dB2 dB4 dB6 dB8 coif1 coif2 coif3 coif4 coif5
Lenna 3 PSNR(dB) 34.45 35.19 35.52 35.71 34.50 35.23 35.48 35.61 35.69
Size(kB) 36.2 36.5 36.3 36.2 36.4 36.2 36.4 36.3 36.3
5 PSNR(dB) 36.41 37.13 37.41 37.53 36.5 37.19 37.42 37.54 37.62
Size(kB) 36.2 36.5 36.3 36.3 36.3 36.3 36.4 36.4 36.4
7 PSNR(dB) 36.79 37.5 37.74 37.88 36.84 37.53 37.76 37.89 37.97
Size(kB) 36.2 36.5 36.3 36.3 36.3 36.3 36.4 36.4 36.4
Baboon 3 PSNR(dB) 25.92 26.31 26.29 26.19 25.94 26.20 26.29 26.33 26.36
Size(kB) 74.4 74.2 74.2 74.3 74.4 74.4 74.3 74.3 74.2
5 PSNR(dB) 27.06 27.56 27.44 27.40 27.13 27.41 27.50 27.55 27.58
Size(kB) 74.4 74.1 74.2 74.2 74.3 74.2 74.2 74.2 74.1
3
7 PSNR(dB) 27.18 27.70 27.57 27.53 27.27 27.53 27.62 27.67 27.71
Size(kB) 74.3 74.1 74.1 74.1 74.2 74.2 74.1 74.2 74.1
Pepper 3 PSNR(dB) 30.63 33.87 31.61 31.25 31.48 31.63 31.70 31.75 31.77
Size(kB) 39.9 39.8 40.3 40.4 40.1 40.3 40.3 40.2 40.2
5 PSNR(dB) 34.12 35.83 34.61 34.30 33.98 34.41 34.55 34.60 34.62
Size(kB) 39.5 39.7 39.6 39.6 39.6 39.6 39.7 39.7 39.7
7 PSNR(dB) 34.56 36.26 34.92 34.58 34.25 34.73 34.88 34.93 34.95
Size(kB) 39.5 39.8 39.5 39.6 39.6 39.6 39.6 39.6 39.7
Fig. 2 Plot of PSNR vs. threshold levels for the Lenna image, thresholded using different wavelets
IV. CONCLUSION
Thresholding performed by the proposed algorithm gives better PSNR using Coiflet wavelets than Daubechies wavelets, while keeping the size of the reconstructed image almost the same. This is attributed to the property of Coiflet wavelets satisfying the mini-max condition. Hence, it can be concluded that Coiflet wavelets provide the most desirable results for multilevel thresholding of an image in the wavelet domain. In future work, the proposed algorithm using Coiflet wavelets can be applied to image segmentation, object separation, image compression, and image retrieval, because only a few coefficients of the horizontal, vertical, and diagonal components represent the entire variation of the image without deteriorating its quality.
REFERENCES
1. R.C. Gonzalez, R.E. Woods, "Digital Image Processing," 2nd ed., Prentice Hall, 2001.
2. M. Sezgin, B. Sankur, Survey over image
thresholding techniques and quantitative
performance evaluation, Journal of Electronic
Imaging, 13(1) (2004) 146-165.
3. S.G. Mallat, A Wavelet Tour of Signal
Processing. New York: Academic (1999).
4. I. Daubechies, Ten Lectures on Wavelets, Vol. 61 of CBMS-NSF Regional Conference Series in Applied Mathematics, Philadelphia, PA: SIAM, 1992.
5. J.S. Walker,” A Primer on Wavelets and Their
Scientific Applications,” 2nd ed. Chapman &
Hall/CRC Press, Boca Raton, FL, 2008.
6. M. Srivastava, P. Katiyar, Y. Yashu, S.K. Singh,
P.K. Panigrahi,” A Fast Statistical Method for
Multilevel Thresholding in Wavelet Domain,”
unpublished.
Analysis of Signals in Fractional Fourier Domain
Ajmer Singh, Student of Lovely Professional University(LPU)-India, Nikesh Bajaj, Asst. Prof., ECE Dept.(LPU) [email protected], [email protected]
Abstract- The Fractional Fourier Transform (FRFT) is a generalization of the classic Fourier Transform (FT). When dealing with time-varying signals, the FRFT is an important tool for analyzing them. This paper contains results for the variation of basic signals - the rectangular pulse, sine wave, and Gaussian signal - in the Fractional Fourier Domain (FRFD). The correlation of the FRFD signal with the time domain (TD) signal, and the correlation of the FRFT at the α-domain with the FRFT at the (α−1)-domain, are also shown and discussed. A graphical demonstration of the scaling property of the FRFT is also given.

Index Terms— FRFT, FRFD, signal processing, α-domain, FRFT analysis, FRFT scaling property, α-domain correlation
I. INTRODUCTION
The FT is one of the most frequently used tools in signal analysis. However, the FT is not very suitable for signals whose frequency changes with time, because of its assumption that the signal is stationary. A generalization of the FT was proposed by V. Namias [1] and is known as the FRFT. The FRFT can be described as performing a spectral rotation of the signal in the time-frequency plane as the parameter α varies. In recent years, the FRFT has been applied in many areas, such as the solution of differential equations [2], quantum mechanics [1], optical signal processing [6], time-variant filtering and multiplexing [3]-[5], and swept-frequency filters [6]. Several properties of the FRFT in signal analysis are summarized in [6].

This paper is organized as follows. Section II covers the basic concept of the FRFT and some of its properties. Section III analyzes different signals - the rectangular pulse, sine wave, and Gaussian signal - and examines correlation results for these signals in the FRFD. Section IV concludes the paper.
II. BASIC CONCEPT OF FRFT
The FRFT with angle parameter α of a signal f(t) is defined as

F_α(u) = √((1 − j·cot α)/2π) ∫_{−∞}^{∞} f(t) exp( j·((t² + u²)/2)·cot α − j·u·t·csc α ) dt,   α ≠ kπ
F_α(u) = f(u),   α = 2kπ
F_α(u) = f(−u),   α = (2k + 1)π

F_α(u) is called the α-order FRFT of the signal f(t), where α = Aπ/2 and A is a real number called the order of the FRFT. A lies in the interval [-2, 2] and can be extended to any real number according to A + 4k = A, where k is any integer [..., -3, -2, -1, 0, 1, 2, 3, ...], and A can take any fractional value in [-2, 2].
Some basic properties of the FRFT are:
• Linearity.
• Zero rotation / time domain. When A = 0 or 4 (α = 0 or 2π), the FRFT operator corresponds to the identity operator: F_0(u) = f(t), where f(t) is the time domain signal.
• The FT is a special case of the FRFT. When A = 1 (α = π/2), the FRFT operator corresponds to the FT: F_{π/2}(u) = F(u), where F denotes the Fourier Transform of the time domain signal f(t).
• Flipped operation / time inversion. When A = 2 (α = π), the FRFT operator corresponds to the flipping operator: F_π(u) = f(−t), the flipped version of the input signal f(t).
• Inverse Fourier domain. When A = 3 (α = 3π/2), the FRFT operator corresponds to the inverse Fourier domain: F_{3π/2}(u) = F(−u), the flipped version of the Fourier Transform of f(t).

These properties of the FRFT are easily understood from Figure 1.
Figure 1: Time-frequency plane for the FRFT.

In this paper, we use the digital computation method of the FRFT given in [7].
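The zero-rotation and flipping properties can be checked numerically for the ordinary DFT (the α = π/2 special case): applying the transform twice returns the circularly flipped signal, mirroring the α = π flipped-operator property above. A small NumPy check:

```python
import numpy as np

# For the DFT, F{F{f}}[n] = N * f[(-n) mod N]: two quarter rotations of the
# time-frequency plane add up to the alpha = pi "flipped operator".
rng = np.random.default_rng(1)
x = rng.random(16)
N = len(x)
x2 = np.fft.fft(np.fft.fft(x)).real / N      # two successive DFTs, rescaled
flipped = x[(-np.arange(N)) % N]             # f[(-n) mod N]
print(np.allclose(x2, flipped))              # True
```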
III. ANALYSIS OF DIFFERENT SIGNALS
We always store our information or data in some type of memory; that set of information or data is known as a signal. There are some basic signals, like the rectangular pulse, sine wave, and Gaussian signal, which are commonly used in signal processing. Different transform techniques are used to analyze the frequency spectra of signals, because the frequency spectrum tells more about a signal's behavior than the time domain representation.

The FRFT, however, describes the signal between the time and frequency domains through the operator F_α(u): α = 0 gives the time domain representation and α = π/2 gives the frequency domain representation, while 0 < α < π/2 gives the intermediate domains known as α-domains. These domains do not give exact information about the time or frequency components, but give mixed information about both.

So, in this section, we discuss the variation of some signals with the variation of the α-domain.
A. Analysis of the Rectangular pulse in the FRFD

The rectangular pulse (also known as the rectangle function, rectangular function, gate function, or unit pulse) is defined as:

rect(t) = 1,   |t| ≤ 1/2
rect(t) = 0,   |t| > 1/2

and the FT of the rectangular function is:

FT{rect(t)} = sin(u/2) / (u/2)
Now, let us discuss an example of the rectangular pulse in the FRFD. Figure 2 shows the results for six different FRFD values: figure 2(a), for α = 0, shows the rectangular pulse in the time domain; figure 2(f), for α = π/2, shows the spectrum of the rectangular pulse, which is a sinc function; and the remaining four panels show the FRFT of the rectangular pulse at α = π/10, π/5, 3π/10, and 2π/5.

(a) A=0 / α=0    (b) A=0.2 / α=π/10
(c) A=0.4 / α=π/5    (d) A=0.6 / α=3π/10
(e) A=0.8 / α=2π/5    (f) A=1 / α=π/2

Figure 2: FRFT of the rectangular pulse for different values of the angle α (order A). Solid line: real part; dashed line: imaginary part.

The two FRFDs of the rectangular pulse at α = 0 and α = π/2 are the ordinary time and frequency domains, respectively. Looking at figures 2(a) to 2(f), anyone can easily see how a rectangular pulse becomes a sinc function in the frequency domain, without any mathematical expression, and how much these domains are correlated with each other; but the figures do not give the actual value of the correlation coefficient. To analyze this, figure 3 shows two graphs: the first gives the normalized correlation of the α-domain signal with the time domain signal, and the second gives the normalized correlation of the α-domain with the (α−1)-domain. For better resolution we take 90 domains, at 90 different values of A between 0 < A < 1.

In figures 3(a) and 3(b), at α = 0 the correlation coefficient takes its maximum value of 1, confirming that the FRFT at α = 0 gives the actual time domain signal (no rotation). With a small change of 1° (one degree) in α, the correlation coefficient drops, yet the signal remains correlated with the time domain signal up to about 95%, and so on. In figure 3(b) we can see that for 1° < α < 45° the α-domain signal is highly correlated with the previous α-domain, with a similar result for 45° < α < 90°.
(a)
(b)
Figure 3: Correlation results for rectangular pulse.
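The quantity plotted on the y-axis of Figure 3 can be sketched as the peak normalized cross-correlation between two domain signals; this is an assumption, since the paper does not spell out its exact normalization:

```python
import numpy as np

def max_norm_corr(a, b):
    """Peak of the normalized cross-correlation between two equal-length
    signals; by Cauchy-Schwarz the result lies in [0, 1]."""
    a = (a - a.mean()) / (np.linalg.norm(a - a.mean()) + 1e-12)
    b = (b - b.mean()) / (np.linalg.norm(b - b.mean()) + 1e-12)
    return float(np.abs(np.correlate(a, b, mode='full')).max())
```

With the magnitude profiles of two FRFDs as inputs, this yields one point of the curves in Figure 3; identical domains (e.g. the α = 0 domain against the time domain signal) give the value 1.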
B. Analysis of the Sine wave in the FRFD

The sine wave, or sinusoid, is a mathematical function that describes a smooth repetitive oscillation. It occurs often in pure mathematics, as well as in physics, signal processing,
electrical engineering, and many other fields. Its most basic form as a function of time t is defined as:

x(t) = M sin(2πft + θ)

where M is the amplitude of the sine wave, f is the frequency, t is time, and θ is the phase, which specifies where in its cycle the oscillation begins at t = 0.
Now let us discuss the results for the sine wave in the FRFD. In figure 4 we show the results for six values of α: figure 4(a) for α = 0, which shows the sine wave in the time domain, and figure 4(f) for α = π/2, which shows the spectrum of the sine wave, an impulse function. The remaining four domains show the FRFT of the sine wave at α = π/10, π/5, 3π/10, and 2π/5.
As with the results discussed in section 3(A), figure 4 shows six α-domains for the sine wave, of which two are identical to the ordinary time domain and frequency domain, shown in figures 4(a) and 4(f) respectively. The remaining figures 4(b), 4(c), 4(d) and 4(e) show the FRFT of the sine wave at different values of α. The correlation results for the sine wave in the α-domain with the TD signal, and with the (α-1)-domain signal, are shown in figures 5(a) and 5(b) respectively.
(a) A=0/α=0 (b) A=0.2/α=π/10
(c) A=0.4/α=π/5 (d) A=0.6/α=3π/10
(e) A=0.8/α=2π/5 (f) A=1/α=π/2
Figure 4: FRFT of sine wave for different values of angle α/A. Solid line: real part. Dashed line: imaginary part.
In figure 5(a) it is clear that for 1° < α < 10° the α-domain signal for the sine wave is somewhat correlated with the TD signal. But for 10° < α < 90° the α-domain signal is uncorrelated with the TD signal; in these domains the correlation coefficient tends to zero.
Figure 5(b) shows the correlation results for the α-domain against the (α-1)-domain. For 1° < α < 90° these domains are equally correlated with each other, yet only weakly correlated with the TD signal.
(a)
(b)
Figure 5: Correlation results for Sine wave in α-domain.
C. Analysis of Gaussian signal in FRFD
A Gaussian signal has a bell-shaped curve. Gaussian tuning curves are extensively used because their analytical expression can be easily manipulated in mathematical derivations. Mathematically, a Gaussian signal is defined as:

x(t) = e^(-t²)
Having discussed two types of signal, the rectangular pulse and the sine wave, in sections 3(A) and 3(B), the third point of interest is the Gaussian signal, because Gaussian functions are widely used in statistics, where they describe normal distributions, in signal processing, where they define Gaussian filters, and in many other applications.
Finally, we compute the FRFT of a Gaussian signal to analyze it in the FRFDs. In figure 6 we show six different FRFDs for the Gaussian signal, of which two domains are again identical to the TD and FD, and the remaining four are intermediate domains between the TD and FD.
For a Gaussian signal the Fourier transform is again a Gaussian signal. Looking from 6(a) to 6(f), the variation from TD to FD is easy to follow. Our point of interest is how the FRFD signals are correlated with each other. For this, figure 7 has two plots, which show the correlation of the α-domain signal with the TD signal and with the (α-1)-domain signal in figures 7(a) and 7(b) respectively.
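The self-similarity of the Gaussian under the Fourier transform is easy to verify numerically. The sketch below uses the normalized Gaussian e^(-πt²), which is exactly its own Fourier transform (the e^(-t²) form above transforms to a Gaussian of a different width); the sampling step is an arbitrary choice.

```python
import numpy as np

dt = 0.05
t = np.arange(-512, 512) * dt                 # 1024 samples, t = 0 at index 512
x = np.exp(-np.pi * t ** 2)                   # Gaussian, its own Fourier transform

# Approximate the continuous Fourier transform with an FFT:
# ifftshift centers t = 0 at index 0, fftshift reorders the output,
# and the factor dt scales the Riemann sum.
X = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(x))) * dt
f = np.fft.fftshift(np.fft.fftfreq(len(t), dt))

# The magnitude spectrum matches exp(-pi f^2) to numerical precision.
err = np.max(np.abs(np.abs(X) - np.exp(-np.pi * f ** 2)))
```
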
(a) A=0/α=0 (b) A=0.2/α=π/10
(c) A=0.4/α=π/5 (d) A=0.6/α=3π/10
(e) A=0.8/α=2π/5 (f) A=1/α=π/2
Figure 6: FRFT of Gaussian signal for different values of angle α/A. Solid line: real part. Dashed line: imaginary part.
(a)
(b)
Figure 7: Correlation results for Gaussian signal in α-domain.
Analyzing these three signals in the FRFDs, it is clear that the α-domain signal is highly correlated with the (α-1)-domain, as can be seen from figures 3(b), 5(b) and 7(b). These figures show that over the interval 1° < α < 90° adjacent domains are similar to each other. And looking at figures 2(b) to 2(e), 4(b) to 4(e) and 6(b) to 6(e), we can see that each FRFD signal is just a scaled version of the previous FRFD.
IV. CONCLUSION
We have discussed the FRFT concept and some of its properties, examined the behavior of three different signals in the FRFD, and presented these signals in the FRFD. The correlation of the α-domain signal with the TD signal and with the (α-1)-domain signal has been discussed, showing that each α-domain signal is just a scaled version of the previous α-domain signal. This graphically demonstrates the scaling property of the FRFT discussed in [6].
The work presented in this paper is useful for further research, and the graphical demonstration of the scaling property of the FRFT helps in understanding how the FRFT transforms a time-domain signal into a frequency-domain signal.
REFERENCES
[1] V. Namias, “The fractional order Fourier transform and its application to quantum mechanics,” J. Inst. Math. Applicat., vol. 25, pp. 241–265, 1980.
[2] A. C. McBride and F. H. Kerr, “On Namias’ fractional Fourier transforms,” IMA J. Appl. Math., vol. 39, pp. 159–175, 1987.
[3] H. M. Ozaktas, B. Barshan, D. Mendlovic, and L. Onural, “Convolution, filtering, and multiplexing in fractional Fourier domains and their relationship to chirp and wavelet transforms,” J. Opt. Soc. Amer. A, vol. 11, pp. 547–559, Feb. 1994.
[4] R. G. Dorsch, A. W. Lohmann, Y. Bitran, and D. Mendlovic, “Chirp filtering in the fractional Fourier domain,” Appl. Opt., vol. 33, pp. 7599–7602, 1994.
[5] A. W. Lohmann and B. H. Soffer, “Relationships between the Radon–Wigner and fractional Fourier transforms,” J. Opt. Soc. Amer. A, vol. 11, pp. 1798–1801, June 1994.
[6] L. B. Almeida, “The fractional Fourier transform and time-frequency representation,” IEEE Trans. Signal Processing, vol. 42, pp. 3084–3091, Nov. 1994.
[7] Haldun M. Ozaktas, Orhan Arikan, M. A. Kutay and G. Bozdag, “Digital Computation of the Fractional Fourier Transform”, IEEE Trans. Signal Processing vol. 44, pp. 2141-2150, Sept. 1996.
Ajmer Singh (M'22) was born in Punjab, India. He is pursuing a master's degree in signal processing at Lovely Professional University, Punjab, India (2011), and is currently doing his dissertation under the supervision of Mr. Nikesh Bajaj, Assistant Professor in the Department of Electronics. His research interests include various aspects of FRFD filter design.
Nikesh Bajaj received his bachelor's degree in Electronics & Telecommunication from the Institute of Electronics and Telecommunication Engineers, and his master's degree in Communication & Information Systems from Aligarh Muslim University, India. He is now working at LPU as an Assistant Professor in the Department of ECE. His research interests include cryptography, cryptanalysis, and signal & image processing.
Parzen-Cos6 (πt) combinational window family based QMF bank
Narendra Singh (*)
and Rajiv Saxena,
Jaypee University of Engineering and Technology, Raghogarh, Guna (MP)
(*) Corresponding Author: [email protected] ; [email protected]
ABSTRACT
A new approach for the design of prototype
FIR filter of two-channel Quadrature Mirror Filter
(QMF) bank is introduced. Three variable windows,
viz., Blackman window, Kaiser window, and
Parzen-cos6 (πt) (PC6) window are used to design
prototype filters. The design equations of these
window functions based filter banks are also given
in this article. Reconstruction error, which is used
as an objective function, is minimized by optimizing
the cutoff frequency of designed prototype filters.
The Gradient based iterative optimization
algorithm is used. The performances of filter banks
designed with these window functions are compared
on the basis of reconstruction error. The
combinational window, PC6 window provides the
QMF bank with better reconstruction error.
Keywords: QMF, Filter Bank, Combinational
Window.
1. INTRODUCTION
Window functions are widely used in digital signal processing for applications in signal analysis and estimation, digital filter design, and speech processing. Digital filter banks are used in a number
of communication applications. The theory and
design of QMF bank was first introduced by
Johnston [1]. These filter banks find wide
applications in many signal processing fields such
as trans-multiplexers [2]-[3], equalization of
wireless communication channel [4], sub-band
coding of speech and image signals [5]-[8], sub-
band acoustic echo cancellation [9]-[12].
In a QMF bank the input signal x(n) is split into two sub-band signals of equal bandwidth by the low-pass and high-pass analysis filters H0(z) and H1(z) respectively. These sub-band signals are down-sampled by a factor of two to reduce processing complexity. At the output, the corresponding synthesis bank applies two-fold interpolation to both sub-band signals, followed by the synthesis filters G0(z) and G1(z). The outputs of the synthesis filters are combined to obtain the reconstructed signal y(n). This reconstructed output is not a perfect replica of the input signal x(n), due to three types of error: aliasing error, amplitude error, and phase error [12]-[13]. Since the inception of QMF banks, most researchers have concentrated on eliminating or minimizing these errors to obtain near-perfect reconstruction (NPR). In several design methods [14]-[18] aliasing and phase distortion have been eliminated completely by deriving all the analysis and synthesis filters from a single low-pass prototype: an even-order, symmetric, linear-phase FIR filter. Amplitude distortion cannot be eliminated completely, but it can be minimized using optimization techniques [12]-[13]. Figure 1 shows the two-channel quadrature mirror filter bank designed by Johnston [1], in which a Hanning window was used to design the low-pass prototype FIR filter and a nonlinear optimization technique was employed to minimize the reconstruction error.
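The analysis/synthesis chain described above can be sketched in a few lines of Python. The example below is a generic two-channel QMF bank, not Johnston's design: it takes H1(z) = H0(-z) and chooses the synthesis filters as G0 = H0 and G1 = -H1, which cancels the aliasing term structurally; with the Haar prototype this particular bank even reconstructs the input exactly, delayed by one sample.

```python
import numpy as np

def qmf_bank(x, h0):
    # Two-channel QMF bank: analysis filtering, 2-fold decimation,
    # 2-fold interpolation, synthesis filtering.  H1(z) = H0(-z);
    # choosing G0 = H0 and G1 = -H1 cancels the aliasing term.
    h1 = h0 * (-1.0) ** np.arange(len(h0))
    v0 = np.convolve(x, h0)[::2]                 # low band, decimated
    v1 = np.convolve(x, h1)[::2]                 # high band, decimated
    u0 = np.zeros(2 * len(v0)); u0[::2] = v0     # two-fold interpolation
    u1 = np.zeros(2 * len(v1)); u1[::2] = v1
    return np.convolve(u0, h0) - np.convolve(u1, h1)

# Haar prototype: this bank gives y[n] = x[n-1] exactly.
a = 1.0 / np.sqrt(2.0)
x = np.array([1.0, -2.0, 3.0, 0.5])
y = qmf_bank(x, np.array([a, a]))
```

For longer prototypes (such as Johnston's) the same structure leaves only amplitude distortion, which is what the optimization in Section 4 minimizes.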
This paper uses the algorithm as proposed in
Creusere and Mitra [11] with certain modifications to
optimize the objective function. The combinational
window functions [19]-[21] with large SLFOR have
been devised and used for designing FIR prototype
filters. Due to the closed form expressions of the
window functions, the optimization procedure gets
simplified. Finally, a comparative evaluation has
been done with reconstruction error and far-end
attenuation being selected as the main figure of
merit.
Fig. 1 Two-channel quadrature mirror filter bank
2. FILTER DESIGN USING WINDOW
TECHNIQUE
The most straightforward technique for designing
FIR filters is to use a window function to truncate
and smooth the impulse response of an ideal zero-
phase infinite-impulse-response filter. This impulse
response can be obtained by using the Fourier series
expansion.
The impulse response of the ideal low-pass filter with cutoff frequency ωc is given as

h_id(n) = sin(ωc n) / (π n),  -∞ < n < ∞   (1)

h_id(n) is doubly infinite, not absolutely summable, and therefore unrealizable [15]. Hence the shifted impulse response of h_id(n) is

h_id(n) = sin(ωc (n - 0.5(N - 1))) / (π (n - 0.5(N - 1))),  0 ≤ n ≤ N - 1   (2)
Direct truncation of the infinite-duration impulse response, needed to make a causal filter, results in large pass-band and stop-band ripples near the transition band. These undesired effects are the well-known Gibbs phenomenon. However, they can be significantly reduced by an appropriate choice of smoothing function w(n). Hence a filter p(n) of order N has the form [15-17]

p(n) = h_id(n) w(n)   (3)

where w(n) is the time-domain weighting function, or window function. Window functions are of limited duration in the time domain, while approximating a band-limited function in the frequency domain.
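Equations (1)-(3) translate directly into code. The sketch below (Python, illustrative names) builds p(n) by multiplying the shifted ideal low-pass response by a window; a Hamming window stands in here for the windows of Table 1.

```python
import numpy as np

def fir_lowpass(N, wc, window):
    # p(n) = h_id(n) * w(n): shifted ideal low-pass impulse response
    # (cutoff wc in rad/sample) multiplied by a length-N window.
    n = np.arange(N)
    m = n - (N - 1) / 2.0                          # linear-phase delay
    h_id = (wc / np.pi) * np.sinc(wc * m / np.pi)  # sin(wc*m)/(pi*m), safe at m = 0
    return h_id * window

h = fir_lowpass(64, np.pi / 2, np.hamming(64))     # half-band prototype
```

An even filter length keeps the design Type II linear phase, which forces a transmission zero at ω = π, exactly the property the QMF prototype needs.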
3. WINDOW FUNCTIONS
The window functions used in designing the prototype FIR filter for the QMF banks are given in Table 1. The table includes the expressions of the variable window functions, the expressions of the variables (β, γ) that define the window families, and the expressions of the window shape parameter (D) for the Kaiser, PC6 and Blackman windows. A filter designed using one of these window functions is specified by three parameters: cutoff frequency (fc), filter order (N), and window shape parameter (β or γ). For a desired stop-band attenuation (ATT) and transition bandwidth, the order of the filter N can be estimated by

N = D / ΔFs + 1   (4)

where D is the window shape parameter, ΔFs = (fs - fp)/Fs is the normalized transition width, and Fs is the sampling frequency in hertz. The window shape parameter is determined by the desired stop-band attenuation.
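As a concrete instance of (4), the sketch below evaluates the published Kaiser-window formulas for the shape parameter β, the parameter D, and the resulting order estimate (Python; the function name is illustrative).

```python
import numpy as np

def kaiser_params(att, delta_fs):
    # Kaiser-window design: shape parameter beta and order estimate
    # N = D/dFs + 1 for a desired stop-band attenuation ATT (in dB)
    # and normalized transition width delta_fs = (fs - fp)/Fs.
    if att > 50:
        beta = 0.1102 * (att - 8.7)
    elif att > 21:
        beta = 0.5842 * (att - 21) ** 0.4 + 0.07886 * (att - 21)
    else:
        beta = 0.0
    D = 0.9222 if att <= 21 else (att - 7.95) / 14.36
    N = int(np.ceil(D / delta_fs + 1))
    return beta, N

beta, N = kaiser_params(50, 0.1)   # e.g. 50 dB attenuation, 10% transition width
```
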
Table 1: Window Functions with Filter Design Equations

1. Blackman window:
   w(n) = 0.42 + 0.5 cos(πn/M) + 0.08 cos(2πn/M), for -M ≤ n ≤ M

2. Kaiser window:
   w(n) = I0(β √(1 - (n/M)²)) / I0(β), for -M ≤ n ≤ M
   Window variable:
   β = 0, for ATT ≤ 21
   β = 0.5842 (ATT - 21)^0.4 + 0.07886 (ATT - 21), for 21 < ATT ≤ 50
   β = 0.1102 (ATT - 8.7), for ATT > 50
   Window shape parameter:
   D = 0.9222, for ATT ≤ 21
   D = (ATT - 7.95) / 14.36, for ATT > 21

3. Parzen-cos6(πt) combinational window (PC6):
   w_PC6(n) = l(n) d(n), for |n| ≤ N/2; 0 otherwise
   where
   l(n) = 1 - 24 (n/N)² (1 - 2|n|/N), for |n| ≤ N/4
   l(n) = 2 (1 - 2|n|/N)³, for N/4 < |n| ≤ N/2
   d(n) = cos⁶(πn/N), for |n| ≤ N/2
   Window variable: γ = a + b·ATT + c·ATT², with
   a = 8.15414, b = -0.236709, c = 0.00218617, for 30.32 ≤ ATT ≤ 51.25
   a = 21.269, b = -0.605789, c = 0.00434808, for 51.25 < ATT ≤ 68.69
   Window shape parameter: D = a + b·ATT + c·ATT², with
   a = 1.82892, b = -0.027548, c = 0.00157699, for 30.32 ≤ ATT ≤ 43.60
   a = 1.67702, b = 0.0450505, c = 0, for 43.60 < ATT ≤ 49.44
   a = 85.4733, b = -3.41969, c = 0.035784, for 49.44 < ATT ≤ 57.48
   a = -8.60006, b = 0.477004, c = -0.00355655, for 57.48 < ATT ≤ 68.69
4. OPTIMIZATION ALGORITHM
The amplitude distortion in reconstructed
signal can be minimized by optimization
techniques. The gradient based iterative
optimization algorithm is described in this section.
a. Objective Function
To get a high-quality reconstructed output y(n), the frequency response of the low-pass prototype filter, H(e^{j2πf/Fs}), must satisfy the following [13], assuming filters with an even number of coefficients:

|H(e^{j2πf/Fs})|² + |H(e^{j2π(f - Fs/2)/Fs})|² = 1, for 0 ≤ f ≤ Fs/2   (5)

|H(e^{j2πf/Fs})| = 0, for f > Fs/4   (6)

By satisfying (6) exactly, the aliasing error between nonadjacent bands is eliminated. Similarly, the amplitude distortion is eliminated by satisfying (5) [11]. Phase distortion is removed by selecting an even-order FIR prototype filter [1, 4]. Constraints (5) and (6) cannot be satisfied exactly for a finite filter order, so it is necessary to design a
Johnston [1] combined the pass band ripple energy
and out-of-band energies into a single cost function
having nonlinear nature and then minimized it using
Hooke and Jeaves algorithm [25]. Creusere and
Mitra [11] designed filters using Parks–McClellan
algorithm that approximately satisfied (5) and (6).
The filter length, relative error weighting, and stop
band edge were fixed before optimization procedure
started, while the pass-band edge was adjusted to
minimize the objective function ε.
ε = max | |H(e^{j2πf/Fs})|² + |H(e^{j2π(f - Fs/2)/Fs})|² - 1 |, for 0 ≤ f ≤ Fs/4   (7)
b. Algorithm
A gradient-based linear optimization algorithm (given in Annexure 1) is used to adjust the cutoff frequency. Filter design parameters and optimization control parameters such as step size (step), target error (t-error), direction (dir) and previous error (prev-error) are initialized. The prototype filter is designed using the windowing technique. In each iteration, fc of p(n) and the reconstruction error (error), which is also the objective function, are computed. If the error increases compared with the previous iteration (prev-error), the step size (step) is halved and the search direction (dir) is reversed. This step size and direction are used to re-compute fc for a new prototype filter. The optimization process halts when the error of the current iteration is within the specified tolerance (t-error), initialized before the optimization begins, or when prev-error equals error [26].
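The step-halving search just described can be sketched as follows. This is a minimal Python illustration, assuming a Kaiser-window prototype (in place of the Table 1 windows) and the power-complementarity deviation of (7) as the reconstruction-error objective; names and defaults are illustrative.

```python
import numpy as np

def prototype(fc, N, beta):
    # Kaiser-windowed low-pass prototype; fc is the cutoff in
    # cycles/sample (0.25 = a quarter of the sampling rate).
    m = np.arange(N) - (N - 1) / 2.0
    return 2 * fc * np.sinc(2 * fc * m) * np.kaiser(N, beta)

def recon_error(h, nfft=1024):
    # Peak deviation from |H(f)|^2 + |H(f - Fs/2)|^2 = 1 over 0..Fs/4.
    H2 = np.abs(np.fft.fft(h, nfft)) ** 2
    k = np.arange(nfft // 4 + 1)
    return np.max(np.abs(H2[k] + H2[nfft // 2 - k] - 1.0))

def optimize_fc(N, beta, fc0=0.25, step=0.01, t_error=1e-4, iters=200):
    # Step-halving search: accept a move that reduces the error,
    # otherwise halve the step and reverse the direction.
    fc, dirn = fc0, 1.0
    prev = recon_error(prototype(fc, N, beta))
    for _ in range(iters):
        cand = fc + step * dirn
        err = recon_error(prototype(cand, N, beta))
        if err > prev:
            step /= 2.0
            dirn = -dirn
        else:
            fc, prev = cand, err
        if prev <= t_error or step < 1e-9:
            break
    return fc, prev
```

`optimize_fc(32, 4.53)` nudges the cutoff away from 0.25 until the band crossover sits near the -3 dB point, which is what the flowchart in Annexure 1 does.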
5. PERFORMANCE ANALYSIS OF QMF
QMF banks were designed by using window
functions described in Table-1 and optimization
algorithm in Annexure-1. In these design examples
the stop-band edge frequency and pass-band edge
frequency are taken as Fs/4 and Fs/6 respectively. In
Table 2, the value of stop band attenuation was kept
at 50 dB, resulting in different filter orders for
different window functions, which clearly indicates
the improvement in reconstruction error is obtained
with PC6 window function.
In Table-3, the results corresponding to filter
order are shown. In Table-4, a comparison is made
of the optimum performance that can be attained
with the three window functions. Apart from the
reconstruction error, the far-end attenuation
(amplitude of the last ripple in the stop band) is also
selected as one of the figures of merits for the
comparative study. This parameter is of significance
when the signal to be filtered has great
concentration of spectral energy. In a sub-band
coding, the filter is intended to separate out various
frequency bands for independent processing. In the
case of speech, e.g. the far-end rejection of the
energy in the stop band should be more so that the
energy leak from one band to other will be
minimum. As the stop band attenuation increases
the value of the reconstruction error decreases, as is evident from Tables 2 and 3. The PC6 window-designed FIR filter gives better performance than the other window functions.
Table 2: Performance of QMF filter at 50 dB stop-band attenuation

Window function | Reconstruction error (dB) | Filter order (N) | Far-end attenuation (dB)
Blackman window | 0.6509 | 105 | 85
Kaiser window   | 0.3208 | 90  | 107
PC6 window      | 0.1060 | 22  | 72

Table 3: Optimum performance in terms of order

Window function | Reconstruction error (dB) | Stop-band attenuation (dB) | Filter order (N) | Far-end attenuation (dB)
Blackman window | 0.0049 | 108   | 86 | 102
Kaiser window   | 0.0097 | 88.00 | 90 | 107
PC6 window      | 0.0120 | 55.00 | 22 | 72

Table 4: Performance in terms of far-end attenuation

Window function | Reconstruction error (dB) | Stop-band attenuation (dB) | Filter order (N) | Far-end attenuation (dB)
Blackman window | 0.0785 | 52.168 | 45 | 56
Kaiser window   | 0.0473 | 52.168 | 37 | 66
PC6 window      | 0.0290 | 52.168 | 73 | 73
6. CONCLUSION
A simple algorithm for designing the low-pass prototype filters for QMF banks has been used to optimize the reconstruction error by varying the filter cutoff frequency. Prototype filters designed using the high-SLFOR combinational window, Kaiser window, and Blackman window functions have been compared. The combinational window function provides better far-end rejection of the stop-band energy. This feature helps to reduce the aliasing energy leaking into a sub-band from the signal in the other sub-band.
References
1. Johnston, J. D.: A filter family designed for
use in quadrature mirror filter banks. In:
Proceedings of IEEE International
Conference Acoustics, Speech and Signal
Processing, Denver, 291–294(1980)
2. Bellanger, M.G., Daguet, J.L.: TDM-FDM
transmultiplexer: Digital Poly phase and
FFT. IEEE Trans. Commun. 22(9) ,1199-
1204 (1974)
3. Vetterli, M.: Perfect transmultiplexers. In: Proceedings of IEEE International
6
Conference on Acoustics, Speech, and Signal
Processing, vol. 4, 2567- 2570 (1986).
4. Gu, G., Badran, E.F.: Optimal design for
channel equalization via the filter bank
approach. IEEE Trans. Signal Process.52
(2),536-544 (2004)
5. Esteban, D., Galand, C.: Application of
quadrature mirror filter to split band voice
coding schemes. In: Proceedings of IEEE
International Conference on Acoustics,
Speech, and Signal Processing (ASSP), 191-
195(1977)
6. Crochiere, R.E.: Sub–band coding. Bell Syst.
Tech. J., 9, 1633-1654(1981)
7. Vetterli, M.: Multidimensional sub-band coding: Some theory and algorithms. Signal Process. 6, 97-112 (1984)
8. Woods,J.W.,Neil,S.D.O.:Sub-band coding of
images. IEEE Trans Acoustic. Speech and
Signal Process. (ASSP)-34 (10), 1278-
1288(1986)
9. Liu,Q.G.,Champagne,B.,Ho,D.K.C.:Simple
design of over sampled uniform DFT filter
banks with application to sub-band acoustic
echo cancellation. Signal Process, 80(5),831-
847(2000)
10. Crochiere,R.E., Rabiner , L. R.: Multirate
digital signal processing. Prentice–
Hall(1983)
11. Creusere, C.D., Mitra, S.K.: A simple
method for designing highquality prototype
filters for M band pseudo QMF banks. IEEE
Trans. Signal Process. 43(4), 1005–1007
(1995)
12. Mitra, S.K.: Digital signal processing: A
computer based Approach, TMH,
ch.7&10(2001)
13. Vaidyanathan, P.P.: Multirate systems and
filter banks. Prentice- Hall, Englewood
Cliffs, NJ (1993)
14. Jain, V.K., Crochiere, R.E.: Quadrature mirror filter design in the time domain. IEEE Trans. Acoustics, Speech and Signal Process. ASSP-32 (4), 353-361 (1984)
15. Xu, H., Lu, W.S., Antoniou, A.: An improved method for the design of FIR quadrature mirror-image filter banks. IEEE Trans. Signal Process. 46 (6), 1275-1281 (1998)
16. Goh, C. K., Lim, Y. C.: An efficient
algorithm to design weighted minimax PR
QMF banks. IEEE Trans. Signal
Process.47(12), 3303-3314)(1999)
17. Chen, C.K., Lee J.H.: Design of quadrature
mirror filters with linear phase in frequency
domain. IEEE Trans Circuits System, 39 (9),
593-605(1992)
18. Lin, Yuan-Pei, Vaidyanathan, P. P.: A Kaiser
window approach for the design of prototype
filters of cosine modulated filterbanks. IEEE
Signal Processing Lett., 5, 132–134 (1998).
19. Saxena, R.: Synthesis and characterization of
new window families with their applications,
Ph. D. Thesis, Electronics and Computer
Engineering Department, University of
Roorkee, Roorkee, India (1997).
20. Sharma, S. N., Saxena, R., Jain, A.: FIR
digital filter design with Parzen and cos6 (πt)
combinational window family, Proc. Int.
Conf. Signal Processing, Beijing, China,
IEEE Press, 92–95 (2002).
21. Sharma, S. N., Saxena, R., Saxena, S. C.:
Design of FIR filters using variable window
families – A comparative study, J. Indian
Inst. Sci., 84, 155–161 (2004).
22. DeFatta, D. J., Lucas J. G., Hodgkiss, W. S.
Digital signal processing: A system design
approach, Wiley (1988).
23. Gautam, J. K., Kumar, A., Saxena, S.C.:
WINDOWS: A tool in signal processing.
IETE Tech. Rev., vol. 12(3), 217-226
(1995).
24. Paulo, S. R. Diniz, Eduardo A. B. da Silva
and Sergio L. Netto.: Digital signal
processing: System, analysis and design,
Cambridge University Press (2003).
25. Hooke, R., Jeaves, T.: Direct search solution
of numerical and statistical problems, J.
Assoc. Comp. Machines, 8, 212–229 (1961).
26. Jain, A., Saxena, R., Saxena, S.C.: An
improved and simplified design of cosine
modulated pseudo-QMF filter banks. Digit.
Signal Process. 16(3), 225–232 (2006).
Annexure 1
Flowchart for the gradient-based optimization technique, rendered as steps:

1. Initialize: pass-band and stop-band frequencies, m-error, step, dir, cutoff frequency (ωc), filter order, and window coefficients.
2. Specify the desired stop-band and pass-band ripple.
3. Design the prototype filter and determine the reconstruction error (|error|).
4. If |error| > |m-error|: set |prev-error| = |error| and ωc* = ωc + (step × dir); redesign the prototype filter using ωc* and determine the new |error|; if |error| > |prev-error|, set step = step/2 and dir = -dir.
5. Repeat step 4 until |error| ≤ |m-error| or |prev-error| = |error|, then stop.
Performance Analysis of Sub Carrier Spacing Offset in Orthogonal Frequency Division Multiplexing System
Shivaji Sinha, Member IETE, Rachna Bhati, Dinesh Chandra, Member IEEE & IETE
email: [email protected], [email protected], [email protected]
Department of Electronics & Communication Engineering, JSSATE Noida
Abstract — A very important aspect in OFDM is time
and frequency synchronization. In particular, frequency
synchronization is the basis of the orthogonality between
frequencies. Loss in frequency synchronization is caused
due to Doppler shift because of large number of
frequencies closely spaced next to each other in OFDM
frame. So the intersymbol interference (ISI) and Inter
Carrier Interference(ICI) are also produced. This paper
presents the effects of frequency offset error in OFDM
system introduced by the fading sensitive channel.
Performance of the OFDM system is evaluated using r.m.s.
value of error across all subcarriers for different values of
the subcarrier spacing, SNR degradation and received
signal constellation in Matlab environment. The
performance is compared under various conditions of
noise variance and frequency Offset.
Index Terms— Cyclic Prefix, FFT, Frequency Offset,
ICI, IFFT , OFDM, SNR
I. INTRODUCTION
High data rate transmission is one of the major
challenges in modern communications. OFDM, which is seen as the future technology for wireless local area systems and is used as part of the IEEE 802.11a standard, provides high-data-rate transmission [1]. The need for
OFDM (Orthogonal Frequency Division Multiplexing)
system came from the idea of efficient use of spectrum
as well as bandwidth where the data transmission
becomes four times faster than the present one. OFDM
supports the technologies like DAB (Digital Audio
Broadcasting) or DVB (Digital Video broadcasting). It
is a special case of multicarrier transmission, where a
single data stream is transmitted over a number of lower
rate subcarriers. All the subcarriers within the OFDM
signal are time and frequency synchronized to each
other, allowing the interference between subcarriers to
be carefully controlled [2] [3]. In systems based on the
IEEE 802.11a standard, the Doppler effects are
negligible when compared to the frequency spacing of
more than 300 kHz. What is more important in this
situation is the frequency error caused by imperfections
in oscillators at the modulator and the demodulator.
These frequency errors cause a frequency offset
comparable to the frequency spacing, thus lowering the
overall SNR [3].
II. OFDM SYSTEM IMPLEMENTATION
In OFDM, a frequency-selective channel is subdivided
into narrower flat fading channels. Although the
frequency responses of the channels overlap with each
other as shown in Figure 1, the impulse responses are
orthogonal at the carriers, because the nulls of each impulse response coincide with the maximum values of
another impulse response and thus the channels can be
separated [3].
Fig 1. Orthogonality Principle
In OFDM the data are transmitted in blocks of length N. The nth data block {Xn[0], …, Xn[N-1]} is transformed into the signal block {xn[0], …, xn[N-1]} by the IFFT:

xn[m] = (1/N) Σ_{k=0}^{N-1} Xn[k] e^{j2πkm/N}, m = 0, …, N-1   (1)

Each frequency 2πk/N, k = 0, …, N-1, represents a carrier.
A basic OFDM implementation scheme is
shown in Figure 2. Data at each sub-carrier (Xm) are
input into the inverse fast Fourier transform (IFFT) to
be converted to time-domain data (xm) and after parallel
to serial conversion (P/S), a cyclic prefix is added to
prevent ISI. At the receiver, the cyclic prefix is removed, because it contains no information symbols.
After the serial-to-parallel (S/P) conversion, the
received data in the time domain (ym) are converted to
the frequency domain (Ym) using the fast Fourier
transform (FFT) algorithm.
Fig 2. OFDM System
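The IFFT/CP transmitter and FFT receiver of Figure 2 reduce to a few lines. A minimal Python sketch (illustrative names; the serial/parallel conversion and the channel are omitted):

```python
import numpy as np

def ofdm_mod(X, cp_len):
    # IFFT of one block of subcarrier symbols, then prepend the cyclic
    # prefix (the last cp_len time-domain samples).
    x = np.fft.ifft(X)
    return np.concatenate([x[-cp_len:], x])

def ofdm_demod(y, cp_len):
    # Strip the cyclic prefix and FFT back to subcarrier symbols.
    return np.fft.fft(y[cp_len:])

# Round trip over an ideal channel: 64 QPSK subcarriers, 16-sample CP.
rng = np.random.default_rng(0)
X = (2 * rng.integers(0, 2, 64) - 1) + 1j * (2 * rng.integers(0, 2, 64) - 1)
y = ofdm_mod(X, 16)
```

Over an ideal channel `ofdm_demod(y, 16)` returns `X` exactly, since the FFT inverts the IFFT once the prefix is discarded.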
III. FREQUENCY OFFSET & FREQUENCY SYNCHRONIZATION ALGORITHM
The first source of frequency offset is relative motion between transmitter and receiver (Doppler shift, or frequency drift), given by

Δf = (v/c) fc   (2)

where fc is the carrier frequency, v is the relative velocity between transmitter and receiver, and c is the speed of light. The second source is frequency error in the oscillators. Single-carrier systems
are more sensitive to timing offset errors while OFDM
generally exhibits good performance in the presence of
timing errors. In practice, the frequency, which is the
time derivative of the phase, is never perfectly constant,
thereby causing ICI in OFDM receivers. One of the
destructive effects of frequency offset is loss of
orthogonality. The loss of orthogonality causes the ICI
as shown in Figure 3.
Fig 3. ICI in OFDM
The areas, colored with yellow, show the ICI. When the
centers of adjacent subcarriers are shifted because of the
frequency offset, the adjacent subcarriers nulls are also
shifted from the center of the other subcarrier. The
received signal contains samples from this shifted
subcarrier, leading to ICI [6]. The destructive effects of
the frequency offset can be corrected by estimating the
frequency offset itself and applying proper correction.
This calls for the development of a frequency
synchronization algorithm. Three types of algorithms
are used for frequency synchronization: algorithms that
use pilot tones for estimation (data-aided), algorithms
that process the data at the receiver (blind), and
algorithms that use the cyclic prefix for estimation
[4 ][5].
Among these algorithms, blind techniques are
attractive because they do not waste bandwidth to
transmit pilot tones. However, they use less information
at the expense of added complexity and degraded
performance [6]. The degradation of the SNR, Dfreq, caused by the frequency offset, is approximated as

Dfreq ≈ (10 / (3 ln 10)) (π Δf T)² (Eb/N0)   (3)

where Δf is the frequency offset, T is the symbol duration in seconds, Eb is the energy per bit of the OFDM signal, and N0 is the one-sided noise power spectral density (PSD) [6][7].
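Approximation (3) is straightforward to evaluate. The sketch below assumes the commonly quoted small-offset form with Eb/N0 given in dB; the function name is illustrative.

```python
import numpy as np

def snr_degradation_db(delta_f, T, ebno_db):
    # D_freq ~ (10 / (3 ln 10)) * (pi * delta_f * T)^2 * Eb/N0, in dB;
    # valid for frequency offsets delta_f small relative to 1/T.
    ebno = 10.0 ** (ebno_db / 10.0)
    return (10.0 / (3.0 * np.log(10.0))) * (np.pi * delta_f * T) ** 2 * ebno

# e.g. a 1% offset of a 312.5 kHz subcarrier spacing, T = 3.2 us
d = snr_degradation_db(0.01 * 312.5e3, 3.2e-6, 20)
```
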
IV. SIMULATION PARAMETERS
First we have analyzed the impact of frequency offset
resulting in Inter Carrier Interference (ICI) while
receiving an OFDM modulated symbol. The analysis is
accompanied by Matlab simulation.
TABLE 1
R.M.S. ERROR RELATED PARAMETERS

Parameter | Value
FFT size | 64
No. of data subcarriers | 52
No. of bits per OFDM symbol | 52
No. of symbols | 1
Modulation scheme | BPSK
We generated an OFDM symbol with all subcarriers BPSK modulated, then added a frequency offset together with Gaussian noise of unit variance and zero mean to give Eb/N0 = 30 dB. We then found the difference between the desired and actual constellations and computed the r.m.s. value of the error across all subcarriers. This was repeated for different values of frequency offset. The parameters are listed in Table 1.
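The experiment just described can be sketched as follows. This Python illustration assumes a hypothetical 52-of-64 subcarrier mapping (DC unused) and models the frequency offset as a complex-exponential rotation of the time-domain samples; the noise term is omitted to isolate the ICI.

```python
import numpy as np

rng = np.random.default_rng(0)
nfft, used = 64, np.r_[1:27, 38:64]       # 52 data carriers, DC and band edges empty
X = np.zeros(nfft, dtype=complex)
X[used] = 2.0 * rng.integers(0, 2, len(used)) - 1.0   # BPSK
x = np.fft.ifft(X)

def rms_error(offset_frac):
    # Apply a carrier frequency offset expressed as a fraction of the
    # subcarrier spacing, demodulate, and measure the r.m.s.
    # constellation error across the data subcarriers.
    n = np.arange(nfft)
    y = x * np.exp(1j * 2 * np.pi * offset_frac * n / nfft)
    Y = np.fft.fft(y)
    return np.sqrt(np.mean(np.abs(Y[used] - X[used]) ** 2))
```

Sweeping `offset_frac` and plotting 20·log10 of the result reproduces the shape of the simulated curves in Figures 4 and 5.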
The parameters taken for the SNR degradation and received-signal calculations are listed in Table 2.
TABLE 2
OFDM TIMING RELATED PARAMETERS

Parameter | Value
No. of data subcarriers | 48
No. of pilot carriers | 4
Total number of subcarriers | 52
Subcarrier frequency spacing (Δf) | 0.3125 MHz
IFFT/FFT period (TFFT) | 3.2 µs (1/Δf)
Preamble duration | 16 µs
Signal duration, BPSK-OFDM symbol | 4 µs (TGI + TFFT)
Guard interval (GI) duration (TGI) | 0.8 µs (TFFT/4)
Modulation scheme | QPSK
V. RESULTS ANALYSIS
In Figures 4 and 5 we have calculated the SNR loss for different values of subcarrier spacing. The simulated results are slightly better than the theoretical ones because the simulated results are computed using the average error over all subcarriers (and the subcarriers at the edges undergo lower distortion). From Figure 5, for Eb/N0 = 30 dB, the theoretical and simulated results overlap at zero frequency offset at -30 dB r.m.s. error.
Fig 4. Error magnitude with frequency offset at Eb/N0 = 20 dB (theory and simulation)
Fig 5. Error magnitude with frequency offset at Eb/N0 = 30 dB (theory and simulation)
Figure 6 shows the calculated SNR degradation due to the frequency offset.

Fig 6. SNR degradation due to frequency offset for different Eb/N0 values (5, 10, 15, and 17 dB)

For smaller SNR values, the
degradation is less than for bigger SNR values as shown
in Figure 6. In order to study the SNR degradation in
OFDM systems we have examined the received signal
with no frequency offset. In this case, the data were sent
by two of the carriers. We have generated 512 random
QPSK signals as data. We send data using only two of
the subcarriers, and the other subcarriers have no data.
Figure 7 shows that for no frequency offset & noise
variance (ideal condition), there is no ICI and no
interference between the data and the other zeros
Fig 7. Received signal constellation (real versus imaginary) with 0% frequency offset
Fig 8. Received signal constellation (real versus imaginary) with 0.3% frequency offset
When a 0.3% frequency offset and a noise variance of 0.002 are introduced in the carrier, their effects are observed in terms of ICI. The result with a 0.3% frequency offset is shown in Figure 8: the signal from neighbouring carriers causes interference, and we obtain a distorted signal constellation at the receiver.
Comparing Figure 9 with Figure 8, it can be seen that the received signal with a 0.5% frequency offset, for the same 0.002 noise variance, is more distorted than the received signal with a 0.3% frequency offset. The simulation results reveal that the distortion in the received signal increases with the offset, as shown in Figures 9 and 10. The effects of the frequency offset can also be observed when data are sent on every subcarrier except one, which is set to zero.
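The mechanism behind these constellation plots can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the FFT size, seed, and offset values are assumptions, and the offset is expressed as a fraction of the subcarrier spacing.

```python
import numpy as np

def ofdm_rms_error(eps, n=64, seed=0):
    """r.m.s. constellation error of one OFDM symbol after a carrier
    frequency offset of eps subcarrier spacings (no noise). Illustrative
    sketch with assumed parameters, not the authors' simulation."""
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, (n, 2))
    data = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
    tx = np.fft.ifft(data) * np.sqrt(n)                 # OFDM modulator (IFFT)
    cfo = np.exp(2j * np.pi * eps * np.arange(n) / n)   # offset phase ramp
    rx = np.fft.fft(tx * cfo) / np.sqrt(n)              # demodulator (FFT)
    return np.sqrt(np.mean(np.abs(rx - data) ** 2))

print(ofdm_rms_error(0.0))    # essentially zero: no offset, no ICI
print(ofdm_rms_error(0.05))   # nonzero: ICI grows with the offset
```

With zero offset the FFT undoes the IFFT exactly; any nonzero offset rotates and couples the subcarriers, which is the ICI seen in the constellations above.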
Fig 9. Received signal constellation (real versus imaginary) with 0.5% frequency offset
Fig 10. Received signal at the zero subcarrier with no frequency offset
If there is a frequency offset in the channel, we cannot receive a zero (no data) at the subcarrier that was set to zero. Figure 10 shows the zero subcarrier with no frequency offset and zero data from all other subcarriers: in this ideal case the demodulated value is zero for the whole time. When a frequency offset is present, its effect is like random noise, which increases with the frequency offset. As shown in Figure 11, the effect of ICI increases considerably when the frequency offset is on the order of 0.4%-0.6%.
Comparing the results in Figures 10 and 11, we can see that as the frequency offset increases, the received signal is distorted more, and for frequency offset values greater than 0.6% the received data are unreadable.
Fig 11. Received signal at the zero subcarrier with 0.4% and 0.6% frequency offset
VI. CONCLUSION
Simulation results demonstrated the distortive effects of frequency offset on OFDM signals; frequency offset affects symbol groups equally. It was also seen that an increase in frequency offset results in a corresponding increase in these distortive effects and causes degradation of the SNR of individual OFDM symbols.
VII. FUTURE WORK
For the system developed above we can implement
three methods for frequency offset estimation: data-
driven, blind and semi-blind. The data-driven and semi-
blind rely on the repetition of data, while the blind
technique determines the frequency offset from the
QPSK data. The use of preambles & cyclic Prefix in
frequency offset estimation can also be implemented.
IX. AUTHOR'S BIOGRAPHY
1. Shivaji Sinha has been Asst. Prof. at J.S.S. Academy of Technical Education, Noida, since Oct. 2003. He is a member of IETE. He did his B.Tech at G.B. Pant Engg. College, Pauri Garhwal, in Electronics & Communication Engineering and his M.Tech in VLSI Design at U.P. Technical University.
2. Rachan Bhati is a B.Tech final-year student at J.S.S. Academy of Technical Education.
3. Dinesh Chandra has been Head & Professor in the Dept. of Electronics & Communication Engineering, J.S.S. Academy of Technical Education, Noida, since April 2001. He is a Fellow Member of IETE and a Member of IEEE. He did his B.Tech at the University of Roorkee (now I.I.T. Roorkee) in Electrical Engineering and his M.Tech at I.I.T. Kharagpur in Microwave & Optical Communication Engineering in 1987. He is also Coordinator of the M.Tech Program of G.B. Technical University and a Member of the Board of Studies (BOS) of G.B. Technical University for revision of the syllabus for Electronics & Communication and Instrumentation & Control Engineering.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27
2011
SIP0301-1
A Comparative Analysis of ECG Data Compression Techniques Sugandha Agarwal
Amity School of Engineering and Technology
Amity University Uttar Pradesh,
Lucknow [email protected]
Abstract  Computerized electrocardiogram (ECG), electroencephalogram (EEG), and magnetoencephalogram (MEG) processing systems have been widely used in clinical practice, and they are capable of recording and processing long records of biomedical signals. The need to send electrocardiogram records over telephone lines for remote analysis is increasing, and so the need for effective electrocardiogram compression techniques is great. The aim of any biomedical signal compression scheme is to minimize the storage space without losing any clinically significant information, which can be achieved by eliminating redundancies in the signal in a reasonable manner. Algorithms that produce better compression ratios with less loss of data are needed. Various data compression techniques have been proposed for reducing the digital ECG volume for storage and transmission. Because of the diverse procedures that have been employed, comparison of ECG compression methods is a major problem. The main purpose of this paper is to review various ECG compression algorithms and determine which is most efficient. ECG data compression techniques are broadly divided into two major groups: direct data compression and transformation methods. Direct data reduction techniques include the turning point, AZTEC, CORTES, DPCM and entropy coding, Fan and SAPA, peak-picking, and cycle-to-cycle compression methods. The transformation methods include the Fourier, cosine, and Karhunen-Loeve (K-L) transforms. The paper concludes with a comparison of some important data compression techniques. From a comparison of ECG compression techniques such as turning point, AZTEC, CORTES, FFT, and DCT, it was found that the DCT is the most suitable compression technique, with a compression ratio of about 100:1.
Keywords: ECG Compression;
I. INTRODUCTION
An electrocardiogram (ECG or EKG) is a graphic representation of the heart's electrical activity, formed as the cardiac cells depolarize and repolarize. Electrical impulses in the heart originate in the sinoatrial node and travel through the heart muscle, where they impart the electrical initiation of systole, or contraction of the heart. The electrical waves can be measured at selectively placed electrodes (electrical contacts) on the skin. Electrodes on different sides of the heart measure the activity of different parts of the heart muscle. An ECG displays the voltage between pairs of these electrodes, and the muscle activity that they measure, from different directions.
A typical ECG cycle is defined by the various features (P, Q, R, S, and T) of the electrical wave, as shown in Figure 1. The P wave marks the activation of the atria, the chambers of the heart that receive blood from the body. Next in the ECG cycle comes the QRS complex, which represents the activation of the left ventricle, which sends oxygen-rich blood to the body, and the right ventricle, which sends oxygen-deficient blood to the lungs. During the QRS complex, which lasts about 80 ms, the atria prepare for the next beat, and the ventricles relax in the long T wave [1,2]. It is these features of the ECG signal that a cardiologist uses to analyze the health of the heart and note various disorders.
Figure 1. A typical representation of the ECG waves.
Digital analysis of the electrocardiogram (ECG) signal imposes a practical requirement that the digitized data be selectively compressed to minimize analysis effort and data storage space. Therefore, it is desirable to carry out data reduction or data compression. The main goal of any compression technique is to achieve maximum data volume reduction while preserving the significant signal features upon reconstruction. Conceptually, data compression is the process of detecting and eliminating redundancies in a given data set. Shannon defined redundancy as "that fraction of a message or datum which is unnecessary and hence repetitive in the sense that if it were missing the message would still be essentially complete, or at least could be completed." ECG data compression is broadly classified into two major groups: direct data compression and transformation methods. Direct data compression methods base their detection of redundancies on direct analysis of the actual signal samples, whereas transformation methods utilize spectral and energy distribution analysis for detecting redundancies [2,7]. Data compression is achieved by discarding digitized samples that are not important for subsequent pattern analysis and rhythm interpretation. Examples of such data compression algorithms are AZTEC and the turning point (TP) algorithm. AZTEC retains only the samples for which there is sufficient amplitude change; TP retains points where the signal curves (such as at the QRS peak) and discards every alternate sample [6]. These data reduction algorithms are empirically designed to achieve good reduction without causing significant distortion error.
II. SYSTEM DESCRIPTION
The acquired signal is fed to an instrumentation amplifier. The amplifier sets the gain and brings the very low amplitude ECG signal into a perceptible range; the acquisition of a pure ECG signal is of high importance. Since the ECG signal is in the range of millivolts, it is difficult to analyze, so the prior requirement is to amplify the acquired signal. The amplified output is then fed to an analog-to-digital converter (ADC), which digitizes the ECG data under the control of a microcontroller. The microcontroller sets the clocks for picking up the summation of the signals generated by the heart, which produces different signals at various nodes [3]. The summation of these signals is taken and then sent for filtering. The digital output of the ECG is then displayed on an LCD.

Figure 2. Basic block diagram of the ECG module.

After the filtering process, the signal is ready for transmission, but it is important to compress it so that it can be transmitted at a faster rate, as shown in Figure 2.
III. COMPRESSION TECHNIQUES
Data compression techniques are categorized as those in which the compressed data can be reconstructed to form the original signal exactly, and those in which higher compression ratios are achieved by introducing some error into the reconstructed signal. The effectiveness of an ECG compression technique is described in terms of the compression ratio (CR), the ratio of the size of the original data to that of the compressed data; the execution time, i.e., the computer processing time required for compression and reconstruction of the ECG data; and a measure of error, often the percent root-mean-square difference (PRD) [5]. The PRD is calculated as

PRD = 100 * sqrt( sum_i (ORG(i) - REC(i))^2 / sum_i ORG(i)^2 )

where ORG is the original signal and REC is the reconstructed signal. The lower the PRD, the closer the reconstructed signal is to the original ECG data [10,11]. The various compression techniques (the AZTEC, TP, CORTES, DFT, and FFT algorithms) are compared in terms of PRD and compression ratio, and the most suitable one is identified.
The amplitude zone time epoch coding (AZTEC) algorithm converts the original ECG data into horizontal lines (plateaus) and slopes [4]. Slopes are formed when the length of a plateau is less than three; the information saved for a slope is its length and its final amplitude. The turning point (TP) technique always produces a 2:1 compression ratio. It accomplishes this by replacing every three data points with the two that best represent the slope of the original three points. The coordinate reduction time encoding system (CORTES) combines the high compression ratios of AZTEC with the high accuracy of the TP algorithm.
A. Direct Data Compression Methods
1. Turning Point Algorithm
1) Acquire the ECG signal.
2) Take three consecutive samples x0, x1, x2 and evaluate the sign of (x1 - x0) * (x2 - x1).
3) If (x1 - x0) * (x2 - x1) < 0 (a turning point), x1 is stored; otherwise x2 is stored.
4) Reconstruct the compressed signal.
The compression ratio of the turning point algorithm is 2:1. If higher compression is required, the same algorithm can be applied to the already compressed signal so that it is further compressed to a ratio of 4:1 [7, 13]. However, after the second compression the required data in the signal may be lost, since samples are overlapped on one another; therefore the TP algorithm is in practice limited to a compression ratio of 2:1. As shown in Figure 3, the turning point method is basically an adaptive down-sampling method developed especially for ECGs; it reduces the sampling frequency of an ECG signal by a factor of two.
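The TP steps above can be sketched as follows; this is an illustrative implementation, not a clinical one, and the sample signal is invented:

```python
def turning_point(x):
    """Turning point (TP) compression sketch: 2:1 adaptive downsampling
    that keeps local extrema. Illustrative, not a clinical implementation."""
    out = [x[0]]                      # keep the first (reference) sample
    i = 0
    while i + 2 < len(x):
        x0, x1, x2 = x[i], x[i + 1], x[i + 2]
        if (x1 - x0) * (x2 - x1) < 0:
            out.append(x1)            # turning point: the slope changed sign
        else:
            out.append(x2)            # monotone stretch: keep the later sample
        i += 2
    return out

print(turning_point([0, 1, 2, 1, 0, 1, 2, 3]))  # [0, 2, 0, 2]
```

Each pair of input samples yields one output sample, giving the 2:1 ratio described above.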
Figure 3. Turning point compression analysis.
2. AZTEC Algorithm
Another commonly used technique is known as AZTEC (Amplitude Zone Time Epoch Coding). It converts the ECG waveform into plateaus (flat line segments) and sloping lines. As there may be two consecutive plateaus at different heights, the reconstructed waveform shows discontinuities. Even though AZTEC provides a high data reduction ratio, the fidelity of the reconstructed signal is not acceptable to the cardiologist because of the discontinuity (step-like quantization) that occurs in the reconstructed ECG waveform [12,13], as shown in Figure 4. The AZTEC algorithm is implemented in two phases:
2.1. Horizontal Mode
1) Acquire the ECG signal.
2) Assign the first sample to Vmax and Vmin, which represent the highest and lowest elevations of the current line.
3) For each new sample Xi, if Xi > Vmax then Vmax = Xi, and if Xi < Vmin then Vmin = Xi. Repeat until one of two stopping conditions is satisfied: the difference between Vmax and Vmin exceeds a predetermined threshold, or the line length exceeds 50 samples.
4) The values stored for the plateau are its length L = S - 1, where S is the number of samples, and its average amplitude (Vmax + Vmin)/2.
5) The algorithm then starts again, assigning the next sample to Vmax and Vmin.
2.2. Slope Mode
1) If the number of samples in a line is <= 3, the line parameters are not saved; instead the algorithm begins to produce slopes.
2) The direction of a slope is determined by the following conditions:
a) If (X2 - X1) * (X1 - X0) is positive, the slope is positive.
b) If (X2 - X1) * (X1 - X0) is negative, the slope is negative.
3) A slope is terminated if the number of samples is >= 3 and the direction of the slope changes.
Figure 4. AZTEC compression analysis.
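The horizontal (plateau) mode described above can be sketched as follows. This is a simplified illustration that omits slope mode; the exact handling of the thresholds is an assumption based on the listed steps:

```python
def aztec_plateaus(x, vth, max_len=50):
    """Simplified AZTEC horizontal mode (slope mode omitted): emit
    (length, amplitude) line segments. Illustrative sketch of the steps
    above, not a complete AZTEC codec."""
    lines = []
    start = 0
    vmax = vmin = x[0]
    for i in range(1, len(x)):
        nmax, nmin = max(vmax, x[i]), min(vmin, x[i])
        # Close the current plateau if adding x[i] would exceed the
        # amplitude threshold, or if the line has reached max_len samples.
        if nmax - nmin > vth or i - start >= max_len:
            lines.append((i - start, (vmax + vmin) / 2))
            start = i
            vmax = vmin = x[i]
        else:
            vmax, vmin = nmax, nmin
    lines.append((len(x) - start, (vmax + vmin) / 2))
    return lines

print(aztec_plateaus([0, 0, 0, 5, 5, 5], vth=1))  # [(3, 0.0), (3, 5.0)]
```

The abrupt jump between the two emitted plateaus illustrates the step-like discontinuity discussed above.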
3. CORTES Algorithm
An enhanced method known as CORTES (Coordinate Reduction Time Encoding System) applies TP to some portions of the waveform and AZTEC to others, and does not suffer from discontinuities. If an AZTEC line is longer than a line-length threshold Lth, CORTES saves the AZTEC line; otherwise it saves the TP data, as shown in Figure 5.
1) Acquire the ECG signal.
2) Define Vth and Lth.
3) Find the current maximum and minimum.
4) If the sample spread is greater than the threshold Vth, compare the line length with Lth.
5) If len > Lth, apply AZTEC; else apply TP.
6) Plot the compressed signal.
Figure 5. CORTES compression analysis.
B. Transformation Methods
1. FFT Compression
1) Separate the ECG record into its three components x, y, z.
2) Find the frequency and the time between two samples.
3) Compute the FFT of the ECG signal; count the FFT coefficients equal to zero before compression (counter A), and set to zero (Index = 0) every coefficient lying between +25 and -25.
4) Count the FFT coefficients equal to zero after compression (counter B).
5) Compute the inverse FFT and plot the decompressed signal and the error.
6) Calculate the compression ratio and PRD, as shown in Figure 6.
Figure 6. FFT compression analysis.
2. DCT Compression
1) Separate the ECG record into its three components x, y, z.
2) Find the frequency and the time between two samples.
3) Compute the DCT of the ECG signal; count the DCT coefficients equal to zero before compression (counter A), and set to zero (Index = 0) every coefficient lying between +0.22 and -0.22.
4) Count the DCT coefficients equal to zero after compression (counter B).
5) Compute the inverse DCT and plot the decompressed signal and the error.
6) Calculate the compression ratio and PRD, as shown in Figure 7.
Figure 7. DCT compression analysis.
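The DCT thresholding and the PRD measure can be sketched together as follows. The 0.05 threshold and the synthetic test signal are assumptions for illustration, not the paper's values, and the compression-ratio estimate simply counts surviving coefficients:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: dct_matrix(n) @ dct_matrix(n).T == I."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    m[0, :] /= np.sqrt(2.0)
    return m

def dct_compress(sig, thresh):
    m = dct_matrix(len(sig))
    c = m @ sig                       # forward DCT
    c[np.abs(c) < thresh] = 0.0       # discard small coefficients (step 3)
    rec = m.T @ c                     # inverse DCT (step 5)
    cr = len(sig) / max(np.count_nonzero(c), 1)   # crude CR estimate
    prd = 100 * np.sqrt(np.sum((sig - rec) ** 2) / np.sum(sig ** 2))
    return rec, cr, prd

t = np.linspace(0.0, 1.0, 256)
sig = np.sin(2 * np.pi * 5 * t) + 0.1 * np.sin(2 * np.pi * 40 * t)
rec, cr, prd = dct_compress(sig, 0.05)
print(cr, prd)   # high CR at low PRD for this smooth test signal
```

Because the DCT concentrates the energy of smooth quasi-periodic signals in few coefficients, aggressive thresholding costs little reconstruction error, which is why the DCT leads the comparison table below.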
IV. SUMMARY
The comparison in Table 1 summarizes the resultant compression techniques and gives the choice to select the most suitable compression method. From the table we conclude that the DCT, with a compression ratio of 90.43 and a PRD of 0.93, is the most efficient algorithm for ECG data compression.

Table 1. Comparison of compression techniques.

METHOD          COMPRESSION RATIO   PRD
CORTES          4.8                 3.75
TURNING POINT   5                   3.20
AZTEC           10.37               2.42
FFT             89.57               1.16
DCT             90.43               0.93

Graph showing compression ratio and PRD.
CONCLUSION
Compression techniques have been around for many years. However, there is still a continual need for the advancement of algorithms adapted for ECG data compression. The necessity of better ECG data compression methods is even greater today than just a few years ago, for several reasons: the quantity of ECG records is increasing by the millions each year, and previous records cannot be deleted, since one of the most important uses of ECG data is in the comparison of records obtained over a long period of time. ECG data compression techniques are limited by the amount of time required for compression and reconstruction, the noise embedded in the raw ECG signal, and the need for accurate reconstruction of the P, Q, R, S, and T waves.
In this paper the author has tried to unify various data compression techniques used for ECG data compression [8,9]. The results of this research will likely provide an improvement on existing compression techniques.
REFERENCES
[1] G. Held, Data Compression: Techniques and Applications, Hardware and Software Considerations, John Wiley & Sons Ltd., 1987.
[2] T. J. Lynch, Data Compression: Techniques and Applications, Van Nostrand Reinhold Company, 1985.
[3] D. C. Reddy, Biomedical Signal Processing: Principles and Techniques, Tata McGraw-Hill, third reprint, 2007, pp. 254-300.
[4] P. Abenstein and W. J. Tompkins, "New Data Reduction Algorithm for Real-Time ECG Analysis."
[5] H. A. M. Al-Nashash, "ECG data compression using adaptive Fourier coefficients estimation," Med. Eng. Phys., vol. 16, pp. 62-67, 1994.
[6] B. R. S. Reddy and I. S. N. Murthy, "ECG data compression using Fourier descriptors," IEEE Trans. Biomed. Eng., vol. BME-33, pp. 428-433, 1986.
[7] V. Kumar, S. C. Saxena, and V. K. Giri, "Direct data compression of ECG signal for telemedicine," ICSS, vol. 10, pp. 45-63, 2006.
[8] Jalaleddine, C. Hutchens, R. Stratan, and W. A. Coberly, "ECG data compression techniques: a unified approach," IEEE Trans. Biomed. Eng., vol. 37, pp. 329-343, 1990.
[9] Trans. Biomed. Eng., vol. 15, pp. 128-129, 1968.
[10] P. S. Hamilton, "Compression of the ambulatory ECG by average beat subtraction and residual differencing," IEEE Trans. Biomed. Eng., vol. 38, no. 3, pp. 253-259, 1991.
[11] K. Grauer, A Practical Guide to ECG Interpretation, Mosby-Year Book, Inc., 1992.
[12] J. R. Cox, F. M. Nolle, H. A. Fozzard, and G. C. Oliver, "AZTEC, a preprocessing program for real-time ECG rhythm analysis," IEEE Trans. Biomed. Eng., vol. BME-15, pp. 128-129, 1968.
[13] J. L. Simmlow, Biosignal and Biomedical Image Processing: MATLAB-Based Applications, pp. 4-29.
[14] N. S. Jayant and P. Noll, Digital Coding of Waveforms, Englewood Cliffs, NJ: Prentice-Hall, 1984.
Biologically Inspired Cryptanalysis: A Review
Ashutosh Mishra*, Dr. Harsh Vikram Singh**, S. P. Gangwar**
*Student (M.Tech), KNIT Sultanpur; **Asst. Prof., Dept. of Electronics Engineering, KNIT Sultanpur
Abstract:- Data security, to ensure authorized access to information and fast delivery to a variety of end users with guaranteed Quality of Service (QoS), is an important topic of current relevance. In data security, cryptology is introduced to guarantee the safety of data; it is divided into cryptography and cryptanalysis. Cryptography is a technique to conceal information by means of encryption and decryption, while cryptanalysis is used to break the encrypted information using various methods. Biologically inspired techniques (BIT) take ideas from biology for use in cryptography. BIT is a field that has been widely used in many computer applications such as pattern recognition, computer and network security, and optimization. Some examples of BIT approaches are the genetic algorithm (GA), ant colony optimization, and artificial neural networks (ANN). GA and ant colony optimization have been successfully applied in the cryptanalysis of classical ciphers. This paper therefore reviews these techniques and explores the potential of using BIT in cryptanalysis.
Keywords: Cryptanalysis, Genetic Algorithm, Artificial Neural Network, Ant Colony.
1 Introduction
Many cryptographic algorithms (ciphers) have been developed for information security purposes, such as the Data Encryption Standard (DES), the Advanced Encryption Standard (AES), and Rivest-Shamir-Adleman (RSA); these are examples of modern ciphers. The foundation of these algorithms, especially block ciphers, is mainly based on the concepts of classical ciphers, such as substitution and transposition. For instance, DES uses only three simple operators, namely substitution, permutation (transposition), and bit-wise exclusive-OR (XOR) [2]. BIT is a field that has caught the interest of many researchers, and the ability of BIT approaches in various fields has been proven. Clark [6] hopes that those who do research in BIT, especially related to ants, swarms, and artificial neural networks, will examine the application of those techniques in cryptology. He also states that a good place to start is classical cipher cryptanalysis or Boolean function design. This paper is organized as follows: first, we review the simple substitution cipher, the columnar transposition cipher, and the permutation cipher, which are types of classical cipher, in Section 2. In Section 3, some biologically inspired techniques are explained, and the use of these approaches in cryptanalysis is reviewed in Section 4. Finally, conclusions are given in Section 5.
2 Classical Ciphers
Classical ciphers are often divided into substitution ciphers and transposition ciphers, of which there are many types. In this paper we focus on the simple substitution cipher and on two types of transposition cipher, namely the columnar transposition cipher and the permutation cipher. These ciphers are vulnerable to ciphertext-only attacks using frequency analysis.
Basically, a simple substitution cipher is a technique of replacing each character with another character. The mapping function for replacing the characters is represented by the key used. For the purposes of this study, white space is ignored, while other special characters such as commas and apostrophes are removed. Example 1 shows a simple substitution cipher:
Alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Key: M N F Q Y A J G R Z K B H S L C I V U D O W T E P X

Example 1
Plain text: KAMLA NEHRU INSTITUTE OF TECHNOLOGY
Cipher text: KMHBM SYGVO RSUDRDODY LA DYFGSLBLJP
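The mapping in Example 1 can be reproduced with a short sketch (assuming, as in the text, that white space and special characters are dropped):

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
KEY      = "MNFQYAJGRZKBHSLCIVUDOWTEPX"   # the key from Example 1

def substitute(plaintext, key=KEY):
    """Simple substitution cipher from Example 1; white space and other
    non-alphabetic characters are dropped, as in the text."""
    table = dict(zip(ALPHABET, key))
    return "".join(table[ch] for ch in plaintext.upper() if ch in table)

print(substitute("KAMLA NEHRU INSTITUTE OF TECHNOLOGY"))
# KMHBMSYGVORSUDRDODYLADYFGSLBLJP
```

Decryption is the same operation with the inverse mapping, i.e. `dict(zip(KEY, ALPHABET))`.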
The idea of a transposition cipher is to move each character to another position. In the columnar transposition cipher, the plaintext is written into a table with a fixed number of columns. The number of columns equals the length of the key, and the key represents the order of the columns that will form the cipher text. We consider only the 26 characters of the alphabet, so all special characters are removed. For example, the plaintext "KAMLA NEHRU INSTITUTE OF TECHNOLOGY" with the key "4726135" is transformed into cipher text by inserting it into a table, as shown in Example 2.

4 7 2 6 1 3 5
K A M L A N E
H R U I N S T
I T U T E O F
T E C H N O L
O G Y P Q R S

Example 2
Four dummy letters (here P, Q, R, and S) are added to complete the rectangle, and the cipher text can be written in groups of five characters [4]. The cipher text of this cipher is "KHITO ARTEG MUUCY LITHP ANENQ NSOOR ETFLS".
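A columnar transposition encryptor can be sketched as follows. Note that this sketch reads the columns in ascending order of the key digits, which is a common convention, whereas the worked example above emits the columns in a different order; the padding letters are taken from the example.

```python
def columnar_encrypt(plaintext, key, pad="PQRS"):
    """Columnar transposition sketch. Columns are read in ascending order
    of the key digits (a common convention; the worked example in the
    text emits the columns in a different order)."""
    text = "".join(ch for ch in plaintext.upper() if ch.isalpha())
    ncols = len(key)
    pad_iter = iter(pad * ncols)      # dummy letters, as in Example 2
    while len(text) % ncols:
        text += next(pad_iter)
    rows = [text[i:i + ncols] for i in range(0, len(text), ncols)]
    order = sorted(range(ncols), key=lambda c: key[c])
    return "".join(row[c] for c in order for row in rows)

print(columnar_encrypt("KAMLA NEHRU INSTITUTE OF TECHNOLOGY", "4726135"))
# ANENQMUUCYNSOORKHITOETFLSLITHPARTEG
```

Either convention preserves the same security properties, since both are fixed permutations of the columns.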
The permutation cipher operates by rearranging the characters of the plaintext block by block based on a key. The size of the block is the same as the length of the key, and the cipher text can also be written in groups of five characters. Using the same plaintext and key as in the previous example, the cipher text of the permutation cipher is produced as depicted in Example 3:

Plain text order:  1 2 3 4 5 6 7
Cipher text order: 4 7 2 6 1 3 5

Example 3
Plain text: KAMLANE HRUINST ITUTEOF TECHNOL GYPQRSX
Cipher text: LEANK MAITR SHUNT FTOIU EHLEO TCNQX YSGPR (P, Q, R, S, and X are dummy letters)
Both the simple substitution cipher and the transposition ciphers share the same disadvantage with regard to the frequency of characters. In Example 1, the character K is replaced with K, A with M, and so forth. Therefore, the frequency of each character in the plaintext will be exactly the same as the frequency of its corresponding cipher text character. Hence, the encryption algorithm preserves the character frequencies of the plaintext in the cipher text, because it merely replaces one character with another. Still, the frequency of characters depends on the length of the text, and some characters may not even be used in the plaintext; in the example above, P, Q, and R are characters that do not appear in the plaintext. Therefore, many researchers use frequency analysis for the cryptanalysis of the simple substitution cipher. Analyses are done using the frequencies of single characters (unigrams), double characters (bigrams), triple characters (trigrams), and so on (n-grams). The technique used to compare candidate keys for the simple substitution cipher is to compare the n-gram frequencies of the cipher text with those of the language of the text. In attacking the transposition cipher, the multiple anagramming attack can be used. The cipher text is written into a table in which the number of columns represents the length of the key. For the columnar cipher, the cipher text is written into the table column by column from left to right, while for the permutation cipher the cipher text is written row by row from top to bottom. After that, the columns are rearranged to form readable plaintext in every row.
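The n-gram counting used in frequency analysis can be sketched as:

```python
from collections import Counter

def ngram_counts(text, n):
    """Count n-grams of a text, ignoring non-alphabetic characters; the
    raw material for the frequency analysis described above."""
    text = "".join(ch for ch in text.upper() if ch.isalpha())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

cipher = "KMHBM SYGVO RSUDRDODY LA DYFGSLBLJP"   # cipher text of Example 1
print(ngram_counts(cipher, 1).most_common(3))
```

Comparing these counts with published English letter frequencies suggests candidate mappings; here the most frequent cipher letter, D, indeed corresponds under the Example 1 key to the common plaintext letter T.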
3 Biologically Inspired Techniques
BIT comprises methods that take ideas from biology for use in computing, relying heavily on the fields of biology, computer science, and mathematics. Some BIT approaches are the GA, artificial neural networks (ANN), DNA computing, cellular automata, ant colony optimization, particle swarm optimization, and membrane computing. Four of these techniques, namely GA, ant colony optimization, ANN, and cellular automata, are described later in this section.
3.1 Genetic Algorithm
The genetic algorithm (GA) is a technique used to optimize a searching process; it was introduced by Holland in 1975 [5] and is based on natural selection in the biological sciences [7]. There are several processes in a GA, namely selection, mating, and mutation. At the beginning of the cycle, a random population is created as the first generation; the elements that make up the population are potential solutions to the problem, represented by strings. Then, pairs of strings are selected based on a criterion called a fitness function. These pairs, known as parents, are mated to produce children. The children are then mutated according to a mutation rate, because not all children are mutated. After the mutation process, a new population is formed (the next generation). The cycle continues until some stopping condition is met, such as a maximum number of generations. This algorithm has been successfully applied in the cryptanalysis of classical and modern ciphers such as the simple substitution, polyalphabetic, and transposition ciphers, the knapsack cipher, rotor machines, RSA, and TEA. We further explore the usage of this algorithm in cryptanalysis in Section 4.
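The GA cycle described above (random initial population, fitness-based selection, mating, mutation, stopping condition) can be sketched as follows. The fitness function here is a toy stand-in: a real cryptanalytic GA would instead score candidate keys by comparing decrypted-text n-gram statistics with language statistics.

```python
import random

TARGET = "GENETIC"                    # toy goal string (illustration only)
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def fitness(s):
    # Number of positions matching the target string.
    return sum(a == b for a, b in zip(s, TARGET))

def evolve(pop_size=50, generations=200, mutation_rate=0.1, seed=1):
    rng = random.Random(seed)
    pop = ["".join(rng.choice(LETTERS) for _ in TARGET)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == len(TARGET):
            break                                  # stopping condition met
        parents = pop[: pop_size // 2]             # truncation selection
        children = pop[:2]                         # elitism: keep the best two
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)        # mating pair
            cut = rng.randrange(1, len(TARGET))    # one-point crossover
            child = [rng.choice(LETTERS) if rng.random() < mutation_rate
                     else ch for ch in p1[:cut] + p2[cut:]]
            children.append("".join(child))
        pop = children
    return max(pop, key=fitness)

print(evolve())
```

The same loop structure carries over to cipher attacks: only the string encoding (a candidate key) and the fitness function change.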
3.2 Ant Colony Optimization
Ant colony optimization is inspired by the pheromone-trail laying and following behavior of real ants, which use pheromones as a communication medium. This approach was proposed for solving hard combinatorial optimization problems [9]. An important aspect of ant colonies is that the collective action of many ants results in the location of the shortest path between a food source and a nest. The standard ant colony optimization (ACO) algorithm contains a probabilistic transition rule, goodness evaluation, and pheromone updating [6]. In cryptanalysis, the ACO algorithm has been applied to breaking transposition ciphers and block ciphers. The cryptanalysis of the transposition cipher published in [6] is reviewed in Section 4 of this paper.
3.3 Artificial Immune Systems
Artificial Immune Systems (AIS) can be defined as
computational systems inspired by theoretical immunology,
observed immune functions, principles and mechanisms, in
order to solve problems [8]. AIS can be divided into
population-based algorithms, such as the negative selection and
clonal selection algorithms, and network-based algorithms, such
as continuous and discrete immune networks. AIS have been
applied to a wide variety of application areas such as pattern
recognition and classification, optimization, data analysis,
computer security and robotics [8]. Hart and Timmis
categorized these application areas and some others into three
major categories, namely learning, anomaly detection and
optimization. In optimization, most of the published papers are
based on the application of the clonal selection principle, using
algorithms such as Clonalg, opt-AINET and the B-cell algorithm.
De Castro and Von Zuben [8] proposed a computational
implementation of the clonal selection algorithm (it is now
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0303-4
called Clonalg). The authors compared their algorithm's
performance with GA for multi-modal optimization and argued
that their algorithm was capable of detecting a high number of
sub-optimal solutions, including the global optimum of the
function being optimized. De Castro [8] extended this work by
using an immune network metaphor for multi-modal
optimization. Clonal selection has also been used in the
optimization of dynamic functions, with the results compared
against the evolution strategies (ES) algorithm. The comparison,
based on time and performance, shows that clonal selection is
better than ES on small-dimension problems; in higher
dimensions, however, ES outperformed clonal selection in both
time and performance. Other authors have applied Clonalg to a
scheduling problem, under the name clonal selection algorithm
for examination timetabling (CSAET). The research shows that
CSAET is successful in solving problems related to scheduling;
in the comparison performed between CSAET, GA and a
memetic algorithm, CSAET produced output of quality as good
as those algorithms. Therefore, the literature shows that these
immune-inspired algorithms are capable of producing good
results in various fields, especially optimization. It is hoped that
they will also find their way into cryptanalysis.
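The clonal selection principle can be illustrated with a minimal Clonalg-style sketch on a toy one-dimensional function. The cloning counts, mutation schedule and elitist reselection here are simplified assumptions for illustration, not the published algorithm:

```python
import random

def clonalg(affinity, init, n=20, clones=5, generations=50, seed=1):
    """Clonalg-style sketch: clone the highest-affinity cells and
    hypermutate the clones, with lower-ranked (weaker) parents mutating
    more strongly; keep the best cells of parents plus clones."""
    rng = random.Random(seed)
    pop = [init(rng) for _ in range(n)]
    for _ in range(generations):
        pop.sort(key=affinity, reverse=True)
        new = []
        for rank, cell in enumerate(pop[:n // 2]):
            for _ in range(clones):
                step = 0.1 * (rank + 1)     # better cells mutate less
                new.append(cell + rng.gauss(0.0, step))
        pop = sorted(pop + new, key=affinity, reverse=True)[:n]
    return pop[0]

# Toy problem: maximise affinity -(x - 3)^2, i.e. find x near 3.
best = clonalg(lambda x: -(x - 3.0) ** 2, lambda rng: rng.uniform(-10, 10))
```

For cryptanalysis, the real-valued cell would be replaced by a candidate key and the affinity by an n-gram-based fitness of the trial decryption.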
3.4 Cellular Automata
A cellular automaton is a decentralized computing model
providing an excellent platform for performing complex
computation with the help of only local information. Nandi et
al. [3] presented an elegant low-cost scheme for CA-based cipher
system design. Both block ciphering and stream ciphering
strategies designed with programmable cellular automata
(PCA) have been reported. Recently, an improved version of
the cipher system has been proposed.
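As a toy illustration of CA-based stream ciphering (a simple elementary-CA keystream generator, not Nandi et al.'s PCA scheme), a rule-30 automaton can supply a keystream whose XOR with the plaintext is its own inverse:

```python
def ca_step(cells, rule=30):
    """One synchronous update of an elementary CA with periodic boundary:
    each cell's next state is looked up from the rule number using the
    (left, centre, right) neighbourhood as a 3-bit index."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1
                      | cells[(i + 1) % n])) & 1 for i in range(n)]

def keystream(seed_cells, nbits, tap=None):
    """Collect one bit per CA step from a fixed (centre) cell."""
    tap = len(seed_cells) // 2 if tap is None else tap
    cells, out = list(seed_cells), []
    for _ in range(nbits):
        cells = ca_step(cells)
        out.append(cells[tap])
    return out

# Toy stream cipher: XOR the keystream with the plaintext bits; applying
# the same keystream again recovers the plaintext.
ks = keystream([0] * 7 + [1] + [0] * 7, 16)
pt = [1, 0, 1, 1, 0, 0, 1, 0] * 2
ct = [p ^ k for p, k in zip(pt, ks)]
```

Rule 30 is the classic example of a chaotic elementary CA; practical CA ciphers use richer, programmable rule configurations, as in the PCA designs cited above.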
4 BIT in cryptanalysis
Classical ciphers have been successfully attacked using various
metaheuristic techniques. A metaheuristic is a heuristic method
for solving a very general class of computational problems;
such techniques are therefore commonly used in combinatorial
optimization problems. Some of the metaheuristic techniques that
were successfully applied in the cryptanalysis of classical
ciphers are genetic algorithms, simulated annealing, tabu search,
ant colony optimization and hill climbing. In this paper, we
will review BIT techniques that have been successfully
applied in cryptanalysis of classical ciphers (simple
substitution and transposition ciphers). Spillman et al.
published their paper on the cryptanalysis of the simple
substitution cipher using a genetic algorithm in 1993. The paper
is an early work done by using GA in cryptanalysis and it is a
good choice for re-implementation and comparison [4]. In [4],
the authors review some idea about genetic algorithm before
they show the steps on how the algorithm is applied in the
cryptanalysis. The aim of the attack is to find the possible key
values based on frequency of characters in the cipher text. The
key is sorted from the most frequent to the least frequent
characters in the English language. In the selection process,
pairs of keys (parents) are randomly selected from the
population (contains a set of keys that is randomly generated
for the first generation) based on fitness function. The fitness
function compares unigram and bigram frequencies characters
in the known language with the corresponding frequencies in
the cipher text. Keys with higher fitness value have more
chance of being selected. Mating is done by combining each
of the pairs of parents to produce a pair of children. The
children are formed by comparing every element (character) in
each pair of parents. After that, one character in the key can be
changed to a randomly selected character, based on a
mutation rate, in the mutation process. The selection, mating
and mutation processes continue until a stopping criterion is
met. Another paper published in 1993 utilizing a genetic
algorithm in cryptanalysis was by Matthews; this paper,
however, focuses on the transposition cipher. The attack is
known as GENALYST. It finds the correct key length and the
correct permutation of the key of a transposition cipher.
Matthews uses a list of ten bigrams and trigrams that have been
given weight values to calculate the fitness. For instance, the
trigrams 'THE' and 'AND' are given a score of '+5', while the
bigrams 'HE' and 'IN' are given a score of '+1'. Matthews also
gives a score of '-5' to the trigram 'EEE': although 'E' is very
common in English, a sequence of three 'E's is very uncommon
in normal English text. Candidates with higher fitness values
have more chance of being selected. After the selection process,
mating is performed using a position-based crossover method.
Then the mutation process is applied. Two mutation types are
possible: first, randomly swap two elements; second, shift all
elements forward by a random number of places. The experiment
used a population size of 20, 25 generations and crossover
decreasing from 8.0 to 0.5. The results show that GENALYST is
successful in breaking the cipher with key lengths of 7 and 9.
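The weighted n-gram fitness can be sketched as follows. Only the five example weights quoted above are used, whereas Matthews' actual list contains ten entries:

```python
# Example weights quoted in the text; Matthews' full list has ten entries.
WEIGHTS = {"THE": 5, "AND": 5, "HE": 1, "IN": 1, "EEE": -5}

def ngram_fitness(text, weights=WEIGHTS):
    """Score a candidate decryption by counting weighted occurrences of
    the listed bigrams and trigrams (overlapping matches included)."""
    score = 0
    for gram, w in weights.items():
        count = sum(1 for i in range(len(text) - len(gram) + 1)
                    if text[i:i + len(gram)] == gram)
        score += w * count
    return score
```

A candidate key whose trial decryption contains many common n-grams (and no improbable runs such as 'EEE') thus scores higher and is more likely to be selected as a parent.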
Ant colony optimization has also been successfully
implemented in the cryptanalysis of the transposition cipher, as
published in [6]. The paper uses a specific ant algorithm, the
Ant Colony System (ACS), with known success on the
Traveling Salesman Problem (TSP), to break the cipher. The
authors used a bigram adjacency score, Adj(I,J), defined as the
average probability of the bigrams created by juxtaposing
columns I and J; the score is higher for two correctly
aligned columns. In addition, they used a dictionary
heuristic, Dict(M), for the recognition of plaintext. The authors
also compared the results produced by ACS with those of
previous metaheuristic attacks on the transposition cipher,
which involve differing heuristics, processing time and success
criteria. The comparison shows that the ACS algorithm can
decrypt cryptograms which are significantly shorter than other
methods can handle, due to the use of the dictionary heuristic
in addition to bigrams.
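A plausible form of the Adj(I,J) score is the average frequency of the bigrams formed row by row when column I is placed to the left of column J. The tiny frequency table below is purely illustrative; a real attack would estimate a full bigram table from a reference corpus:

```python
# Illustrative English-like bigram frequencies (assumed values, not from
# the reviewed paper); a real attack uses a full corpus-derived table.
BIGRAM_FREQ = {"TH": 0.027, "HE": 0.023, "IN": 0.020, "ER": 0.018}

def adjacency_score(col_i, col_j, freq=BIGRAM_FREQ):
    """Average bigram probability obtained by juxtaposing column I with
    column J row by row; higher when the columns are correctly aligned."""
    pairs = [a + b for a, b in zip(col_i, col_j)]
    return sum(freq.get(p, 0.0) for p in pairs) / len(pairs)
```

In the ACS formulation, this score plays the role of the heuristic value eta on the "edge" from column I to column J when the ants build a candidate column ordering.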
5 Conclusion
This paper reviews works on the cryptanalysis of classical
ciphers using BIT approaches. The classical ciphers involved
are the simple substitution and transposition ciphers, while GA
and ant colony optimization are the techniques used. GA has
been applied to both ciphers, but only the transposition cipher
was found to have been attacked using ant colony optimization.
The immune-inspired clonal selection approach reviewed in
Section 3.3 also appears promising for cryptanalysis, given its
ability to solve optimization problems; its application in
cryptanalysis should therefore be further studied.
References
[1] RSA. Wikipedia. http://en.wikipedia.org/wiki/RSA.
[2] A. Menezes, P. van Oorschot, and S. Vanstone. Handbook
of Applied Cryptography. CRC Press, New York, NY, 1997.
[3] S. Nandi, B. K. Kar, and P. Pal Chaudhuri. Theory and
applications of cellular automata in cryptography. IEEE
Transactions on Computers, 43(12):1346–1357,1994.
[4] Lin, Feng-Tse, & Kao, Cheng-Yan. (1995). A genetic
algorithm for ciphertext-only attack in cryptanalysis. In IEEE
International Conference on Systems, Man and Cybernetics,
1995, (pp. 650-654, vol. 1).
[5] Holland, J. H. (1975). Adaptation in natural and artificial
systems. Ann Arbor: The University of Michigan Press.
[6] Clark, J. A. (2003). Invited Paper: Nature-Inspired
Cryptography: Past, Present and Future. IEEE Conference on
Evolutionary Computation 2003. Special Session on
Evolutionary Computation and Computer Security. Canberra.
[7]Goldberg, D., (1989) Genetic Algorithms in Search,
Optimization, and Machine Learning. Reading MA: Addison-
Wesley.
[8] de Castro, L. N. (2002). Immune, Swarm and Evolutionary
Algorithms Part I: Basic Models. International Conference on
Neural Information Processing Vol. 3 pp 1464-1468.
[9] S.N. Sivanandam and S.N. Deepa, "Introduction to Genetic
Algorithms", Springer-Verlag Berlin Heidelberg, 2008.
[10] Xu Xiangyang, The block cipher for construction of S-
boxes based on particle swarm optimization, 2nd International
Conference on Networking and Digital Society (ICNDS),
2010 , Page(s): 612 - 615
[11] Uddin, M.F.; Youssef, A.M., "Cryptanalysis of Simple
Substitution Ciphers Using Particle Swarm Optimization",
IEEE Congress on Evolutionary Computation, 2006, Page(s):
677 - 680
[12] Mohammad Faisal Uddin; Amr M. Youssef , An
Artificial Life Technique for the Cryptanalysis of Simple
Substitution Ciphers , Canadian Conference on Electrical and
Computer Engineering, 2006, Page(s): 1582 - 1585
[13] Khan, S.; Shahzad, W.; Khan, F.A. , Cryptanalysis of
Four-Rounded DES Using Ant Colony Optimization
,International Conference on Information Science and
Applications (ICISA), 2010 , Page(s): 1 - 7
[14] Ghnaim, W.A.-E.; Ghali, N.I.; Hassanien, A.E., Known-
ciphertext cryptanalysis approach for the Data Encryption
Standard technique, International Conference on Computer
Information Systems and Industrial Management Applications
(CISIM), 2010 , Page(s): 600 - 603
[14] AbdulHalim, M.F.; Attea, B.A.; Hameed, S.M., A binary
Particle Swarm Optimization for attacking knapsacks Cipher
Algorithm ,International Conference on Computer and
Communication Engineering ,2008. Page(s): 77 - 81
[15] Schmidt, T.; Rahnama, H.; Sadeghian, A. , A review of
applications of artificial neural networks in cryptosystems ,
Automation Congress, 2008. WAC 2008. World , Page(s): 1 –
6
[16] Godhavari, T.; Alamelu, N.R.; Soundararajan,
R.,Cryptography Using Neural Network ,INDICON, 2005
Annual IEEE , Page(s): 258 - 261
[17] R. Spillman, M. Janssen, B. Nelson, and M. Kepner. Use
of a genetic algorithm in the cryptanalysis of simple
substitution ciphers. Cryptologia, 1993,17(1):31–44.
[18] Diffie, W. and Hellman, M. (1976). New Directions in
Cryptography. IEEE Transactions on Information Theory,
22(6): 644-654.
[19] Tarek Tadros, Abd El Fatah Hegazy, and Amr Badr
,Genetic Algorithm for DES Cryptanalysis,IJCSNS
International Journal of Computer Science and Network
Security, VOL.10 No.5, May 2010
[20]Forrest, S., Perelson, A. S. Allen, L. and Cherukuri, R.
(1994). Self-nonself Discrimination in A Computer.
Proceedings of IEEE Symposium on Research in Security and
Privacy, Los Alamos, CA. IEEE Computer Society Press.
[21] Stallings, W. (2003). Cryptography and Network
Security: Principles and Practices, 3rd Edition. Upper Saddle
River, New Jersey: Prentice Hall.
[22] Spillman, R. (1993). Cryptanalysis of Knapsack Ciphers
Using Genetic Algorithms. Cryptologia, XVII(4):367-377.
[23] Clark, J.A. (2003). Nature-Inspired Cryptography: Past,
Present and Future. In Proceedings of Conference on
Evolutionary Computation, 8-12 December. Canberra,
Australia.
[24] Clark, A. (1998). Optimization Heuristics for Cryptology.
Ph.D. Dissertation, Faculty of Information Technology,
Queensland University of Technology, Australia.
[25] Bagnall, A.J. (1996). The Applications of Genetic
Algorithms in Cryptanalysis. M.Sc. Thesis. School of
Information System, University of East Anglia.
[26] Dimovski, A., Gligoroski, D. (2003). Attack on the
Polyalphabetic Substitution Cipher Using a Parellel Genetic
Algorithm. Technical Report, Swiss-Macedonian Scientific
Cooperation through SCOPES Project, March 2003, Ohrid,
Macedonia.
[27] Dimovski, A., Gligoroski, D. (2003). Attacks on
Transposition Cipher Using Optimization Heuristics. In
Proceedings of ICEST 2003, October, Sofia, Bulgaria.
[28] Morelli, R.A. and Walde, R.E. (2003). A Word-Based
Genetic Algorithm for Cryptanalysis of Short Cryptograms.
Proceedings of the 2003 Florida Artificial Intelligence
Research Symposium (FLAIRS – 2003), pp. 229-233.
[29] Morelli, R.A., Walde, R.E., Servos, W. (2004). A Study
of Heuristic Search Algorithms for Breaking Short
Cryptograms. International Journal of Artificial Intelligence
Tools (IJAIT), Vol. 13, No. 1, pp. 45-64, World Scientific
Publishing Company.
[30] Servos, W. (2004). Using Genetic Algorithm to Break
Alberti Cipher. Journal of Computing Science in Colleges,
Vol. 19(5): 294-295.
[31] Hernandez, J.C., Sierra, J.M., Isasi, P., Ribagorda, A.
(2002). Genetic Cryptanalysis of Two Rounds TEA. ICCS
2002, LNCS 2331, 1024 – 1031, Springer-Verlag Berlin
Heidelberg.
[32] Ali, H. and Al-Salami, M. (2004). Timing Attack
Prospect for RSA Cryptanalysis Using Genetic Algorithm
Technique. The International Arab Journal of Information
Technology, 1(1).
[33] Millan, W., Clark, A. and Dawson, E. (1997). Smart Hill
Climbing Finds Better Boolean Functions. Proceedings of. 4th
Annual Workshop on Selected Areas in Cryptography, Aug.
11-12, SAC 1997.
[34] Millan, W., Clark, A. and Dawson, E. (1998). Heuristic
Design of Cryptographically Strong Balanced Boolean
Functions. Advances in Cryptology – EUROCRYPT ‟98,
LNCS 1403, 489-499, Springer-Verlag, Berlin Heidelberg.
[35] Dimovski, A., Gligoroski, D. (2003). Generating Highly
NonLinear Boolean Functions Using a Genetic Algorithm. In
Proceedings of the 1st Balkan Conference on Informatics,
November, Thessaloniki, Greece.
EYE BASED CURSOR MOVEMENT USING EEG IN
BRAIN COMPUTER INTERFACE
Tariq S Khan(#), Mudassir Ali(#), Omar Farooq(#), Yusuf U Khan(*)
(#) Department of Electronics Engineering, Zakir Husain College of Engineering & Technology
(*) Department of Electrical Engineering, Zakir Husain College of Engineering & Technology
Aligarh Muslim University, Aligarh
Abstract— The aim of this study is to
detect eye movement (left to right) from
Electroencephalograph (EEG) signal.
Four EEG electrodes in the frontal
area were used. Statistical features
were extracted from the four frontal
channels. These features were then
fed into a classifier based on the linear
discriminator function. The most
prominent features for the classification
of left and right movements were
identified. These features were then
interfaced with computer so that cursor
movement can be controlled. Electrodes
are placed along the scalp following the
10-20 International System of Electrode
Placement. Recorded data was filtered,
windowed and analysed in order to
extract features. Four different classifiers
were used. The best results were
obtained with the support vector
machine (SVM) and linear classifiers,
each of which gave an average
accuracy of 90%.
Keywords: BCI, Eye movement, EEG.
I. INTRODUCTION
A brain-computer interface (BCI)
provides an alternative communication
channel between the human brain and a
computer by using pattern recognition
methods to convert brain waves into control
signals. Patients who suffer from severe
motor impairments (severe cerebral palsy,
head trauma and spinal injuries) may use
such a BCI system as an alternative form of
communication by mental activity [1]. Using
improved measurement devices, computer
power, and software, multidisciplinary
research teams in medicine,
psychophysiology, medical engineering, and
information technology are investigating and
realizing new noninvasive methods to
monitor and even control human physical
functions.
In the bigger picture, there could be devices
that would allow severely disabled people to
function independently. For a quadriplegic,
something as basic as controlling a computer
cursor via mental commands would
represent a revolutionary improvement in
quality of life. With an EEG or implant in
place, the subject would visualize closing his
or her eyes or moving eyes from left to right
and vice versa [2]. The software can learn
eye movement through training, using
repeated trials. Subsequently, the classifier
may be used to instruct the closing/opening
of the eye. A similar method is used to
manipulate a computer cursor, with the
subject thinking about forward, left, right
and back movements of the cursor [3]. With
enough practice, users can gain enough
control over a cursor to draw a circle, access
computer programs and control a television.
It could theoretically be expanded to allow
users to "type" with their thoughts. This can
be achieved by controlling cursor movement
on a computer screen through EEG signals
from brain, specifically, generated due to
eye movement. The signals can be analysed
by different methods.
Traditional analysis methods, such as the
Fourier Transform and autoregressive
modelling are not suitable for non-stationary
signals. Recently, wavelets have been used
in numerous applications for a variety of
purposes in various fields. It is a logical way
to represent and analyse a non-stationary
signal with variable sized region windows
and to provide local information. In the
Fourier Transform (FT), the time
information is lost and in short Term Fourier
Transform (STFT) there is limited time
frequency resolution. Even though basic
filters can be used for decomposition of
desired bands, ideal filters are never realised
in practice, which results in aliasing effects.
However, wavelet analysis enables perfect
decomposition of the desired bands, which
helps us to obtain better features [4].
In this paper different features are used
for training the classifier for eye movement
in left and right directions. A time-frequency
analysis was applied to the EEG signals
from different channels, to determine
combination of features and channels that
yielded the best classification performance.
II. BACKGROUND RESEARCH
EEG waves are created by the firing of
neurons in the brain and were first measured
by Vladimir Pravdich-Neminsky, who
measured the electrical activity in the brains
of dogs in 1912, although the term he used
was "electrocerebrogram". Ten years later
Hans Berger became the first to measure
EEG waves in humans and, in addition to
giving them their modern name, began what
would become intense research in utilizing
these electrical measurements in the fields of
neuroscience and psychology.
The term "Brain-Computer Interface" first
appeared in the scientific literature in the
1970s, though the idea of hooking the mind
up to computers was nothing new [5].
Current systems are "open loop" and respond
to the user's thoughts only. The aim is to
develop "closed loop" systems that can give
feedback to the user as well.
In order to meet the requirements of the
growing expansion of the technology, some
kind of standardization was required, not
only for the guidance of future researchers
but also for the validation and checking of
new developments against other systems.
Thus a general-purpose system called
BCI2000 was developed, which made the
analysis of brain signal recordings easy by
defining output formats and operating
protocols that facilitate researchers in
developing any type of application. This
made it easier to extract specific features of
brain activity and translate them into device
control signals [7].
III. OUR METHODOLOGY
The procedure in this study was first to
acquire EEG data. The stored data were then
pre-processed to remove artifacts.
Subsequently, features were extracted from
the clean EEG and used for classification.
This methodology is shown in Fig. 1.
Fig. 1: Block diagram for feature extraction and device
control of eye movement (Data acquisition → Data
processing → Feature extraction → Classification →
Device/application control)
A. Experimental Setup and Data Acquisition
The subject was seated on a wooden armchair
with the legs rested on a wooden footrest
(wooden items should be used so as to
reduce interference) and the eyes closed. The
subject was instructed to avoid speaking and
body movement in order to ensure a relaxed
body. EEG data were recorded using a Brain
Tech Clarity™ system [10], with the
electrodes positioned according to the
standard 10-20 system, in the Biomedical
Signal Processing lab, AMU Aligarh.
To ensure the same rate of eye movement in
both directions, a ball was shown on the
screen and the subject was asked to visually
follow the ball. The movement of ball was
set to 60 pixels per second. A series of trials
were recorded.
The subject was instructed to open the eyes
slowly and then to follow the movement of
the ball in the program, on a prompt from the
experimenter. Eye movement was recorded
for two different directions, i.e. left to right
and right to left. A block diagram of the
experimental procedure is shown in Fig. 2.
Fig. 2: Sequence followed during experimental recording
(Relax → left-to-right movement → Relax → right-to-left
movement)
B. Data Processing
26 channels of EEG were recorded. Since
mainly the frontal lobe is involved in eye
movement, only the channels associated with
the frontal lobe, i.e. FP1-F3, FP1-F7, FP2-F4
and FP2-F8, were analysed. The signal values
of these channels were extracted in ASCII
form using the BrainTech software. The EEG
of the frontal lobe channels for subject 1 is
illustrated in Fig. 3.
Fig. 3: Plot of channels associated with frontal lobe
The 50 Hz power supply often causes
interference in the EEG recording. Fig. 4
shows a plot of the PSD of the EEG record of
the FP1-F3 channel. To eliminate these
spikes, the signal was passed through an
Infinite Impulse Response (IIR) notch filter
before analysis.
Fig. 4: Power Spectral Density of FP1-F3
A second-order IIR notch filter with a
quality factor (Q factor) of 3.91 was used
to remove the undesired frequency
components.
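Such a notch filter can be sketched in plain Python, assuming the 256 Hz sampling rate implied by the 256-sample, 1 s frames described below. The constrained-pole design here is a standard textbook construction, not necessarily the exact filter used in the study:

```python
import math

def notch_coeffs(f0, fs, q):
    """Second-order IIR notch: zeros on the unit circle at +/- f0, poles
    just inside (radius set by the bandwidth f0/q), unity gain at DC."""
    w0 = 2.0 * math.pi * f0 / fs
    r = 1.0 - (w0 / q) / 2.0                  # pole radius
    c = math.cos(w0)
    g = (1.0 - 2.0 * r * c + r * r) / (2.0 - 2.0 * c)   # normalise DC gain
    return [g, -2.0 * g * c, g], [1.0, -2.0 * r * c, r * r]

def iir_filter(b, a, x):
    """Direct form II transposed difference equation."""
    y, z1, z2 = [], 0.0, 0.0
    for v in x:
        out = b[0] * v + z1
        z1 = b[1] * v - a[1] * out + z2
        z2 = b[2] * v - a[2] * out
        y.append(out)
    return y

FS, F0, Q = 256.0, 50.0, 3.91   # sampling rate assumed; F0, Q from the text
b, a = notch_coeffs(F0, FS, Q)

# A pure 50 Hz "mains" sinusoid should be almost entirely removed.
mains = [math.sin(2.0 * math.pi * F0 * n / FS) for n in range(512)]
filtered = iir_filter(b, a, mains)
```

Because the zeros sit exactly on the unit circle at 50 Hz, the steady-state response to mains interference is zero, while the unity DC gain leaves slow EEG components untouched.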
The signals of the four channels after artifact
removal, stacked over one another, are shown
in Fig. 5.
Fig. 5: Plot of the filtered frontal-lobe channels
EEG is by nature a non-stationary signal, so
it was fragmented into frames over which it
can be assumed stationary. The EEG data
were divided into frames of 1 s duration, i.e.
a frame size of 256 samples.
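The framing step above can be sketched as:

```python
def frame_signal(samples, frame_len=256):
    """Split a channel into consecutive 1 s frames of 256 samples
    (matching the frame size stated in the text); any trailing partial
    frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

frames = frame_signal(list(range(600)))   # 600 samples -> 2 full frames
```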
C. Feature extraction
Feature extraction is the process of
discarding irrelevant information to the
extent possible and representing the relevant
data in a compact and meaningful form. Two
eye movements were recorded: right to left
(RTL) and left to right (LTR). Standard
statistical parameters such as mean, variance,
skewness and cross-correlation were calculated
for all the channels in each movement type.
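These four statistics can be computed per frame as follows. This is a plain-Python sketch; the exact normalisations used in the study are not specified, so population moments and the zero-lag normalised cross-correlation are assumed:

```python
import math

def mean(x):
    return sum(x) / len(x)

def variance(x):
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def skewness(x):
    m, sd = mean(x), math.sqrt(variance(x))
    return sum(((v - m) / sd) ** 3 for v in x) / len(x)

def cross_correlation(x, y):
    """Zero-lag normalised cross-correlation between two channels."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def frame_features(frame, reference):
    """One feature vector per frame: mean, variance, skewness and the
    cross-correlation with a second (reference) channel's frame."""
    return [mean(frame), variance(frame), skewness(frame),
            cross_correlation(frame, reference)]
```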
D. Classification
The following classifiers were used to classify
the two eye movements:
SVM: a non-probabilistic binary linear
classifier.
Linear: fits a multivariate normal density to
each group, with a pooled estimate of
covariance.
Diaglinear: similar to 'linear', but with a
diagonal covariance matrix estimate (a naive
Bayes classifier).
Quadratic: fits multivariate normal densities
with covariance estimates stratified by
group.
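A minimal sketch of the 'diaglinear' variant described above, i.e. per-class means with a single pooled diagonal covariance (the study presumably used a standard toolbox implementation rather than hand-written code):

```python
def fit_diaglinear(X, y):
    """'Diaglinear' discriminant: per-class feature means plus a pooled,
    diagonal covariance estimate shared by all classes."""
    classes = sorted(set(y))
    d = len(X[0])
    mu = {c: [sum(x[j] for x, t in zip(X, y) if t == c) / y.count(c)
              for j in range(d)] for c in classes}
    # Pooled per-feature variances (floored to avoid division by zero).
    var = [max(sum((x[j] - mu[t][j]) ** 2 for x, t in zip(X, y))
               / (len(X) - len(classes)), 1e-9) for j in range(d)]
    return classes, mu, var

def predict(model, x):
    """Assign x to the class with the highest diagonal-Gaussian score."""
    classes, mu, var = model
    score = lambda c: -0.5 * sum((x[j] - mu[c][j]) ** 2 / var[j]
                                 for j in range(len(x)))
    return max(classes, key=score)
```

Training on 15 labelled feature frames per movement and calling `predict` on the 5 held-out frames reproduces the evaluation protocol described in the results section.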
E. Cursor Control
A program was written that controls the
cursor movement according to the instruction
given. The program is calibrated so that
cursor movement is invoked by these
instructions instead of by the mouse: the
classified eye movement is interfaced to the
same instructions as mouse movement and
thereby controls the movement of the cursor
[9].
IV. RESULTS AND DISCUSSIONS
For each frame of EEG, four features were
calculated, namely variance, mean, skewness
and cross-correlation. The separability
provided by each feature was individually
tested, and the best three features were
subsequently used as input to the classifier.
Four classifiers were used in this work; their
results are illustrated in Table 1. For each
movement, LTR and RTL, 20 seconds (20
frames) of data were collected; of these, 15
frames were used for training and the
remaining 5 for testing.
Table 1: Percentage accuracy of classification for eye
movements
Classifier RTL LTR
SVM 80 100
Linear 80 100
Quad 60 40
Diaglinear 80 60
From the observations in Table 1 it can be
seen that the linear and SVM classifiers give
the best results, with high classification
accuracy for both eye movements.
Fig. 6: Plot of the classifier in signal space
A linear classifier separating the two eye
movements is shown in Fig. 6.
Fig. 7: Variance plot of FP2-F4
Fig. 7, which shows the variance for channel
FP2-F4, clearly indicates that the variance of
LTR is greater than that of RTL most of the
time. Variance reflects the concentration of
the probability density function about the
mean.
V. CONCLUSIONS
EEG data were investigated for two eye
movements using a 4-channel setup on three
subjects. Features, including the variance,
were extracted for both movements. A linear
classifier was used to classify between the
two eye movements. These algorithms can
provide high classification accuracy only
after training for a few sessions. In this work
an accuracy of 90% was achieved in
classifying the two movements (RTL and
LTR).
ACKNOWLEDGEMENT
The authors are indebted to the UGC. This work is a part of the funded major research project C.F. No 32-14/2006(SR)
REFERENCES
1. The "10-20 System of Electrode Placement‖ http://faculty.washington.edu/chudler/1020.html
2. Y. U. Khan,(2010) ‘Imagined wrist movement classification in
single trial EEG for brain computer interface using wavelet
packet‘, Int. J. Biomedical Engineering and Technology, Vol. 4, No. 2, pp169-180.
3. Daniel, J. Szafir (2009-10) ‗Non-Invasive BCI through EEG ―An Exploration of the Utilization of electroencephalography to Create
Thought-Based Brain-Computer Interfaces‖.
4. Wolpaw, J.R., Birbaumer, N., McFarland, D.J., Pfurtscheller, G., Vaughan, T.M. (2002): Brain–computer interfaces for communication and control. Clinical Neurophys. pp767–791
5. Y. U. Khan and O. Farooq(2009), ―Autoregressive features based classification for seizure detection using neural network in scalp
Electroencephalogram‖, International Journal of Biomedical
Engineering and Technology, vol.2, no. 4, pp. 370-381.
6. J. Vidal(1973) "Toward Direct Brain–Computer Communication." Annual Review of Biophysics and Bioengineering. Vol. 2, pp. 157-
180
7. Syed M.Siddique, Laraib Hassan Siddique (2009): EEG based Brain computer Interface: Journal of software, vol.4, no.6, pp.550-
555
8. EEG Channels in Detecting Wrist Movement Direction Intention:
Proceedings of the 2004 IEEE Conference on Cybernetics and Intelligent Systems
9. Fabiani, Georg E. et al. (2004). Conversion of EEG activity into cursor movement by a brain-computer interface.
<http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.128.5914>
10. Clarity Braintech system, Standard edition, Software version 3.4, Hardware version 1.4, Clarity Medical Private Limited
AN INTERNET BASED INTELLIGENT TELEDIAGNOSIS SYSTEM FOR ARRHYTHMIA
K.A. Sunitha(1), N. Senthil Kumar(2), K. Prema(3), Sandeep Kotikalapudi(4)
(1),(3) Assistant Professor, Instrumentation and Control Engineering Department, SRM University
(2) Professor, Mepco Schlenk Engineering College, Sivakasi
(4) Student, Instrumentation and Control Engineering Department, SRM University
Abstract—Due to changing trends, there is an
increasing risk of people developing cardiac
disorders. This is the impetus for developing
a system which can diagnose the cardiac
disorder together with the risk level of the
patient, so that effective medication can be
taken in the initial stages. This paper enables
comprehensive diagnosis of the patient
without the doctor being in the same
geographical location, which will prove
advantageous for implementation in villages
where doctors are not easily accessible. In
this paper, the atrial rate, ventricular rate,
QRS width and PR interval are extracted
from the ECG signal, so that the arrhythmia
disorders sinus tachycardia (ST), supra-
ventricular tachycardia (SVT), ventricular
tachycardia (VT), junctional tachycardia
(JT), and ventricular and atrial fibrillation
(VF & AF) are diagnosed with their
respective risk levels. The system thus acts as
a risk analyzer, which tells how prone the
subject is to arrhythmia. LabVIEW
SignalExpress is used to read the ECG, and
for analysis this information is passed to the
fuzzy module. In the fuzzy module, various
"if-then rules" have been framed to identify
the risk level of the patient. The extracted
information is then published from the server
to the client by using an online publishing
tool. After the report developed by the
system is passed to the doctor, he or she can
pass medical advice back to the server, i.e.
the system where the patient's ECG is
extracted and analyzed.
Index Terms–LabVIEW, arrhythmia (sinus
tachycardia (ST), supra-ventricular tachycardia
(SVT), ventricular tachycardia (VT), junctional
tachycardia (JT), ventricular and atrial
fibrillation (VF & AF)), online publishing tool,
QRS width, atrial rate, ventricular rate
I. INTRODUCTION
According to the World Health Organization
(WHO), heart disease and stroke kill around 17 million people a year, which is almost one-third of all deaths globally. By 2020, heart disease and stroke will have become the leading cause of both death and disability worldwide. So it is very clear that proper diagnosis of heart disease is important for patients' survival. The electrocardiogram (ECG) is an important tool for the diagnosis of heart diseases, but it has some drawbacks:
1) Special skill is required to administer and interpret the results of an ECG.
2) The cost of ECG equipment is high.
3) Limited availability of ECG equipment.
Due to these drawbacks, telemedicine contacts were in the past mostly used for consultations between special telemedicine centres in hospitals and clinics. More recently, however, providers have begun to experiment with telemedicine contacts between health care providers and patients at home, to monitor conditions such as chronic diseases [1].
LabVIEW (Laboratory Virtual Instrument Engineering Workbench) is a graphical programming environment suited to high-level or system-level design. A LabVIEW-based telemedicine system has been shown to have the following features:
1) It replaces multiple stand-alone devices at the cost of a single instrument using virtual instrumentation, and its functionality is expandable [2].
2) It facilitates the extraction of valuable diagnostic information using embedded advanced biomedical signal processing algorithms [2].
3) It can be connected to the internet to create an internet-based telemedicine infrastructure, which provides a comfortable way for physicians to communicate with friends, family and colleagues [3].
Several systems have been developed for the acquisition and analysis of ECG using LabVIEW [4]-[8]. Some systems [5], [7], [8] also deal with identifying the cardiac disorder, but they lack identification of the patient's risk level and an online publishing facility.
In this paper, we developed a program not only to access the patient's data but also to diagnose heart abnormalities, which can serve as a reference for the doctor or physician in deciding further procedures. This can be done from anywhere an internet connection is available. In addition, a fuzzy system is developed to identify the risk level of the patient. A fuzzy system is more accurate than a conventional crisp controller because, instead of a condition being only true or false, a partially true case can also be expressed. The risk scores can thus be calculated accurately and exactly for specific records of a person.
II. PROPOSED SYSTEM
Figure 1 shows the proposed fuzzy analyser with the online system.
Fig 1. Proposed system
The ECG waveforms are obtained from the MIT-BIH database. LabVIEW Signal Express is used to read and analyse the ECG and pass the information to the fuzzy module, in which various if-then rules identify the risk level of the patient. The extracted information is then published from the server to the client using online publishing tools. After the information extracted from the ECG signal (atrial rate, ventricular rate, QRS width and PR interval) is passed from the patient's system to the doctor's system, the doctor can return medical advice to the server, i.e. the system where the patient's ECG is extracted and analysed.
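The parameter-extraction step can be sketched in plain Python. The thresholding R-peak detector below is a simplified stand-in for the Signal Express analysis described above; the threshold ratio and refractory period are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def ventricular_rate(ecg, fs, threshold_ratio=0.6, refractory_s=0.25):
    """Estimate ventricular (heart) rate in beats/min from a single-lead ECG.

    Very simplified R-peak detector: local maxima above a fraction of the
    signal maximum, with a refractory period between detected beats.
    """
    threshold = threshold_ratio * np.max(ecg)
    refractory = int(refractory_s * fs)
    peaks, last = [], -refractory
    for i in range(1, len(ecg) - 1):
        if ecg[i] > threshold and ecg[i] >= ecg[i - 1] and ecg[i] > ecg[i + 1]:
            if i - last >= refractory:
                peaks.append(i)
                last = i
    if len(peaks) < 2:
        return 0.0
    rr = np.diff(peaks) / fs        # RR intervals in seconds
    return 60.0 / np.mean(rr)       # beats per minute

# Synthetic check: impulses every 0.8 s at fs = 250 Hz -> 75 bpm
fs = 250
ecg = np.zeros(10 * fs)
ecg[::int(0.8 * fs)] = 1.0          # impulse "R peaks" every 0.8 s
print(round(ventricular_rate(ecg, fs)))   # 75
```

A real detector (e.g. Pan-Tompkins style band-pass and derivative stages) would be needed for noisy clinical signals; this sketch only shows where the ventricular-rate input to the fuzzy module comes from.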
A. Internet based System:
The internet is used as a two-way vehicle to deliver the virtual medical instruments, the medical data and the prescription from the doctor in real time. An internet-based telemedicine system is shown in Fig. 2. This work involves an internet-based telemonitoring system developed as an instance of the general client-server architecture presented in Fig. 2.
The client-server architecture is defined as follows: the client application provides visualization, archiving, transmission, and contact facilities to the remote user (i.e., the patient). The server, which is
located at the physician's end, takes care of the incoming data and organizes patient sessions.
Fig.2 Internet based system
B. LABVIEW
LabVIEW is a graphical programming language developed by National Instruments. Programming with LabVIEW gives a vivid picture of data flow through its graphical block representation. LabVIEW is used here for acquiring the ECG waveform and for analysing parameters such as the PR interval, QRS width and heart rates, which are later passed to the fuzzy system. LabVIEW offers a modular approach and parallel computing, which make it easier to develop complex systems. Debugging tools such as probes and highlight execution are handy for locating where an error actually occurred.
C. Fuzzy system
Fuzzy controllers are widely employed because they work efficiently with vague values. A fuzzy controller has a rule base in if-then fashion, which is used to identify the risk level of a disease from the rule weights. A fuzzy system in its general form is shown in Fig. 3.
Fig 3. Fuzzy system
A. Fuzzification
In this system, the atrial and ventricular heart rates, the QRS complex width and the PR interval are taken as the input linguistic variables, which are passed to the inference engine.
Based on the rule base and linguistic variables, the fuzzy system output is obtained.
B. Defuzzification
The defuzzified values are the risk levels (high, medium and low risk), obtained according to the weights of the fuzzy variables.
C. Relation between input and output variables
The relationship between the input and output variables is shown in the three-dimensional plot of Fig. 4 below.
Fig 4. Relation between input and output
D. Fuzzy Rules
In this fuzzy system, the centre-of-area method is used for defuzzification. The rule base consists of rules in if-then form. The risk levels depend on how many of the conditions for the respective cardiac disorder are met by the input variables. Since there is no single rule for identifying an arrhythmia from heart rate alone (it can differ from patient to patient), this system is more accurate in determining the arrhythmia because it is not based only on heart rate.
The fuzzy rule base acts as a database of rules for selecting the output based on the input quantities. Some of the rules are:
1. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '60,75' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'Medium Risk'
2. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '75,90' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'Medium Risk'
3. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '90,100' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'High Risk'
4. IF 'vHR' IS '150,180' AND 'QRS Width' IS 'Narrow QRS' THEN 'Ventricular Tachycardia at' IS 'Low Risk' ALSO 'Junctional Tachycardia at' IS 'Low Risk' ALSO 'Supra Ventricular Tachy at' IS 'High Risk'
5. IF 'vHR' IS '180,210' AND 'QRS Width' IS 'Normal QRS' THEN 'Ventricular Tachycardia at' IS 'Low Risk' ALSO 'Junctional Tachycardia at' IS 'High Risk' ALSO 'Supra Ventricular Tachy at' IS 'Low Risk'
In this manner, based upon the PR interval, QRS width, and atrial and ventricular heart rates, a fuzzy system is developed to identify the cardiac disorder as well as its level of risk.
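As an illustration of how such if-then rules with min for AND and centre-of-area defuzzification can be evaluated, here is a minimal Python sketch. The membership-function shapes and numeric ranges are invented for illustration; they are not the clinical values of the paper's LabVIEW fuzzy module.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Hypothetical membership functions (illustrative ranges only).
def mu_vhr_fast(v):
    return tri(v, 150.0, 180.0, 210.0)      # ventricular rate "fast", bpm

def mu_qrs_narrow(w):
    return tri(w, 0.04, 0.08, 0.12)         # QRS width "narrow", seconds

# Output universe: a risk score from 0 to 100 with a 'high risk' set.
risk = np.linspace(0.0, 100.0, 1001)
high = tri(risk, 50.0, 100.0, 101.0)

def infer(vhr, qrs_width):
    """One rule (cf. rule 4 above): IF vHR is fast AND QRS is narrow
    THEN SVT risk is High. AND is min; the implication clips the output
    set; centre-of-area gives the crisp risk score."""
    w = min(mu_vhr_fast(vhr), mu_qrs_narrow(qrs_width))
    clipped = np.minimum(high, w)
    if clipped.sum() == 0.0:
        return 0.0
    return float((risk * clipped).sum() / clipped.sum())  # centroid

print(infer(180.0, 0.08))   # rule fully fired: roughly 83 (centroid of 'high')
print(infer(60.0, 0.08))    # rule not fired: 0.0
```

A full system would aggregate several such rules (max over clipped sets) before the single centroid step; the sketch keeps one rule to show the mechanics.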
III. ONLINE PUBLISHING
One of the unique features of this system is its ability to publish the extracted information to the client, usually a doctor's computer, thereby implementing a telediagnosis system. The doctor sees the diagnosis result along with the risk levels and can then send advice back to the patient's system. Since the internet is used for passing the values to the doctor, immediate action can be taken. This caters to the needs of public health-care centres in rural areas where it is difficult to have cardiologists. The system can also be used to assist the doctor in monitoring the patient's heart during surgery.
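The information passed to the doctor's client is a small structured record. A minimal sketch of such a payload is below; the field names are illustrative assumptions, not the format actually produced by the LabVIEW web publishing tool.

```python
import json

def build_report(patient, atrial_rate, ventricular_rate, qrs_width,
                 pr_interval, diagnosis, risk):
    """Bundle the extracted ECG parameters and the fuzzy-system output
    into a JSON document that the patient-side server could publish to
    the doctor's client. Field names are hypothetical."""
    return json.dumps({
        "patient": patient,
        "atrial_rate_bpm": atrial_rate,
        "ventricular_rate_bpm": ventricular_rate,
        "qrs_width_s": qrs_width,
        "pr_interval_s": pr_interval,
        "diagnosis": diagnosis,
        "risk_level": risk,
    })

report = build_report({"name": "A. Patient", "age": 54, "sex": "F"},
                      80, 185, 0.07, 0.16,
                      "Supra-ventricular tachycardia", "High Risk")
print(json.loads(report)["risk_level"])   # High Risk
```

Serialising to a text format like this is what lets the same record be stored in the patient database and rendered on the doctor's front panel.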
IV. RESULTS
This system is able to detect the arrhythmias accurately and also publish the results online.
Fig 5. Block Diagram for extracting
ECG waveform
The block diagram in Fig. 5 performs the function of passing the HR value obtained from Signal Express to the fuzzy system.
Fig 6. Block diagram for calling the fuzzy system in LabVIEW
Figure 6 shows the block diagram of risk-level detection and how the fuzzy system is called into the main panel for diagnosis and risk-level indication.
Fig. 7 shows the front panel developed from the fuzzy system, which is sent to the doctor using the web publishing tool for a second opinion. The system also has a database to save patient details such as name, age, sex and symptoms, which can be used the next time.
Fig 7. Front panel
V. CONCLUSION
In this way, we have developed a fuzzy system, using LabVIEW, that determines cardiac disorders and their risk levels with good accuracy compared to a conventional system, taking the atrial and ventricular heart rates, QRS complex width and PR interval values as the input linguistic variables. The report is successfully sent to the doctor's system using the web publishing tool for a second opinion.
REFERENCES
[1] N. Noury and P. Pilichowski, "A telematic system tool for home health care," in Proc. 14th Annu. Int. Conf. IEEE EMBS, Paris, Oct. 1992, pp. 1175-1177.
[2] Zhenyu Guo and John C. Moulder, "An internet based telemedicine system," IEEE Transactions, 2000.
[3] Volodymyr Hrusha, Olexandr Osolinskiy, Pasquale Daponte and Domenico Grimaldi, "Distributed web-based measurement system," IEEE Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, 2005.
[4] Lina Zhang and Xinhua Jiang, "Acquisition and analysis system of the ECG signal based on LabVIEW."
[5] Kevin P. Cohen, Willis J. Tompkins, Adrianus Djohan, John G. Webster and Yu H. Hu, "QRS detection using a fuzzy neural network."
[6] "Classification of ECG arrhythmias using type-2 fuzzy clustering neural network."
[7] "Robust techniques for remote real-time arrhythmias classification system."
[8] S. Zarei Mahmoodabadi, A. Ahmadian, M. D. Abolhassani, J. Alireazie and P. Babyn, "ECG arrhythmia detection using fuzzy classifiers."
[9] E. Chowdhury and L. C. Ludeman, "Discrimination of cardiac arrhythmias using a fuzzy rule-based method."
[10] W. Zong and D. Jiang, "Automated ECG rhythm analysis using fuzzy reasoning."
[11] Jodie Usher, Duncan Campbell, Jitu Vohra and Jim Cameron, "Fuzzy classification of intra-cardiac arrhythmias."
Projected View & Novel Application of Context Based Image Retrieval Techniques
Shivam Agrawal#, Rajeev Singh Chauhan*, Vivek Vyas**
#B.Tech Student, CS Department, *B.Tech Student, CS Department, **M.Tech Student, ECE Department,
Arya College of Engineering and I.T., Kukas, Jaipur,
Rajasthan Technical University, Kota #[email protected], *[email protected] and **[email protected]
Abstract— Image searching has been one of the fascinating topics of advanced research since the 1990s. Rapid advancement in computer and network technologies, coupled with relatively cheap high-volume data storage, has brought tremendous growth in the amount of digital images, and the development of pattern recognition has grown with it. Pattern recognition is the act of taking in raw data and classifying it into predefined categories using statistical and empirical methods. Content-based image retrieval (CBIR) is one of the widely used applications of pattern recognition for finding images in vast, un-annotated image databases. In CBIR, images are indexed on the basis of low-level features, such as color, texture, and shape, which can be derived automatically from the visual content of the images. This paper discusses the techniques and algorithms used to extract these image features and the advances that can be built on CBIR. Various similarity measures are used to identify closely associated patterns: these methods compute the distance between the features generated for different patterns, and the closest patterns are returned as the result. The paper unfolds a novel application using context-based image retrieval to search for the detailed description of an image without knowing a single word about it, and proposes algorithms to create such a utility.
Keywords: Context Based Image Retrieval, Image Searching.
INTRODUCTION
The initial techniques used were based on textual annotation of the images. Using text descriptions, images can be organized in topical or semantic hierarchies to facilitate easy navigation and browsing based on standard Boolean queries. Content-based image retrieval is one of the major approaches to image retrieval and has drawn significant attention in the past decade; it uses visual content to search images in large-scale image databases according to users' interests. Low-level image features such as color, texture, shape and structure are extracted from the images, and relevant images are retrieved based on the similarity of these features. Examples of prominent systems are QBIC, Photobook, and NETRA. In this paper we discuss the different algorithms used to extract the different features of an image, the future advancement of context-based image retrieval techniques and how they can be beneficial in different fields, and futuristic approaches to attain this technique in a more advanced way.
1. Image Retrieval
A recent study of the literature on image indexing and retrieval was conducted based on 100 papers from Web of Science. Two major research approaches, text-based (description-based) and content-based, were identified: researchers in the information science community focus on the text-based approach, while researchers in computer science focus on the content-based approach. Text-based image retrieval (TBIR) makes use of text descriptors to retrieve relevant images. Some recent studies found that text descriptors such as time, location, events, objects,
formats, aboutness of image content, and topical terms are most helpful to users. The advantage of this approach was that it enabled widely approved text information retrieval systems to be used for visual retrieval.

1.1. Content-based image retrieval
In CBIR, images are indexed by features derived directly from the images themselves. These features are always consistent with the image, and they are extracted and analyzed automatically by computer processing instead of manual annotation. Due to the difficulty of automatic object recognition, the information extracted from images in CBIR is rather low level: colors, textures, shapes, structure, and combinations of the above. A number of representative generic CBIR systems have been developed over the last ten years, implemented in different environments: some are Web based while others are GUI-based applications. QBIC, Photobook, and NETRA are the most prominent examples. QBIC, developed at the IBM Almaden Research Centre [1, 2, 3], was the first commercial CBIR application and plays an important role in the evolution of CBIR systems. QBIC supports the low-level image features of average color, color histogram, color layout, texture and shape. Additionally, users can provide pictures or draw sketches as example images in a query, and visual queries can be combined with textual keyword predicates. Photobook [4], developed at the MIT Media Lab, is a set of interactive tools for searching and querying image databases based on image content. It works by comparing features associated with images, not the images themselves; these features are the parameter values of particular models fitted to each image, commonly color, texture, and shape, though Photobook will work with features from any model. It is divided into three specialized systems, namely Appearance Photobook (face images), Texture Photobook, and Shape Photobook, which can also be used in combination. Features are compared using one of a library of matching algorithms that Photobook provides.
These include Euclidean, Mahalanobis, divergence, vector-space angle, histogram, Fourier peak, and wavelet tree distances, as well as any linear combination of these. NETRA is a prototype image retrieval system developed at the University of California, Santa
Barbara (UCSB) [5, 6]. NETRA supports features of color, texture, shape, and spatial information of segmented image regions for region-based search. Images are segmented into homogeneous regions and, using the region as the basic unit, users can submit queries based on features that combine regions of multiple images. For example, a user may compose a query such as: retrieve all images that contain regions having the color of a region of image A, the texture of a region of image B, and the shape of a region of image C.

1.1.1 Image features

One of the main foci of CBIR is the means of extracting features from images and evaluating the similarity between the features. Image features refer to the characteristics which describe the contents of an image; in this paper they are confined to visual features derived directly from the image. There have been extensive studies of various sorts of visual feature. The simplest form is based directly on the pixel values of the image; however, such features are very sensitive to noise and to brightness, hue and saturation changes, and are not invariant to spatial transformations such as translation and rotation. As a result, CBIR systems based on raw pixel values do not generally give satisfactory results. Much of the research in this area has therefore emphasised computing useful characteristics from images using image processing and computer vision techniques. General-purpose features in CBIR have usually included text, color, texture, shape and layout.

Color representations

The color histogram is the standard representation of the color feature in CBIR systems, initially investigated by Swain and Ballard. Histograms of intensity values are used to represent the color distribution. This captures the global chromatic information of an image and is invariant under translation and rotation about the view axis.
Despite changes in view, changes in scale, and occlusion, the histogram changes only slightly. A color histogram H(M) of image M is a 1-D discrete function representing the probabilities of occurrence of colors in the image, typically defined as:

H(M) = [h_1, h_2, ..., h_n],  h_k = n_k / N,  k = 1, 2, 3, ..., n  [Equation 1]

where N is the number of pixels in image M and n_k is the number of pixels with color value k. The division normalizes the histogram such that:

Σ_{k=1}^{n} h_k = 1.0  [Equation 2]
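A normalized histogram as in Equations 1-2 is a few lines of NumPy; the sketch below assumes the image pixels are already quantized to integer color indices.

```python
import numpy as np

def color_histogram(img, n_colors):
    """Normalised colour histogram per Equation 1: h_k = n_k / N, where
    n_k counts the pixels whose colour index is k and N is the total
    number of pixels, so the entries sum to 1 (Equation 2)."""
    counts = np.bincount(img.ravel(), minlength=n_colors)
    return counts / img.size

# Toy 2x2 "image" whose pixels are colour indices in {0, 1, 2, 3}
img = np.array([[0, 1], [1, 3]])
h = color_histogram(img, n_colors=4)
print(h.tolist())   # [0.25, 0.5, 0.0, 0.25]
print(h.sum())      # 1.0
```

Two such histograms can then be compared with any of the distance measures mentioned earlier (Euclidean, histogram intersection, etc.).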
Texture representations
Many texture features have been investigated in the past, including the conventional pyramid-structured wavelet transform (PWT) features, tree-structured wavelet transform (TWT) features, the multi-resolution simultaneous autoregressive model (MR-SAR) features and the Gabor wavelet features. Experiments have found that the Gabor features [7, 8] produce the best performance. The computation of the Gabor features is as follows. A two-dimensional Gabor function can be formulated as:

G(x, y) = (1 / (2π σ_x σ_y)) exp[ -(1/2)(x²/σ_x² + y²/σ_y²) + 2πjWx ]  [Equation 3]

A self-similar filter dictionary can be obtained from the mother Gabor wavelet G(x, y) of Equation 3 by appropriate dilations and rotations:

G_mn(x, y) = a^(-m) G(x', y'),  a > 1; m, n integers

x' = a^(-m) [ (x - hside) cos(nπ/K) + (y - wside) sin(nπ/K) ]
y' = a^(-m) [ -(x - hside) sin(nπ/K) + (y - wside) cos(nπ/K) ]

where h is the height of the image, w its width, hside = (h-1)/2 and wside = (w-1)/2. Given an image with luminance I(x, y), the Gabor decomposition is obtained by convolving the luminance with the Gabor wavelet and taking the magnitude:

W_mn(x, y) = | I(x, y) * G_mn(x, y) |  [Equation 4]

The mean and standard deviation of the magnitude of the transform coefficients are used to represent the texture feature for classification and retrieval purposes:

μ_mn = mean of |W_mn|  [Equation 5]
σ_mn = standard deviation of |W_mn|  [Equation 6]

The Gabor feature vector is constructed using μ_mn and σ_mn as feature components:

f = [ μ_00, σ_00, μ_01, σ_01, ..., μ_(S-1)(K-1), σ_(S-1)(K-1) ]

where S is the number of scales and K is the number of orientations.

Shape representations
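A compact NumPy sketch of the Gabor feature pipeline (Equations 3-6) is below. It is a simplification of the scheme above: the rotation is written directly in kernel coordinates rather than the (hside, wside)-centred image coordinates, scale is handled by widening sigma rather than by the a^(-m) dilation, filtering is done in the frequency domain for brevity, and all numeric parameters (kernel size, sigma, W) are illustrative assumptions.

```python
import numpy as np

def gabor_kernel(size, sigma_x, sigma_y, W, theta):
    """Complex Gabor kernel per Equation 3, rotated by theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (1.0 / (2.0 * np.pi * sigma_x * sigma_y)) * np.exp(
        -0.5 * (xr**2 / sigma_x**2 + yr**2 / sigma_y**2)
        + 2j * np.pi * W * xr)

def gabor_features(img, scales, orientations):
    """mu_mn / sigma_mn of the filtered magnitude (Equations 5-6),
    concatenated into the feature vector f."""
    feats = []
    for m in range(scales):
        for n in range(orientations):
            k = gabor_kernel(15, 2.0 * (m + 1), 2.0 * (m + 1),
                             0.2 / (m + 1), n * np.pi / orientations)
            K = np.fft.fft2(k, s=img.shape)           # zero-padded kernel
            mag = np.abs(np.fft.ifft2(np.fft.fft2(img) * K))  # Equation 4
            feats += [mag.mean(), mag.std()]          # Equations 5, 6
    return np.array(feats)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
f = gabor_features(img, scales=2, orientations=4)
print(f.shape)      # (16,): 2 scales x 4 orientations x (mu, sigma)
```

The resulting vector f is what a CBIR system stores per image and compares with a distance measure at query time.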
APPLICATION BASED ON CONTEXT BASED IMAGE RETRIEVAL AND WORKING PROCEDURE
One future advancement of CBIR is to develop a platform on which a user uploads an image; the query processor computes the distance between this image and the images of the database and, according to their closeness, shows the related results. Suppose I am a newcomer to Egypt walking the streets of Cairo. I see a monument and am eager to know about it, so I capture an image of it and upload it through an application on my mobile. The application processes the query image and shows as output detailed information about that monument. A desktop or mobile application can be created for this purpose. There are many GPL and closed-licence projects on image retrieval; TinEye and GazoPa are among the most famous and effective image-search websites, each using different feature-extraction algorithms for content-based image retrieval. However, the search results provided by these websites are limited to other images: if we upload an image of some celebrity, we get other similar images of that celebrity but nothing about the person. Here we give the concept of an application which works as a combination of TinEye and Wikipedia. To achieve this, we design our web crawlers so that whenever they index an image into the database they also index the data related to that image, using the meta tags and keywords obtained from algorithms applied to the page. A page may contain many words alongside a single image, so how can we identify which words describe that image? We follow the procedure below:
(A) First, filter out all non-informative words (prepositions, articles, etc.) from the whole text, then assign priorities to the remaining words:
(I) words in the metadata receive higher priority than other words on the page;
(II) words in the top 3 or 4 lines (after filtering) receive higher priority;
(III) words frequently repeated on the page receive higher priority;
(IV) words in bold letters receive higher priority.
(B) We now have an image and the top-priority words from each page.
(C) A user uploads an image to search for related images and their description.
(D) Content-based image searching is performed to find the related images.
(E) After searching, the stored words are collected along with the related images of the query image.
(F) One more filtering step finds the exact keyword for the image: the frequency of each word is calculated across the different results.
(G) The word with the highest frequency is assigned the top priority.
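The keyword-priority step (A)(I)-(IV) can be sketched as a simple scoring function. The stopword list and the boost weights below are illustrative assumptions, not values proposed by the paper.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "on", "at", "to", "and", "is",
             "for", "with", "by", "this", "that"}   # illustrative subset

def keyword_priority(page_text, meta_words=(), bold_words=(), top_lines=3,
                     w_meta=4.0, w_top=2.0, w_bold=1.5):
    """Score candidate keywords for an image's page following steps
    (A)(I)-(IV): filter stopwords, then boost words appearing in the
    metadata, in the first few lines, or in bold; repetition counts
    via raw frequency (III)."""
    words = [w for w in re.findall(r"[a-z]+", page_text.lower())
             if w not in STOPWORDS]
    scores = Counter(words)                           # (III) frequency
    lines = page_text.lower().splitlines()
    top = set(re.findall(r"[a-z]+", " ".join(lines[:top_lines])))
    for w in scores:
        if w in (x.lower() for x in meta_words):      # (I) metadata
            scores[w] *= w_meta
        if w in top:                                  # (II) top lines
            scores[w] *= w_top
        if w in (x.lower() for x in bold_words):      # (IV) bold text
            scores[w] *= w_bold
    return scores.most_common()

page = ("The Sphinx of Giza\nA monument in Egypt.\n"
        "The Sphinx is carved from limestone.")
print(keyword_priority(page, meta_words=["sphinx"],
                       bold_words=["giza"])[0][0])    # sphinx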
(H) That word is looked up on Wikipedia, and the resulting description is shown along with the query image.
CONCLUSION & FUTURE WORK
There are many methods for feature extraction in context-based image retrieval, and additional comparison algorithms can be applied for more exact search results. Here we have discussed color, texture and shape representations in context-based image retrieval, and outlined an application of CBIR that can play a vital role for the current generation. There is much room for future advancement: many applications can be built that play a vital role in different fields, and some visual abilities are still absent from current CBIR, leaving scope for work on perceptual organization, similarity between semantic concepts, and the like.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the ARYA Development and Research Center, ACEIT, Jaipur.
REFERENCES
[1] M. Flickner, H. Sawhney and W. Niblack, "Query by image and video content: the QBIC system," IEEE Computer, September 1995.
[2] J. Hafner, H.S. Sawhney, W. Equitz, M. Flickner and W. Niblack, "Efficient color histogram indexing for quadratic form distance functions," IEEE Transactions on Pattern Analysis and Machine Intelligence 17(7) (1995) 729-736.
[3] W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, C. Faloutsos and G. Taubin, "The QBIC project: querying images by content using colors, texture and shape," in W. Niblack (ed.), SPIE Proceedings Vol. 1908, Storage and Retrieval for Image and Video Databases, 2-3 February 1993, San Jose, California (SPIE, San Jose, 1993) 3173-87.
[4] A. Pentland, R. Picard and S. Sclaroff, "Photobook: content-based manipulation of image databases," Storage and Retrieval for Image and Video Databases II, No. 2185, San Jose, CA, February 1994.
[5] W.Y. Ma and B.S. Manjunath, "NeTra: a toolbox for navigating large image databases," Multimedia Systems 7(3) (1999) 184-198.
[6] B.S. Manjunath and W.Y. Ma, "Texture features for browsing and retrieval of image data," IEEE Transactions on Pattern Analysis and Machine Intelligence 18(8) (1996) 837-842.
[7] C.C. Chen and C.C. Chen, "Filtering methods for texture discrimination," Pattern Recognition Letters 20(8) (1999) 783-790.
[8] B.S. Manjunath and W.Y. Ma, "Texture features for browsing and retrieval of image data," IEEE Transactions on Pattern Analysis and Machine Intelligence 18(8) (1996) 837-842.
Recursive Algorithm and Systolic Architecture for the Discrete Sine Transform

M.N. Murty, Department of Physics, NIST, Berhampur-761008, Orissa, India
S.S. Nayak, Department of Physics, JITM, Paralakhemundi, Orissa, India
Satyabrata Das, Department of Electronics & Communication, NIST, Berhampur-761008, Orissa, India
B. Padhy, Department of Physics, Khallikote (Auto) College, Berhampur-760001, Orissa, India
S.N. Panda, Department of Physics, Gunupur College, Gunupur, Orissa, India
Abstract - In this paper, a novel recursive algorithm and a systolic architecture for realising the discrete sine transform (DST) are presented. By using some mathematical techniques, a DST of any general length can be converted into a recursive equation. The recursive algorithm applies to sequences of arbitrary length and is appropriate for VLSI implementation.
Keywords - discrete sine transform;
discrete cosine transform; recursive;
systolic architecture
I. INTRODUCTION
The discrete sine transform (DST) was first introduced to signal processing by Jain [1], and several versions of this original DST were later developed by Kekre et al. [2], Jain [3] and Wang et al. [4]. There exist four even DSTs and four odd DSTs, according to whether they are an even or an odd transform [5]. Ever since the introduction of its first version, the different DSTs have found wide application in several areas of digital signal processing (DSP), such as image processing [1,6,7], adaptive digital filtering [8] and interpolation [9]. The performance of the DST is comparable to that of the discrete cosine transform (DCT), and it may therefore be considered a viable alternative to the DCT. Yip and Rao [10] have proven that for large sequence lengths (N ≥ 32) and a low correlation coefficient (ρ < 0.6), the DST performs even better than the DCT.
In this paper, a novel algorithm to
convert DST into a recursive form and a
systolic architecture for parallel computation
of DST are presented. The advantage of this
algorithm is its regular structure and
parallelism, which makes it suitable for
implementation using VLSI techniques.
The rest of the paper is organised as
follows. The recursive algorithm for DST is
presented in Section-II. The comparison of
our results with other research works is
presented in Section-III. The systolic
architecture for computation of DST is
presented in Section-IV. Finally, we
conclude our paper in Section-V.
II. THE PROPOSED RECURSIVE ALGORITHM
FOR DST
The DST of a sequence {x(n), n = 1, 2, 3, ..., N} can be written as

X(k) = Σ_{n=1}^{N} x(n) sin[(2n - 1)kπ / (2N)],  for k = 1, 2, 3, ..., N.   (1)

Let z = kπ/N. Then

X(k) = Σ_{n=1}^{N} x(n) [ sin(nz) cos(z/2) - cos(nz) sin(z/2) ].   (2)

A time-recursive kernel V_m for the DST is introduced as given below:

V_m sin z = Σ_{n=m}^{N} x(n) sin[(n - m + 1)z].   (3)

Splitting off the n = m term and using the identity sin[(n - m + 1)z] = 2 sin[(n - m)z] cos z - sin[(n - m - 1)z],

V_m sin z = x(m) sin z + Σ_{n=m+1}^{N} x(n) sin[(n - m + 1)z]
          = x(m) sin z + 2 cos z Σ_{n=m+1}^{N} x(n) sin[(n - m)z] - Σ_{n=m+1}^{N} x(n) sin[(n - m - 1)z]
          = x(m) sin z + 2 cos z · V_{m+1} sin z - V_{m+2} sin z.

Hence,

V_m = x(m) + 2 cos z · V_{m+1} - V_{m+2}   (4)

for m = 1, 2, ..., N (evaluated in decreasing order of m), with V_m = 0 for m > N.

The time-recursive transfer function of X(k) is obtained by multiplying (2) by sin z. Since sin(nz - z/2) sin z = sin(z/2) [ sin(nz) + sin((n - 1)z) ],

X(k) sin z = sin(z/2) Σ_{n=1}^{N} x(n) sin(nz) + sin(z/2) Σ_{n=1}^{N} x(n) sin[(n - 1)z].

Using (3), the two sums are V_1 sin z and V_2 sin z respectively (the n = 1 term of the second sum vanishes), so

X(k) = sin(kπ / (2N)) · (V_1 + V_2).   (5)
Equations (4) and (5) show that no
complex multiplication is required during
the recursive computation. Equation (5) is a
discrete time recursive transfer function of
finite duration input sequence, x(n), n = N,
N-1, …,2,1. As a consequence, X(k) is
obtained as the output of a finite impulse
response system. Fig. 1 shows the recursive
structure with the input sequence in reverse
order for the realisation of X(k).
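The recurrence (4) and the closing relation (5) can be checked numerically against direct evaluation of (1). The sketch below is a plain Python/NumPy model, not the paper's hardware realisation; it agrees with the direct sum to machine precision.

```python
import numpy as np

def dst_recursive(x):
    """DST of Equation (1) via the recurrence (4) and relation (5):
    V_m = x(m) + 2 cos z * V_{m+1} - V_{m+2}, evaluated for m = N..1,
    then X(k) = sin(z/2) * (V_1 + V_2), with z = k*pi/N."""
    N = len(x)
    X = np.empty(N)
    for k in range(1, N + 1):
        z = k * np.pi / N
        v1, v2 = 0.0, 0.0                  # V_{m+1}, V_{m+2}, zero for m > N
        for m in range(N, 0, -1):          # m = N, N-1, ..., 1
            v1, v2 = x[m - 1] + 2.0 * np.cos(z) * v1 - v2, v1
        X[k - 1] = np.sin(z / 2.0) * (v1 + v2)   # v1 = V_1, v2 = V_2
    return X

def dst_direct(x):
    """Direct evaluation of Equation (1), for reference."""
    N = len(x)
    n = np.arange(1, N + 1)
    return np.array([np.sum(x * np.sin((2 * n - 1) * k * np.pi / (2 * N)))
                     for k in range(1, N + 1)])

x = np.random.default_rng(1).random(8)
print(np.allclose(dst_recursive(x), dst_direct(x)))   # True
```

Note that each recursion step uses only one multiplication by the real constant 2 cos z, matching the claim that no complex multiplication is required.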
Figure 1. Recursive structure for computing the DST
III. COMPARISONS WITH RELATED WORKS
The proposed approach requires N
multiplications per point, and (2N-2)
additions per point for the realisation of N
length DST.
In Tables I and II, the number of
multipliers and the number of adders in the
proposed algorithm are compared with the
corresponding parameters based on the other
methods.
Table III gives the comparison of the
computation complexities of the proposed
algorithm with other algorithms found in the
related research works.
TABLE I
COMPARISON OF THE NUMBER OF MULTIPLIERS REQUIRED BY DIFFERENT ALGORITHMS
N [11] [13] [17] [19,20,23] [21] [12] [26] [22] Proposed
4 6 5 5 4 11 2 5 4 4
8 16 13 13 12 19 8 13 8 8
16 44 35 33 32 36 30 29 16 16
32 116 91 81 80 68 54 61 32 32
64 292 227 193 192 132 130 125 64 64
TABLE II
COMPARISON OF THE NUMBER OF ADDERS REQUIRED BY DIFFERENT ALGORITHMS
N [17] [13] [19,20,23] [11] [12] [21] [26] [22] Proposed
4 9 9 9 8 4 11 14 7 6
8 35 29 29 26 22 26 26 15 14
16 95 83 81 74 62 58 50 31 30
32 251 219 209 194 166 122 98 63 62
64 615 547 513 482 422 250 194 127 126
[Artwork for Figure 1: input sequence x(1), ..., x(N) applied in reverse order; delay elements Z^-1; feedback multiplier 2 cos(kπ/N); output multiplier sin(kπ/(2N)); output X(k).]
TABLE III
COMPUTATION COMPLEXITIES
Algorithm    No. of multiplications    No. of additions
Proposed algorithm N 2N-2
[13] (3/4) N log2N - N + 3 (7/4) N log2N - 2N + 3
[14,16,20,23] (1/2) N log2N (3/2) N log2N - N + 1
[15,24,25] N log2N /2 + 1 3 N log2 N / 2 -N +1
[18] (1/2) N log2N + (1/4) N-1 (3/2) N log2N + (1/2) N-2
[21] 2(N+3)(N-1) / N 2(2N-1)(N-1) / N
[22] (N+1)(N-1) / N (2N+1)(N-1) / N
[26] if N is even 2N-3 3N+2
[26] if N is odd 2(N-1) 3N+4
IV. SYSTOLIC ARCHITECTURE
The structure of the proposed linear systolic array for computation of an N-point DST is shown in Fig. 2. It consists of (N+1) locally connected processing elements (PEs), of which the first N PEs are identical. The recurrence relation given by (3) is implemented in the first N PEs, while the last PE computes the DST components. The function of each of the first N PEs is shown in Fig. 3, and that of the last PE in Fig. 4. One sample of the input data is fed to each PE, staggered by one time-step with respect to the input of the previous PE, in reverse order; i.e., the i-th input sample is fed to the (N+1-i)-th PE in the (N+1-i)-th time-step. The first output is obtained after (N+1) time-steps, and the remaining (N-1) outputs are obtained in the subsequent (N-1) time-steps. Successive sets of N-point DSTs are thus obtained every N time-steps. Each of the first N PEs comprises one multiplier and two adders, while the last PE contains one adder and one multiplier. The duration of the cycle period is T = TM + 2TA, where TM and TA are, respectively, the times taken to perform one multiplication and one addition in a PE. This architecture requires N multiplications per point and (2N-2) additions per point for the realisation of an N-point DST. The hardware- and time-complexities of the proposed systolic realisation, along with those of the existing structures [27]-[31], are listed in Table IV.
[Figure 2: The linear systolic array for the N-point DST. The input samples x(n), x(n-1), … and the constant 2cosz enter the 1st through N-th PEs; their outputs V1 and V2 feed the (N+1)-th PE [S], which delivers the output. Here 2cosz = 2cos(2πk/N), with k -> k+1 in each time-step.]
Figure 3. Function of each of the first N PEs of the linear array. Each PE receives x_in, a_in, b_in and c_in and produces

    a_out = a_in
    b_out = x_in + a_in * b_in - c_in
    c_out = b_in

where x_in is the input sample.
Figure 4. Function of the (N+1)-th PE of the linear array:

    V_out = (V1_in + V2_in) * S,   where S = sin(2πk/N),

with k = 1 for the first (N+1) time-steps, then k -> k+1 in each time-step.
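The PE update of Fig. 3 is a Goertzel-style second-order recurrence: with a_in held at 2cos(θ), iterating b_t = x_t + 2cos(θ)·b_{t-1} - b_{t-2} accumulates sine-kernel inner products. The sketch below illustrates only this recurrence with an assumed kernel Σ x(n) sin(nθ); the paper's exact kernel and output stage follow its equation (3) and Fig. 4, which are not fully reproduced in this excerpt:

```python
import math

def sine_component(x, theta):
    """Accumulate sum_{n=1..N} x[n-1]*sin(n*theta) with the PE recurrence
    b_t = x_t + 2*cos(theta)*b_{t-1} - b_{t-2} (Fig. 3), feeding the
    samples in reverse order, as the array description requires."""
    a = 2.0 * math.cos(theta)     # the constant a_in broadcast to every PE
    b_prev, b_prev2 = 0.0, 0.0    # b_{t-1} and b_{t-2}
    for xt in reversed(x):        # reverse-order feeding
        b = xt + a * b_prev - b_prev2
        b_prev2, b_prev = b_prev, b
    return math.sin(theta) * b_prev  # final sine scaling, as in the last PE

# Check against the direct sine sum.
x = [0.3, -1.2, 0.7, 2.0, -0.5, 0.9, 1.1, -0.4]
theta = 2 * math.pi * 3 / len(x)
direct = sum(xn * math.sin((n + 1) * theta) for n, xn in enumerate(x))
assert abs(sine_component(x, theta) - direct) < 1e-9
```

The last PE of the array combines two accumulator taps as (V1 + V2)·S; the sketch scales only the final tap, so it demonstrates the recurrence rather than the exact output stage.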
TABLE IV
HARDWARE- AND TIME-COMPLEXITIES OF THE PROPOSED STRUCTURE AND THE EXISTING SYSTOLIC STRUCTURES FOR THE DST / DCT

Structure            Multipliers   Adders    Cycle-Time (T)   Average Computation Time
Pan and Park [27]    N             2N        TM + TA          NT/2
Fang and Wu [28]     N/2 + 3       N + 3     TM + 2TA         NT
Chiper et al. [29]   N - 1         N + 1     TM + TA          (N-1)T/2
Meher [30]           N/2 - 1       N/2 + 9   2(TM + TA)       (N/4 - 1)T
Meher [31]           N/2 + 3       N/2 + 5   TM + TA          (N/2 - 1)T
Proposed             N             2N - 2    TM + 2TA         (N+1)T
V. CONCLUSION
In this paper, we have proposed a recursive algorithm that is well suited for parallel computation of the DST. It involves significantly fewer multipliers and adders than the existing structures. The proposed systolic architecture is parallel, simple and regular, which makes it suitable for VLSI implementation.
REFERENCES
[1] A.K. Jain, “A fast Karhunen-Loeve
transform for a class of random
processes,” IEEE Trans. Commun., vol.
COM-24, pp 1023-1029, September
1976.
[2] H.B. Kekre and J.K. Solanka,
“Comparative performance of various
trigonometric unitary transforms for
transform image coding,” Int. J.
Electron., vol. 44, pp 305-315, 1978.
[3] A.K. Jain, “A sinusoidal family of
unitary transforms,” IEEE Trans. Patt.
Anal. Machine Intell., vol. PAMI-I, pp
356-365, September 1979.
[4] Z. Wang and B. Hunt, “The discrete W
transform,” Applied Math Computat.,
vol. 16, pp 19-48, January 1985.
[5] S. Poornachandra, V. Ravichandran and
N.Kumarvel, “Mapping of discrete
cosine transform (DCT) and discrete
sine transform (DST) based on
symmetries” IETE Journal of Research,
Vol. 49, no. 1, pp 35-42, January-
February 2003.
[6] S. Cheng, “Application of the sine
transform method in time of flight
positron emission image reconstruction
algorithms,” IEEE Trans. BIOMED.
Eng., vol. BME-32, pp 185-192, March
1985.
[7] K. Rose, A. Heiman and I. Dinstein,
“DCT/DST alternate transform image
coding," Proc. GLOBECOM '87, vol. I,
pp. 426-430, November 1987.
[8] J.L. Wang and Z.Q. Ding, “Discrete
sine transform domain LMS adaptive
filtering,” Proc. Int. Conf. Acoust.,
Speech, Signal Processing, pp 260-263,
1985.
[9] Z. Wang and L. Wang, “Interpolation
using the fast discrete sine transform,”
Signal Processing, vol. 26, pp 131-137,
1992.
[10] P. Yip and K.R. Rao, “On the
computation and the effectiveness of
discrete sine transform”, Comput.
Electron., vol. 7, pp. 45-55, 1980.
[11] W.H. Chen, C.H. Smith and S.C.
Fralick, “A fast computational
algorithm for the discrete cosine
transform”, IEEE Trans.
Communicat., vol. COM-25, no. 9,
pp. 1004-1009, Sep. 1977.
[12] P. Yip and K.R. Rao, “A fast
computational algorithm for the
discrete sine transform”, IEEE Trans.
Commun., vol. COM-28, pp. 304-
307, Feb. 1980.
[13] Z. Wang, “Fast algorithms for the
discrete W transform and for the
discrete Fourier transform”, IEEE
Trans. Acoust., Speech, Signal
Processing, vol. ASSP-32, pp. 803-
816, Aug. 1984.
[14] P. Yip and K.R. Rao, “Fast
decimation-in-time algorithms for a
family of discrete sine and cosine
transforms”, Circuits, Syst., Signal
Processing, vol. 3, pp. 387-408,
1984.
[15] H.S. Hou, “A fast recursive
algorithm for computing the discrete
cosine transform”, IEEE Trans.
Acoust., Speech, Signal Processing,
vol. ASSP-35, no. 10, pp. 1455-
1461, Oct. 1987.
[16] O. Ersoy and N.C. Hu, “A unified
approach to the fast computation of
all discrete trigonometric
transforms,” in Proc. IEEE Int. Conf.
Acoust., Speech, Signal Processing,
pp. 1843-1846, 1987.
[17] H.S. Malvar, “Corrections to fast
computation of the discrete cosine
transform and the discrete hartley
transform,” IEEE Trans. Acoust.,
Speech, Signal Processing, vol. 36,
no. 4, pp. 610-612, Apr. 1988.
[18] P. Yip and K.R. Rao, “The
decimation-in-frequency algorithms
for a family of discrete sine and
cosine transforms”, Circuits, Syst.,
Signal Processing, vol. 7, no. 1, pp.
3-19, 1988.
[19] A. Gupta and K.R. Rao, “A fast
recursive algorithm for the discrete
sine transform” IEEE Transactions
on Acoustics, Speech and Signal
Processing, vol. 38, no. 3, pp. 553-
557, March, 1990.
[20] Z. Cvetković and M.V. Popović,
“New fast recursive algorithms for
the computation of discrete cosine
and sine transforms”, IEEE Trans.
Signal Processing, vol. 40, no. 8, pp.
2083-2086, Aug. 1992.
[21] J. Canaris, "A VLSI architecture for
the real time computation of discrete
trigonometric transform”, J. VLSI
Signal Process., no. 5, pp. 95-104,
1993.
[22] L.P. Chau and W.C. Siu, “Recursive
algorithm for the discrete cosine
transform with general lengths”,
Electronics Letters, vol. 30, no. 3,
Feb. 1994.
[23] Peizong Lee and Fang-Yu Huang,
“Restructured recursive DCT and
DST algorithms”, IEEE Transactions
on Signal Processing,” vol. 42, no.
7, pp. 1600-1609, July 1994.
[24] V. Britanak, “On the discrete cosine
computation”, Signal Process., vol.
40, no. 2-3, pp. 183-194, 1994.
[25] C.W. Kok, “Fast algorithm for
computing discrete cosine
transform”, IEEE Trans. Signal
Process., vol. 45, pp. 757-760, Mar.
1997.
[26] V. Kober, “Fast recursive algorithm
for sliding discrete sine transform”,
Electronics Letters, vol. 38, no. 25,
pp. 1747-1748, Dec. 2002.
[27] S.B. Pan and R.H. Park, “Unified
systolic array for computation of
DCT / DST / DHT”, IEEE Trans.
Circuits Syst. Video Technol., vol. 7,
no. 2, pp.413-419, April 1997.
[28] W.H. Fang and M.L. Wu, “Unified
fully-pipelined implementations of
one- and two-dimensional real
discrete trigonometric transforms",
IEICE Trans. Fund. Electron.
Commun. Comput. Sci., vol. E82-A,
no. 10, pp. 2219-2230, Oct. 1999.
[29] D.F. Chiper, M.N.S. Swamy, M.O.
Ahmad, and T. Stouraitis, “A systolic
array architecture for the discrete
sine transform”, IEEE trans. Signal
Process., vol. 50, no. 9, pp. 2347 -
2354, Sept. 2002.
[30] P.K. Meher, “A new convolutional
formulation of the DFT and efficient
systolic implementation”, in Proc.
IEEE Int. Region 10 Conf.
(TENCON’05), pp. 1462-1466, Nov.
2005.
[31] P.K. Meher, “Systolic designs for
DCT using a low-complexity
concurrent convolutional
formulation”, IEEE Trans. Circuits &
Systems for Video Technology, vol
16, no. 9, pp. 1041-1050, Sept. 2006.
Multiscale Edge Detection Based on Wavelet Transform
Divesh Kumar, Dr. Yaduvir Singh
Department of Electrical and Instrumentation Engineering
Thapar University, Patiala, Punjab
[email protected], [email protected]
Abstract: This paper presents a new approach
to edge detection using wavelet transforms.
First, we briefly introduce the development of
wavelet analysis. Then, some major classical
edge detectors are reviewed and interpreted
with continuous wavelet transforms. The
classical edge detectors work fine with high-
quality pictures, but often are not good enough
for noisy pictures because they cannot
distinguish edges of different significance. The
proposed wavelet based edge detection
algorithm combines the coefficients of wavelet
transforms on a series of scales and
significantly improves the results. Finally, a
cascade algorithm is developed to implement
the wavelet based edge detector.
Keywords: wavelet transform, canny edge
detector, sobel edge detector, noise.
INTRODUCTION
An edge in an image is a contour
across which the brightness of the image
changes abruptly. In image processing, an
edge is often interpreted as one class of
singularities. In a function, singularities can be
characterized easily as discontinuities where
the gradient approaches infinity. However,
image data is discrete, so edges in an image
often are defined as the local maxima of the
gradient. This is the definition we will use here.
Edge detection is an important task in image
processing. It is a main tool in pattern
recognition, image segmentation, and scene
analysis. An edge detector is basically a high
pass filter that can be applied to extract the
edge points in an image. This topic has
attracted many researchers and many
achievements have been made [14]-[20]. In
this paper, we will explain the mechanism of
edge detectors from the point of view of
wavelets and develop a way to construct edge
detection filters using wavelet transforms.
Many classical edge detectors have
been developed over time. They are based on
the principle of matching local image segments
with specific edge patterns. The edge
detection is realized by the convolution with a
set of directional derivative masks [21]. The
popular edge detection operators are Roberts,
Sobel, Prewitt, Frei-Chen, and Laplacian
operators ( [17], [18], [21], [22] ). They are all
defined on a 3 by 3 pattern grid, so they are
efficient and easy to apply. In certain situations
where the edges are highly directional, some
edge detectors work especially well because
their patterns fit the edges better.
Noise and its influence on edge detection
However, classical edge detectors
usually fail to handle images with strong noise,
as shown in Fig. 1.1. Noise is unpredictable
contamination on the original image. It is
usually introduced by the transmission or
compression of the image.
(a) Lena image (b) Edges using Canny
(c) Image with noise (d) Edges from the image with
noise
Fig. 1.1: Impact of noise on edge detection
There are various kinds of noise, but the two most widely studied are white noise and "salt and pepper" noise. Fig. 1.1 shows the dramatic difference between the results of edge detection on two similar images, the latter of which is affected by some white noise.
Review of Classical Edge Detectors
Classical edge detectors use a pre-
defined group of edge patterns to match each
image segments of a fixed size. 2-D discrete convolutions are used here to find the correlations between the pre-defined edge patterns and the sampled image segment:

    (f * m)(x, y) = Σ_{i,j} f(i, j) m(x - i, y - j)          (1.1)

where f is the image and m is the edge pattern, defined so that m(i, j) = 0 if (i, j) is not in the grid.
These patterns are represented as filters,
which are vectors (1-D) or matrices (2-D). For
fast performance, the dimensions of these filters are usually 1×3 (1-D) or 3×3 (2-D). From the point of view of functions, filters are discrete operators of directional derivatives. Instead of finding the local maxima of the gradient, we set a threshold and consider those points with gradient magnitude above the threshold as edge points. Given the source image f(x,y), the edge image E(x,y) is given by

    E(x, y) = 1 if sqrt[ ((f * s)(x, y))^2 + ((f * t)(x, y))^2 ] > threshold, and 0 otherwise          (1.2)

where s and t are two filters of different directions.
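The convolution-and-threshold scheme of (1.1)-(1.2) can be sketched directly. The following toy example (an illustration, not code from the paper; the threshold value 10 is an arbitrary choice) applies the Sobel pair s, t to a small image with a vertical step edge:

```python
import math

def conv2_valid(f, m):
    """2-D discrete convolution (eq. 1.1), computed on the 'valid' region."""
    H, W = len(f), len(f[0])
    h, w = len(m), len(m[0])
    out = [[0.0] * (W - w + 1) for _ in range(H - h + 1)]
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            # convolution flips the kernel in both directions
            out[y][x] = sum(f[y + i][x + j] * m[h - 1 - i][w - 1 - j]
                            for i in range(h) for j in range(w))
    return out

# Sobel filters: s differentiates horizontally, t vertically.
s = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
t = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

# A 6x6 image with a vertical step edge between columns 2 and 3.
img = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

gx, gy = conv2_valid(img, s), conv2_valid(img, t)
edges = [[1 if math.hypot(a, b) > 10 else 0 for a, b in zip(ra, rb)]
         for ra, rb in zip(gx, gy)]
# The detected edge hugs the step: the middle two columns fire.
assert all(row == [0, 1, 1, 0] for row in edges)
```

With a larger image the same loop structure applies per 3×3 window, which is why these detectors are efficient and easy to apply.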
Roberts edge detector
The edge patterns are shown in Fig.1.2
(a) (b)
Fig. 1.2: Edge patterns for Roberts edge detector:(a) s; (b)
t
These filters have the shortest
support, thus the position of the edges is more
accurate. On the other hand, the short support
of the filters makes it very vulnerable to noise.
The edge pattern of this edge detector makes
it especially sensitive to edges with a slope
around π/4. Some computer vision programs
use the Roberts edge detector to recognize
edges of roads.
Prewitt edge detector
The edge patterns are shown in Fig. 1.3
(a) (b)
Fig. 1.3: Edge patterns for Prewitt and Sobel edge
detectors: (a)s; (b)t
These filters have longer support.
They differentiate in one direction and average
in the other direction. So the edge detector is
less vulnerable to noise. However, the position
of the edges might be altered due to the
average operation.
Sobel edge detector
The edge patterns are similar to those
of the Prewitt edge detector as shown in Fig.
1.3. These filters are similar to the Prewitt
edge detector, but the average operator is
more like a Gaussian, which makes it better for
removing some white noise.
Frei-Chen edge detector
A 3×3 sub image b of an image f may
be thought of as a vector in R^9. Let V denote the vector space of 3×3 sub images. Bv, an orthogonal basis for V, is used for the Frei-Chen method. The subspace E of V that is spanned by the sub images v1, v2, v3, and v4 is called the edge subspace of V. The Frei-Chen edge detection method bases its determination of edge points on the size of the angle θE between the sub image b and its projection on the edge subspace:

    θE = arccos( sqrt( Σ_{k=1}^{4} (b, v_k)^2 / Σ_{k=1}^{9} (b, v_k)^2 ) )          (1.3)
The edge patterns are shown in Fig. 1.4.
Fig. 1.4: Edge Patterns for the Frei-Chen edge
detector: (a) v1; (b) v2; (c) v3; (d) v4; (e) v5;
(f) v6; (g) v7; (h) v8; (i) v9.
As shown in above patterns, the sub
images in the edge space are typical edge
patterns with different directions; the other sub
images resemble lines and blank space.
Therefore, the angle θE is small when the sub
image contains edge-like elements, and θE is
large otherwise.
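Since the paper's Fig. 1.4 is not reproduced in this excerpt, the sketch below uses the classical Frei-Chen masks (v1-v4 span the edge subspace, v5-v8 the line subspace, v9 the average) to compute the angle θE of (1.3); this is an illustration with assumed mask definitions, not the authors' code:

```python
import numpy as np

r2 = np.sqrt(2.0)
# The nine classical Frei-Chen masks, each divided by its Euclidean norm.
V = [np.array(m, dtype=float) / n for m, n in [
    ([[1, r2, 1], [0, 0, 0], [-1, -r2, -1]], 2 * r2),   # v1 edge
    ([[1, 0, -1], [r2, 0, -r2], [1, 0, -1]], 2 * r2),   # v2 edge
    ([[0, -1, r2], [1, 0, -1], [-r2, 1, 0]], 2 * r2),   # v3 edge
    ([[r2, -1, 0], [-1, 0, 1], [0, 1, -r2]], 2 * r2),   # v4 edge
    ([[0, 1, 0], [-1, 0, -1], [0, 1, 0]], 2),           # v5 line
    ([[-1, 0, 1], [0, 0, 0], [1, 0, -1]], 2),           # v6 line
    ([[1, -2, 1], [-2, 4, -2], [1, -2, 1]], 6),         # v7 line
    ([[-2, 1, -2], [1, 4, 1], [-2, 1, -2]], 6),         # v8 line
    ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 3)]]            # v9 average

def edge_angle(b):
    """Angle between sub image b and its projection on the edge subspace."""
    p = np.array([np.sum(b * v) for v in V])      # coordinates in the basis
    cos2 = np.sum(p[:4] ** 2) / np.sum(p ** 2)    # edge energy / total energy
    return np.arccos(np.sqrt(np.clip(cos2, 0.0, 1.0)))

# The masks form an orthonormal basis of R^9 ...
W = np.stack([v.ravel() for v in V])
assert np.allclose(W @ W.T, np.eye(9), atol=1e-12)
# ... a pure antisymmetric step lies entirely in the edge subspace,
# while a flat patch has all its energy in the average mask v9.
assert edge_angle(np.array([[-1, 0, 1]] * 3, float)) < 1e-6
assert abs(edge_angle(np.full((3, 3), 5.0)) - np.pi / 2) < 1e-6
```

A point is declared an edge when θE falls below a chosen angular threshold.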
Canny edge detection
Canny edge detection [4] is an
important step towards mathematically solving
edge detection problems. This edge detection
method is optimal for step edges corrupted by
white noise. Canny used three criteria to
design his edge detector. The first requirement
is reliable detection of edges with low
probability of missing true edges, and a low
probability of detecting false edges. Second,
the detected edges should be close to the true
location of the edge. Lastly, there should be
only one response to a single edge. To
quantify these criteria, two functions are defined: SNR(f), the signal-to-noise ratio, and Loc(f), the localization of the filter f(x), where A is the amplitude of the signal and n0^2 is the variance of the noise. Scaling f to the dilated filter fs gives an "uncertainty principle": increasing the filter size increases the signal-to-noise ratio but also decreases the localization by the same factor. This suggests maximizing the product of the two, so the objective function (1.4) is taken to be the product SNR(f) * Loc(f), where f(x) is the filter for edge detection. The optimal filter derived from these requirements can be approximated by the first derivative of the Gaussian filter.
The choice of the standard deviation for the
Gaussian filter, σ, depends on the size, or
scale, of the objects contained in the image.
For images containing objects of multiple or unknown sizes, one approach is to use Canny detectors with different σ values. The outputs
of the different Canny filters are combined to
form the final edge image.
Development of wavelet analysis
The concept of wavelet analysis has been
developed since the late 1980’s. However, its idea
can be traced back to the Littlewood-Paley
technique and Calderón-Zygmund theory [25] in
harmonic analysis. Wavelet analysis is a powerful
tool for time-frequency analysis. Fourier analysis is
also a good tool for frequency analysis, but it can
only provide global frequency information, which
is independent of time. Hence, with Fourier
analysis, it is impossible to describe the local
properties of functions in terms of their spectral
properties, which can be viewed as an expression
of the Heisenberg uncertainty principle [13].
In many applied areas like digital signal processing,
time-frequency analysis is critical. That is, we want
to know the frequency properties of a function in a
local time interval. Engineers and mathematicians
developed analytic methods that were adapted to
these problems, therefore avoiding the inherent
difficulties in classical Fourier analysis. For this
purpose, Dennis Gabor introduced a “sliding-
window” technique. He used a Gaussian function g
as a “window” function, and then calculated the
Fourier transform of a function in the “sliding
window". The analyzing function is a translated and modulated copy of the Gaussian window.
The Gabor transform is useful for time-frequency
analysis. The Gabor transform was later
generalized to the windowed Fourier transform in
which g is replaced by a “time local” function
called the “window” function. However, this
analyzing function has the disadvantage that the
spatial resolution is limited by the fixed size of the
Gaussian envelope [13]. In 1985, Yves Meyer
([23], [24]) discovered that one could obtain
orthonormal bases for L2(R) of the type ψ_{j,k}(x) = 2^{j/2} ψ(2^j x - k), j, k ∈ Z, and that the expression for decomposing a function into these orthonormal wavelets converges in many function spaces. The most preeminent books on wavelets are those of Meyer ([23], [24]) and Daubechies. Meyer
focuses on mathematical applications of wavelet
theory in harmonic analysis; Daubechies gives a
thorough presentation of techniques for
constructing wavelet bases with desired properties,
along with a variety of methods for mathematical
signal analysis [14]. A particular example of an
orthonormal wavelet system was introduced by
Alfred Haar. However, the Haar wavelets are
discontinuous and therefore poorly localized in
frequency. Stéphane Mallat made a decisive step in
the theory of wavelets in 1987 when he proposed a
fast algorithm for the computation of wavelet
coefficients. He proposed the pyramidal schemes
that decompose signals into subbands. These
techniques can be traced back to the 1970s when
they were developed to reduce quantization noise.
The framework that unifies these algorithms and
the theory of wavelets is the concept of a multi-
resolution analysis (MRA). An MRA is an increasing sequence of closed, nested subspaces {Vj}, j ∈ Z, that tends to L2(R) as j increases. Vj is obtained from Vj+1 by a dilation of factor 2. V0 is spanned by a function φ that satisfies

    φ(x) = Σ_k h_k φ(2x - k)          (1.6)
Equation (1.6) is called the “two-scale equation”,
and it plays an essential role in the theory of
wavelet bases.
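For the Haar scaling function φ = 1_[0,1), the two-scale equation holds with coefficients h_0 = h_1 = 1 (in the unnormalized convention assumed for this illustration). A direct numerical check:

```python
def haar_phi(x):
    """Haar scaling function: the indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

# Verify the two-scale equation phi(x) = phi(2x) + phi(2x - 1)
# on a dense grid covering [-1, 2).
for i in range(-100, 200):
    x = i / 100.0
    assert haar_phi(x) == haar_phi(2 * x) + haar_phi(2 * x - 1)
```

The equation expresses that V0 ⊂ V1: the coarse-scale basis function is a combination of its half-width translates.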
Edge detector using wavelets
Now that we have talked briefly about the
development of edge detection techniques and
wavelet theories, we next discuss how they are
related. Edges in images can be mathematically
defined as local singularities. Until recently, the
Fourier transform was the main mathematical tool
for analyzing singularities. However, the Fourier
transform is global and not well adapted to local
singularities. It is hard to find the location and
spatial distribution of singularities with Fourier
transforms. Wavelet analysis is a local analysis; it
is especially suitable for time-frequency analysis
[1], which is essential for singularity detection.
This was a major motivation for the study of the
wavelet transform in mathematics and in applied
domains. With the growth of wavelet theory, the
wavelet transforms have been found to be
remarkable mathematical tools to analyze the
singularities including the edges, and further, to
detect them effectively. This idea is similar to that
of John Canny [4]. The Canny approach selects a
Gaussian function as a smoothing function θ; while
the wavelet-based approach chooses a wavelet
function to be θ0. Mallat, Hwang, and Zhong ( [5],
[6] ) proved that the maxima of the wavelet
transform modulus can detect the location of the
irregular structures. Further, a numerical procedure
to calculate their Lipschitz exponents has been
provided. One and two-dimensional signals can be
reconstructed, with a good approximation, from the
local maxima of their wavelet transform modulus.
The wavelet transform characterizes the local
regularity of signals by decomposing signals into
elementary building blocks that are well localized
both in space and frequency. This not only explains
the underlying mechanism of classical edge
detectors, but also indicates a way of constructing
optimal edge detectors under specific working
conditions.
Results:
Multiscale edge detection
Wavelet filters of large scales are
more effective for removing noise, but at the
same time increase the uncertainty of the
location of edges. Wavelet filters of small
scales preserve the exact location of edges,
but cannot distinguish between noise and real
edges. We can use the coefficients of the
wavelet transform across scales to measure
the local Lipschitz regularity. That is, when the
scale increases, the coefficients of the wavelet
transform are likely to increase where the
Lipschitz regularity is positive, but they are
likely to decrease where the Lipschitz
regularity is negative. We know that locations
with lower Lipschitz regularities are more likely
to be details and noise. As scale increases,
the coefficients of the wavelet transform
increase for step edges, but decrease for Dirac
and fractal edges. So we can use a larger-
scale wavelet at positions where the wavelet
transform decreases rapidly across scales to
remove the effect of noise, while using a
smaller-scale wavelet at positions where the
wavelet transform decreases slowly across
scale to preserve the precise position of the
edges. Using the cascade algorithm, we can observe the change of the wavelet transform coefficients between adjacent scales and distinguish different kinds of edges. Then we
can keep the scales small for locations with
positive Lipschitz regularity and increase the
scales for locations with negative Lipschitz
regularity. Fig. 1.5 shows that for an image
without noise, the result of our method is
similar to that of Canny’s edge detection. For
images with white noise in Fig. 1.6 – 1.10, our
method gives more continuous and precise
edges. Table 1 shows that the SNR of the
edges obtained by the multiscale wavelet
transform is significantly higher than others.
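The scale behaviour described above can be seen with plain Haar differences; the following toy illustration (with an assumed L2-normalized Haar wavelet, not the paper's filter bank) shows step-edge coefficients growing with scale while Dirac-like spike coefficients shrink:

```python
import math

def haar_coeffs(signal, scale):
    """L2-normalized Haar wavelet coefficients at a dyadic scale:
    (sum of first half - sum of second half) / sqrt(2 * scale)."""
    w = 2 * scale
    return [(sum(signal[i:i + scale]) - sum(signal[i + scale:i + w]))
            / math.sqrt(w) for i in range(len(signal) - w + 1)]

step = [0.0] * 32 + [1.0] * 32           # step edge: positive regularity
dirac = [0.0] * 31 + [1.0] + [0.0] * 32  # spike: negative regularity

for small, large in [(1, 2), (2, 4), (4, 8)]:
    step_small = max(abs(c) for c in haar_coeffs(step, small))
    step_large = max(abs(c) for c in haar_coeffs(step, large))
    spike_small = max(abs(c) for c in haar_coeffs(dirac, small))
    spike_large = max(abs(c) for c in haar_coeffs(dirac, large))
    assert step_large > step_small    # step response grows across scales
    assert spike_large < spike_small  # spike response decays across scales
```

This growth/decay contrast is what lets the cascade algorithm keep small scales near true edges and large scales near noise.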
(a) (b) (c)
Fig. 1.5: Edge detection for Lena image: (a) The
Lena image; (b) Edges by the Canny edge detector;
(c) Edges by the multiscale edge detection using
wavelet transform
(a)
(b) (c)
(d) (e)
Fig. 1.6: Edge detection for a block image with
noise: (a) A block image (SNR = 10 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with default variance; (d) edges by Canny
edge detection with adjusted variance; (e) Edges by
the multiscale edge detection using wavelet
transform
(a) (b)
(c) (d)
Fig. 1.7: Edge detection for a Lena image with
noise: (a) Lena image (SNR = 30 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with adjusted variance; (d) Edges by
multi-level edge detection using wavelets
Table 1: False rate of the detected edges
(a) (b)
(c) (d)
Fig. 1.8: Edge detection for a bridge image with
noise: (a) Bridge image (SNR = 30 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with adjusted variance; (d) Edges by
multi-level edge detection using wavelet
(a) (b)
(c) (d)
Fig. 1.9: Edge detection for a pepper image with
noise: (a) Pepper image (SNR = 10 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with adjusted variance; (d) Edges by
multi-level edge detection using wavelet
(a) (b)
(c) (d)
Fig. 1.10: Edge detection for a wheel image with
noise: (a) Wheel image (SNR = 10 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with adjusted variance; (d) Edges by
multi-level edge detection using wavelet
Conclusion & Future scope
In this work we have described an approach to edge detection using the wavelet transform. The wavelet edge detector produces better edges than classical edge detectors, which are very sensitive to noise. Since wavelet decomposition involves a low-pass filter, the amount of noise in the image can be decreased, which in turn leads to more robust edge detection. In future work, the wavelet transform can be used to produce initial images, the watershed algorithm can then segment the initial image, and, using the inverse wavelet transform, the segmented image can be projected up to a higher resolution.
REFERENCES
[1] J. C. Goswami, A. K. Chan, 1999,
“Fundamentals of wavelets: theory, algorithms, and
applications,” John Wiley & Sons, Inc.
[2] Y. Y. Tang, L. Yang, J. Liu, 2000, “Characterization
of Dirac-Structure Edges with Wavelet Transform,”
IEEE Trans. Sys. Man Cybernetics-Part B: Cybernetics,
vol.30, no.1, pp. 93-109.
[3] Mallat, S. 1987. “A compact multiresolution
representation: the wavelet model.” Proc. IEEE
Computer Society Workshop on Computer Vision, IEEE
Computer Society Press, Washington, D.C., p.2-7.
[4] J. Canny, 1986, “A computational approach to
edge detection,” IEEE Trans. Pattern Anal. Machine
Intell., vol. PAMI-8, pp. 679-698.
[5] S. Mallat, S. Zhong, 1992, “Characterization of
signals from multiscale edges,” IEEE Trans. Pattern
Anal. Machine Intell., vol.14, no.7, pp. 710-732.
[6]Acharyya, M., Kundu, M.K., 2001. Wavelet-based
texture segmentation of remotely sensed images. IEEE
11th Internat. Conf. Image Anal. Process, 69–74.
[7] Xiao, D., Ohya, J., "Contrast enhancement of color images based on wavelet transform and human visual system", Proc. International Conference on Graphics and Visualization in Engineering, Florida, USA, 2007.
[8] Scharcanski, J., Jung, C., R., Clarke, R. T., “Adaptive
Image Denoising Using Scale and Space Consistency”,
IEEE TRANSACTIONS ON IMAGE PROCESSING,
VOL. 11, NO. 9, SEPTEMBER 2002.
[9] Mallat S. A.: Theory for Multiresolution Signal
Decomposition: The Wavelet Representation. IEEE
Transactions on Pattern Analysis and Machine
Intelligence, 11(7), 674–693.
[10] Prieto M. S., Allen A. R.: A Similarity Metric for
Edge Images. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 25(10), 1265–1273.
[11] Drori I., Lischinski D.: Fast Multiresolution Image
Operations in the Wavelet Domain. IEEE Transactions
on Visualization and Computer Graphics, 9(3), 395–411.
[12] I. Drori D. Lischinski. Fast multiresolution image
operations in the wavelet domain. IEEE Transactions on
Visualization and Computer Graphics., 9(3):395–412,
2003.
[13] A. Cohen, R. D. Ryan, 1995, "Wavelets and Multiscale Signal Processing," Chapman & Hall.
[14] J. J. Benedetto, M. W. Frazier, 1994, "Wavelets - Mathematics and Applications," CRC Press, Inc.
[15] R. J. Beattie, 1984, “Edge detection for semantically
based early visual processing,” dissertation, Univ.
Edinburgh, Edinburgh, U.K..
[16] B. K. P. Horn, 1971, “The Binford-Horn line-
finder,” Artificial Intell. Lab., Mass. Inst. Technol.,
Cambridge, AI Memo 285.
[17] L. Mero, 1975, “A simplified and fast version of the
Hueckel operator for finding optimal edges in pictures,”
Pric. IJCAI, pp. 650-655.
[18] R. Nevatia, 1977, “Evaluation of simplified Hueckel
edge-line detector,” Comput., Graph., Image Process.,
vol. 6, no. 6, pp. 582-588.
[19] Y. Y. Tang, L.H. Yang, L. Feng, 1998,
“Characterization and detection of edges by Lipschitz
exponent and MASW wavelet transform,” Proc. 14th Int.
Conf. Pattern Recognit., Brisbane, Australia, pp. 1572-
1574.
[20] K. A. Stevens, 1980, “Surface perception from local
analysis of texture and contour,” Artificial Intell. Lab.,
Mass. Instr. Technol., Cambridge, Tech. Rep. AI-TR-
512.
[21] K. R. Castleman, 1996, “Digital Image Processing,”
Englewood Cliffs, NJ: Prentice- Hall.
[22] M. Hueckel, 1971, “An operator which locates
edges in digital pictures,” J. ACM, vol. 18, no. 1, pp.
113-125.
[23] Acharyya, M., Kundu, M.K., 2001. Wavelet-based
texture segmentation of remotely sensed images. IEEE
11th Internat. Conf. Image Anal. Process., 69–74.
[24] Jahromi O. S., Francis B. A., Kwong R. H.:
Algebraic theory of optimal filterbanks. Proceedings of
IEEE International Conference on Acoustics, Speech and
Signal Processing, 1, 113–116.
[25] A. Zygmund, 1968, “Trigonometric Series,” 2nd
ed., Cambridge: Cambridge Univ. Press.
[26] Mallat S.: Multifrequency channel decompositions
of images and wavelet models. IEEE Transaction in
Acoustic Speech and Signal Processing, 37, 2091–2110.
[27] R. M. Haralick, 1984, “Digital step edges from zero
crossing of second directional derivatives,” IEEE Trans.
Pattern Anal. Machine Intell., vol. PAMI-6, no. 1, pp.
58-68.
[28] E. C. Hildreth, 1980, “Implementation of a theory of
edge detection,” M.I.T. Artificial Intell. Lab., Cambridge,
MA, Tech. Rep. AI-TR-579.
Color Image Enhancement by Scaling Luminance and Chromatic
Components
Satyabrata Das, Sukanti Pal and A K Panda
National Institute of Science and Technology, Palur Hills, Berhampur, Odisha, 761008
Email: [email protected]
ABSTRACT
In this paper, a new technique for color image
enhancement using luminance and chromatic
component is presented. In the proposed
technique luminance and chromatic components
of color image are extracted separately and
converted to frequency domain. Then DC and
AC coefficients are scaled to preserve contrast,
brightness and color. While enhancing the
image care is taken to reduce the mathematical
computations. Processing the color image in the DCT domain invites unwanted side effects such as blocking artifacts, which are minimized by using a smaller sub-block matrix, keeping in view the complexity of the mathematical computation.
Keywords: Blocking artifacts, Chromatic, DCT, Luminance
1. INTRODUCTION
The display of a color image depends mainly on
brightness, contrast and colors. Enhancement of the image is necessary to improve its visibility subjectively, to remove unwanted flickering, to improve contrast and to bring out more details. In general there are two major approaches [1]: the spatial domain approach, where statistics of the grey values of the image are manipulated, and the frequency domain approach, where the spatial frequency contents of the image are manipulated [1]. In the spatial domain, histogram equalization, principal component analysis, rank order filtering, homomorphic filtering, etc. are generally used to enhance the image. Although these techniques were developed for gray-valued images, a few of them are also applied to color images for enhancement
purpose. Images are mostly represented in compressed format to save memory space and bandwidth, so it is better if enhancement can be achieved in the compressed domain rather than transforming to the spatial domain, applying the enhancement technique and transforming back to the compressed domain, thereby increasing the computational overhead.
Therefore, color images mostly uses JPEG
compression format for saving bandwidth and
memory space which uses popular discrete
cosine transform (DCT). Extracting the
luminance and chromatic components in DCT
domain and processing them to improve the
brightness, contrast and color invites unwanted
side effects such as blocking artifacts. However,
these side effects can be minimized by using
special mathematical computation techniques.
In our work we have represented the color
image using Y-Cb-Cr color space so that we can
preserve both the luminance and color components.
Previous works [2-4] have used the DCT domain
and implemented non-uniform scaling of the DC
and AC coefficients, which requires more
mathematical computation. In our
requires more mathematical computation. In our
approach we have adopted a uniform scale value
for both DC and AC components of Y, Cb and
Cr, which substantially lowers the computational
burden while still enhancing the
image. DCT-II is presented in section 2. The
proposed algorithm is presented in section 3.
The results obtained are presented in section 4
and the paper is concluded in section 5.
2. MATHEMATICAL
PRELIMINARIES
There are eight different ways to perform the
even extension of a sequence for the DFT, and
there are correspondingly eight definitions of
the DCT [5,6]. We have used the type
II DCT, which is widely used in practice for
speech and image compression applications as
part of various standards [7]. Equation (1)
gives the two-dimensional DCT, where C(k,l)
represents the transformed DCT coefficients for
the input image x(m,n), assuming a square image
of size (N×N).
$$C(k,l)=\alpha(k)\,\alpha(l)\sum_{m=0}^{N-1}\sum_{n=0}^{N-1}x(m,n)\cos\left[\frac{(2m+1)k\pi}{2N}\right]\cos\left[\frac{(2n+1)l\pi}{2N}\right],\quad 0\le k,l\le N-1 \qquad (1)$$

where $\alpha(k)=\sqrt{1/N}$ for $k=0$ and $\alpha(k)=\sqrt{2/N}$ for $1\le k\le N-1$, and likewise for $\alpha(l)$.
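Since equation (1) is separable, the 2-D DCT-II of an N×N block can be computed as two matrix products. The following NumPy sketch (illustrative code, not the paper's MATLAB implementation) evaluates equation (1) directly:

```python
import numpy as np

def dct2(x):
    """2-D type-II DCT of an N x N block, evaluating equation (1)."""
    N = x.shape[0]
    m = np.arange(N)
    # 1-D DCT-II basis: B[k, m] = alpha(k) * cos((2m + 1) * k * pi / (2N))
    B = np.cos(np.outer(np.arange(N), 2 * m + 1) * np.pi / (2 * N))
    alpha = np.full(N, np.sqrt(2.0 / N))
    alpha[0] = np.sqrt(1.0 / N)
    B = alpha[:, None] * B
    # Separability: transform the rows, then the columns
    return B @ x @ B.T

# For a constant 8x8 block all energy falls in the DC coefficient C(0,0),
# which equals alpha(0)^2 * sum(x) = N * mean = 8 * 100 = 800 here.
block = np.full((8, 8), 100.0)
C = dct2(block)
```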
The contrast of an image is defined as the
change in luminance of an object relative to
the luminance of its surround. Hence contrast
can be thought of as the ratio of the standard
deviation (σ) to the mean (µ) of the image:
the greater the standard deviation, the higher
the contrast.
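As an illustrative sketch (my code, not the paper's), the σ/µ contrast measure can be computed as:

```python
import numpy as np

def contrast(img):
    """sigma/mu contrast measure described above."""
    img = np.asarray(img, dtype=float)
    return img.std() / img.mean()

flat = np.full((8, 8), 128.0)                    # no variation -> contrast 0
board = (np.indices((8, 8)).sum(0) % 2) * 255.0  # 0/255 checkerboard
```

A flat block has zero contrast; the 0/255 checkerboard has σ = µ = 127.5, giving a contrast of 1.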
3. THE PROPOSED ALGORITHM
The image in RGB color space is converted into
the Y-Cb-Cr color space to obtain the luminance
and chromatic components individually. The Y,
Cb and Cr components are each split into (8×8)
sub blocks. For each sub block the DCT-II is
computed separately to obtain Y(u,v), Cb(u,v)
and Cr(u,v) respectively, where Y(u,v), Cb(u,v)
and Cr(u,v) represents the block transformed
DCT coefficients and the first element of each
DCT transformed coefficient Y(0,0), Cb(0,0)
and Cr(0,0) represents the DC component while
the rest are AC components. After computing its
DCT coefficients, each sub block is normalized
by a factor of 8. The proposed algorithm is
implemented in four steps. In the first step,
adjustment of local brightness is achieved by
mapping the DC coefficient of each sub block of
Y(u,v) using a monotonically increasing
function ψ(x) [8], shown in fig.1. While
mapping the coefficients, the DC coefficient is
treated separately from the AC coefficients.
The mapping function for the DC coefficient is

$$y_{\text{mapped}}^{DC}=Y_{\max}\,\psi\!\left(\frac{Y(0,0)}{8\,Y_{\max}}\right) \qquad (2)$$

where

$$\psi(x)=\begin{cases} n\left(\dfrac{x}{m}\right)^{p_1}, & 0\le x\le m\\[4pt] 1-(1-n)\left(\dfrac{1-x}{1-m}\right)^{p_2}, & m< x\le 1\end{cases}$$

with $0\le m,n\le 1$ and $p_1,p_2>0$.
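A sketch of the mapping function, assuming the programmable S-function form of [8], with the parameter values the paper reports (m = n = 0.5, p1 = 1.8, p2 = 0.8):

```python
import numpy as np

def psi(x, m=0.5, n=0.5, p1=1.8, p2=0.8):
    """Monotonically increasing S-shaped map: psi(0)=0, psi(m)=n, psi(1)=1."""
    x = np.asarray(x, dtype=float)
    low = n * (x / m) ** p1                                 # 0 <= x <= m
    high = 1.0 - (1.0 - n) * ((1.0 - x) / (1.0 - m)) ** p2  # m < x <= 1
    return np.where(x <= m, low, high)
```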
Y_max is the maximum brightness value of the
image before the DCT is applied. Various
monotonically increasing functions are
available in the literature [4], [7]; no single
function is best suited to all images for
enhancement purposes. We chose ψ(x) because its
value can be modified using the four parameters
m, n, p1 and p2. We varied these values and
chose m = n = 0.5, p1 = 1.8 and
p2 = 0.8 for best performance. As the Y component
represents the luminance component, only this
component is mapped to alter the brightness,
leaving the Cb and Cr components unaltered. In
the second step, adjustment of local contrast
is achieved by scaling the DC and AC
coefficients of the normalized Y(u,v), Cb(u,v)
and Cr(u,v). The scale factor 's' is defined as the
ratio between mapped DC coefficient for each
normalized sub block (8×8) of Y(u,v) to the
original DC coefficient. Since the DC component
carries the mean of the brightness distribution
of each sub block, it is used to compute the
scale factor 's'. Assuming an 8-bit
representation, scaling may cause gray values
to overflow beyond 255; this is handled by
limiting the scale factor depending upon the
image. In the third step, preservation of color
is achieved through scaling of normalized
Cb(u,v) and Cr(u,v) component through the
same scale factor 's' corresponding to each
normalized sub block of Y(u,v). Since the
mapping from RGB to Y-Cb-Cr is non-linear and
Cb, Cr depend on Y, the DC coefficients have to
be treated separately while scaling the color
components.
The normalized Cr(u,v) is scaled similarly
using the above-mentioned procedure. Finally,
blocking artifacts are suppressed. Since this
algorithm is developed around the type-II DCT,
blocking artifacts are visible in the processed
image because of discontinuities in gray
values. Several methods are available to
minimize blocking artifacts, but they are
computationally expensive. We propose a simple
method that minimizes blocking artifacts while
requiring less computation. For this purpose,
the standard deviation (σ) is computed for each
normalized
sub block of Y(u,v). When σ is large, it is
concluded that the corresponding sub block
contains a large variation of gray values,
which results in blocking artifacts. If
σ > σ_threshold, where σ_threshold represents a
threshold value of the standard deviation which
is image dependent and is decided based on the
amount of blocking artifact removal, each
normalized 8×8 sub block of Y(u,v), Cb(u,v)
and Cr(u,v) is subdivided into four 4×4 sub
blocks and the scale factor 's' is recomputed
through the earlier mentioned steps of this
algorithm. Only those sub blocks for which the
σ > σ_threshold condition is met are scaled,
leaving the remaining sub blocks unaltered.
The corresponding sub blocks of Y, Cb and Cr
are then scaled with the new scale factor in order to
remove the artifacts. Finally, the image is
reconstructed in the spatial domain by
combining the Y, Cb and Cr components.
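The contrast-scaling and artifact-suppression steps above can be sketched as follows; the function names, the scale-factor cap, and the threshold value are illustrative choices of mine, not values from the paper:

```python
import numpy as np

def scale_factor(dc_original, dc_mapped, s_max=2.0):
    """s = mapped DC / original DC, limited so that scaled gray values
    cannot overflow far beyond the 8-bit range."""
    return min(dc_mapped / dc_original, s_max)

def split_if_blocky(block, sigma_threshold):
    """Return [block], or its four 4x4 quadrants when the gray-value
    standard deviation exceeds the threshold (blocking-artifact risk)."""
    if block.std() <= sigma_threshold:
        return [block]
    h, w = block.shape[0] // 2, block.shape[1] // 2
    return [block[:h, :w], block[:h, w:], block[h:, :w], block[h:, w:]]

smooth = np.full((8, 8), 100.0)               # low sigma: kept as one block
edgy = np.zeros((8, 8)); edgy[:, 4:] = 255.0  # high sigma: subdivided
```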
4. QUALITY ASSESSMENT
Simulation is performed on various images using
MATLAB. Since the proposed algorithm is based
on the DCT, PSNR and SNR are not suitable
quality measures, as prior information
regarding the type of distortion is not
available. We have instead used the
no-reference perceptual quality assessment for
JPEG compressed images [9], a quality metric
that incorporates human visual system
characteristics and does not require the
reference image for computing the quality.
Based upon this, a quality score is obtained
which reflects the amount of blocking artifact
removal and of distortion removal due to the
non-linear mapping. A quality score nearer to
10 reflects the best quality image, while 1
represents the worst. Wang et al. [9] suggested
this no-reference quality metric for JPEG
images; its computation is described in [9],
which cites a website containing MATLAB code
for computing the quality score. We have used
the same MATLAB code for evaluation, and refer
to the result as the quality score. The quality
scores obtained for different images are
tabulated in table 1.
Table 1. Quality Score
Image    | Before artifact removal | After artifact removal
Image_1  | 7.7274                  | 8.3612
Image_2  | 6.177                   | 8.36
Image_3  | 8.3903                  | 9.3895
Image_4  | 8.6128                  | 8.9288
Figure 2(a) shows the original Image_1. For
Image_1, four stages of output are obtained:
(i) after scaling the DC coefficient of Y
(fig 2.b), (ii) after scaling both DC and AC
coefficients of Y (fig 2.c), (iii) after
scaling the Y, Cb and Cr components, before
blocking artifact removal (fig 2.d), and (iv)
after blocking artifact removal (fig 2.e). For
Image_2, Image_3 and Image_4, the outputs
before and after blocking artifact removal are
shown in the same way in figures 3, 4 and 5.
The quality score computed for the different
images is shown in table 1. From the table, it
is observed that the quality score improves
after removing the blocking artifacts and is
close to ten, showing good enhancement of the
color image.
Fig 2 (a) Image_1 (b) enhanced image by
scaling DC coefficient only (c) enhanced image
by scaling both DC and AC coefficient (d)
enhanced image by scaling all components
including Cb and Cr (e) enhanced image with
blocking artifacts removal.
Fig 1: Plot of mapping function ψ(x)
Fig 3.(a) Image_2 (b) enhanced image by
scaling all components including Cb and Cr (c)
enhanced image with blocking artifacts removal.
Fig 4.(a) Image_3 (b) enhanced image by
scaling all components including Cb and Cr (c)
enhanced image with blocking artifacts removal
Fig 5.(a)Image_4 (b) enhanced image by scaling
all components including Cb and Cr (c)
enhanced image with blocking artifacts removal
CONCLUSION
In this paper, we have presented a simple
method for enhancing the color image in
compressed format by scaling luminance and
chromatic components using less computational
overhead. The quality score computed
demonstrates the performance of the proposed
method. The proposed algorithm can be
implemented on any image processing hardware.
ACKNOWLEDGMENT
The authors acknowledge the DST-TIFAC
CORE on “3G/4G Communication
Technologies” received by National Institute of
Science and Technology from Department of
Science & Technology (DST), Government of
India.
REFERENCES
[1] Gonzalez, Rafael C. and Woods, Richard E.
Digital Image Processing, Pearson, Prentice
Hall, Third edition, 2008.
[2] Aghagolzadeh, S. and Ersoy, O. K.
“Transform image enhancement,” Opt. Eng.,
vol.31, pp.614-626, Mar.1992.
[3] Tang, J., Peli, E., and Acton, S. “Image
enhancement using a contrast measure in the
compressed domain,” IEEE Signal Process.
Lett. Vol.10, pp.289-292, Oct. 2003.
[4] Lee, S. “An efficient content – based image
enhancement in the compressed domain
using retinex theory,” IEEE Trans. Circuits
Syst. Video Technol., vol. 17, no. 2, pp. 199-
213, Feb. 2007.
[5] Wang, Z. “Fast algorithms for the discrete W
transform and for the discrete Fourier
transform,” IEEE Trans. ASSP, vol. 32, no. 4,
pp. 803-816, Aug. 1984.
[6] Martucci, S.A. “Symmetric convolution and
the discrete sine and cosine transforms.”
IEEE Trans. On signal Processing, vol.42,
no. 5, pp. 1038-1051, May 1994.
[7] Rao, K. and Huang, J. “Techniques and
standards for image, video, and audio
coding,” Prentice Hall, Upper Saddle River,
NJ. 1996.
[8] De, T.K. “A simple programmable S-function
for digital image processing,” in Proc. 4th
IEEE Region 10 Int. Conf., Bombay, India,
pp. 573-576, Nov. 1989.
[9] Wang, Z., Sheikh, H.R. and Bovik, A.C.
“No-reference perceptual quality assessment
of JPEG compressed images,” in Proc. Int.
Conf. Image Processing, Rochester, NY,
vol. 1. pp. 477-480, Sep. 2002.
A Tutorial on Image Compression Techniques
1Vedvrat,
2Krishna Raj
1Department of Electronics & Communication Engineering A.I.T, Kanpur, U.P., India
2Department of Electronics Engineering H.B.T.I., Kanpur, U.P., India
(Email: [email protected], [email protected])
Abstract—Processing of multimedia data acquires large
transmission bandwidth and storage capacity. Reduction in
these parameters introduces the concept of data compression.
For achieving the better compression without degrading the
image quality, data compression techniques become the
challenge for the researchers. Numerous image coding
techniques i.e. subband coding, EZW, SPIHT, EBCOT,
wavelet transform coding have been presented. In this paper
performance comparison of these coding techniques is
presented.
Keywords—Wavelet transform, EBCOT, SPIHT, EZW,
subband coding, JPEG
I. INTRODUCTION
Uncompressed multimedia (audio and video) data
requires considerable storage capacity and transmission
bandwidth. Despite rapid progress in mass-storage density,
processor speeds, and digital communication system
performance, demand for data storage capacity and data-
transmission bandwidth continues to outstrip the
capabilities of available technologies. The recent growth of
data-intensive multimedia-based web applications has not
only sustained the need for more efficient ways to encode
signals and images but have made compression of such
signals central to storage and communication technology.
For still image compression, the `Joint Photographic
Experts Group' or JPEG standard has been established by
ISO (International Standards Organization) and IEC
(International Electro-Technical Commission). The
performance of these coders generally degrades at low bit-
rates mainly because of the underlying block-based
Discrete Cosine Transform (DCT) scheme. More recently,
the wavelet transform has emerged as a cutting edge
technology, within the field of image compression.
Wavelet-based coding provides substantial improvements
in picture quality at higher compression ratios. The large
storage space, large transmission bandwidth, and long
transmission time is required for image, audio, and video
data. At the present state of technology, the only solution
is to compress multimedia data before its storage and
transmission, and decompress it at the receiver for play
back.
II. COMPRESSION PRINCIPLE
Correlation between neighboring pixels gives rise to
redundant information in images, so a less correlated
representation of the image is required. Two fundamental
components of compression are redundancy and
irrelevancy reduction. Redundancy reduction aims at
removing duplication from the signal source
(image/video). Irrelevancy reduction omits parts of the
signal that will not be noticed by the signal receiver,
namely the Human Visual System (HVS). In general, three
types of redundancy can be identified. Image compression
research aims at reducing the number of bits needed to
represent an image by removing the spatial and spectral
redundancies as much as possible.
a. Spatial Redundancy; correlation between
neighboring pixel values.
b. Spectral Redundancy; correlation between
different color planes or spectral bands.
c. Temporal Redundancy; correlation between
adjacent frames in a sequence of images (in video
applications).
In lossless compression schemes, the reconstructed
image, after compression, is numerically identical to the
original image. An image reconstructed following lossy
compression contains degradation relative to the original.
Often this is because the compression scheme completely
discards redundant information. However, lossy schemes
are capable of achieving much higher compression. Under
normal viewing conditions, no visible loss is perceived. In
predictive coding, information already sent or available is
used to predict future values, and the difference is coded.
Since this is done in the image or spatial domain, it is
relatively simple to implement and is readily adapted to
local image characteristics. Transform coding, on the other
hand, first transforms the image from its spatial domain
representation to a different type of representation using
some well-known transform and then codes the
transformed values. This method provides greater data
compression compared to predictive methods, although at
the expense of greater computation.
III. COMPRESSION TECHNIQUES
a. Subband Coding
In subband coding [4], an image is decomposed into
a set of band-limited components, called subbands, which
can be reassembled to reconstruct the original image without
error. Each subband is generated by bandpass filtering the
input. Since the bandwidth of the resulting subbands is
smaller than that of the original image, the subbands can
be downsampled without loss of information.
Reconstruction of the original image is accomplished by
upsampling, filtering, and summing the individual
subbands. Fig.1 shows the principal components of a two-
band subband coding and decoding system. The input of
the system is a 1-D, band-limited discrete-time signal x(n)
for n= 0,1,2....; the output sequence x‟(n) is formed
through the decomposition of x(n) into y0(n) and y1(n) via
analysis filters g0(n) and g1(n). Filter h0(n) is a low pass
filter whose output is an approximation of x(n); filter h1(n)
is a high pass filter whose output is high frequency or
detail part of x(n). All the filters are selected in such a
way that the input can be reconstructed perfectly,
i.e., x'(n) = x(n).
Fig.1 Components of Subband coding
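A minimal numerical sketch of the two-band system of Fig.1, using orthonormal Haar filters as a concrete choice (my assumption; the paper's filters are generic). Analysis filters, then downsampling by 2; synthesis upsamples, filters with the time-reversed filters, and sums, reconstructing x(n) exactly:

```python
import numpy as np

h0 = np.array([1.0, 1.0]) / np.sqrt(2)   # low pass analysis filter
h1 = np.array([1.0, -1.0]) / np.sqrt(2)  # high pass analysis filter

def analyze(x):
    y0 = np.convolve(x, h0)[1::2]   # approximation subband, downsampled by 2
    y1 = np.convolve(x, h1)[1::2]   # detail subband, downsampled by 2
    return y0, y1

def synthesize(y0, y1, n):
    u0 = np.zeros(2 * len(y0)); u0[0::2] = y0   # upsample by 2
    u1 = np.zeros(2 * len(y1)); u1[0::2] = y1
    g0, g1 = h0[::-1], h1[::-1]                 # synthesis filters
    return (np.convolve(u0, g0) + np.convolve(u1, g1))[:n]

x = np.arange(8, dtype=float)
xr = synthesize(*analyze(x), len(x))   # perfect reconstruction: xr == x
```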
Woods and O'Neil used a separable combination of
one-dimensional Quadrature Mirror Filter banks (QMF) to
perform 4-band decomposition by the row-column
approach as shown in fig.2. The process can be iterated to
obtain higher band decomposition filter trees. At the
decoder, the subband signals are decoded, upsampled and
passed through a bank of synthesis filters and properly
summed up to yield the reconstructed image.
Fig.2 4-band decomposition by row-column approach
b. Short Time Fourier Transform
The Fourier Transform separates the waveform into a
sum of sinusoids of different frequencies and identifies
their respective amplitudes. Thus it gives us a frequency-
amplitude representation of the signal. In the STFT [6], a
non-stationary signal is divided into small portions, which are
assumed to be stationary. This is done using a window
function of chosen width, which is shifted and multiplied
with the signal to obtain the small stationary signals. The
Fourier Transform is then applied to each of these portions
to obtain the STFT of the signal. The problem with the STFT
goes back to the Heisenberg uncertainty principle: one
cannot determine which frequencies exist at a particular
time instant, only which frequency bands exist over a time
interval. This
gives rise to the resolution issue where there is a trade-off
between the time resolution and frequency resolution. To
assume stationarity, the window is supposed to be narrow,
which results in a poor frequency resolution, i.e., it is
difficult to know the exact frequency components that
exist in the signal; only the band of frequencies that exist is
obtained. If the width of the window is increased,
frequency resolution improves but time resolution
becomes poor, i.e., it is difficult to know what frequencies
occur at which time intervals. Once the window function
is decided, the frequency and time resolutions are fixed for
all frequencies and all times.
c. Wavelet Transform
In contrast to STFT, which uses a single analysis
window, the Wavelet Transform [5] uses short windows at
high frequencies and long windows at low frequencies.
This results in multi-resolution analysis by which the
signal is analyzed with different resolutions at different
frequencies, i.e., both frequency resolution and time
resolution vary in the time-frequency plane without
violating the Heisenberg inequality. In Wavelet Transform,
as frequency increases, the time resolution increases;
likewise, as frequency decreases, the frequency resolution
increases. Thus, a certain high frequency component can
be located more accurately in time than a low frequency
component and a low frequency component can be located
more accurately in frequency compared to a high
frequency component.
The wavelet transform is thus well suited to analyzing
non-stationary signals, where both frequency and time
information are needed.
Wavelets are functions defined over a finite interval
and having an average value of zero. The basic idea of the
wavelet transform is to represent any arbitrary function
x(t) as a superposition of a set of such wavelets or basis
functions. These basis functions are obtained from a single
prototype wavelet called the mother wavelet, by dilations
or contractions (scaling) and translations (shifts). The
Discrete Wavelet Transform of a finite length signal x(n)
having N components, for example, is expressed by an N × N
matrix. The generic form of the 1-D wavelet transform is
shown in Fig.3. Here a signal is passed through a low pass
filter h and a high pass filter g, then downsampled by a
factor of 2, constituting one level of the transform.
Multiple levels or scales of the wavelet transform are
obtained by repeating the filtering and decimation on the
low pass branch outputs only. The process is typically
carried out for a finite number of levels K, and the
resulting coefficients d_i^1(n), i ∈ {1,...,K}, together
with d_K^0(n), are called the wavelet coefficients.
Fig.3. Generic form of 1-D wavelet transforms
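The K-level structure of Fig.3 can be sketched as follows, again assuming Haar filters as a simple concrete choice; the low pass branch is iterated, and the high pass outputs are kept as the detail coefficients:

```python
import numpy as np

def dwt_1d(x, levels):
    """K-level 1-D DWT: filter, decimate by 2, recurse on the low pass branch."""
    a = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx = (a[0::2] + a[1::2]) / np.sqrt(2)   # low pass + decimation
        detail = (a[1::2] - a[0::2]) / np.sqrt(2)   # high pass + decimation
        details.append(detail)                      # level-i detail coefficients
        a = approx                                  # iterate on low pass only
    return a, details                               # final approximation + details

x = np.ones(8)                  # a constant signal has no detail at any scale
approx, details = dwt_1d(x, 3)
```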
The 1-D wavelet transform can be extended to a 2-D
wavelet transform using separable wavelet filters. With
separable filters the 2-D transform can be computed by
applying a 1-D transform to all the rows of input, and then
repeating on all of the columns. Fig.4 shows an example of
three-level (k=3) 2-D wavelet expansion, where k
represents the highest level of the decomposition of the
wavelet transform.
Fig.4 Three-level 2-D wavelet expansion
d. Embedded Zero tree Wavelet (EZW) Compression
In octave-band wavelet decomposition each
coefficient in the high-pass bands of the wavelet transform
has four coefficients corresponding to its spatial position in
the octave band above in frequency. This structure of the
decomposition can be exploited when encoding the
coefficients to achieve better compression results. Lewis and
Knowles [5] in 1992 were the first to introduce a tree-like
data structure to represent the coefficients of the octave
decomposition. Later, in 1993 Shapiro [2] called this
structure zero tree of wavelet coefficients, and presented
his elegant algorithm for entropy encoding called
Embedded Zero tree Wavelet (EZW) algorithm. The EZW
algorithm has the following features:
1. A discrete wavelet transform which provides a
compact multiresolution representation of the
image.
2. Zero tree coding which provides a compact
multiresolution representation of significance
maps, which indicates the position of significant
coefficients. Zero trees allow the successful
prediction of insignificant coefficients across
scales to be efficiently represented as a part of
growing trees.
3. Successive Approximation which provides a
compact multiprecision representation of the
significant coefficients and facilitates the
embedding algorithm.
4. Adaptive multilevel arithmetic coding which
provides a fast and efficient method for entropy
coding string of symbols, and requires no pre-
stored tables.
5. The algorithm runs sequentially and stops
whenever a target bit rate is met.
A significance map is defined as an indication of whether
a particular coefficient is zero or nonzero (i.e.,
significant) relative to a given quantization level. The
EZW algorithm [2] determined a very efficient way to
code significance maps not by coding the location of the
significant coefficients, but rather by coding the location of
the zeros. It was found experimentally that zeros could be
predicted very accurately across different scales in the
wavelet transform. Defining a wavelet coefficient as
insignificant with respect to a threshold T if |x | < T, the
EZW algorithm hypothesized that “if a wavelet coefficient
at a coarse scale is insignificant with respect to a given
threshold T, then all wavelet coefficients of the same
orientation in the same spatial location at finer scales are
likely to be insignificant with respect to T.” Recognizing
that coefficients of the same spatial location and frequency
orientation in the wavelet decomposition can be compactly
described using tree structures, the EZW called the set of
insignificant coefficients, or coefficients that are quantized
to zero using threshold T, zero-trees.
Fig.5 Tree structure of wavelet transform
Consider the tree structures on the wavelet transform
shown in Fig.5. In the wavelet decomposition, coefficients
that are spatially related across scale can be compactly
described using these tree structures. With the exception of
the low resolution approximation (LL1) and the highest
frequency bands (HL1, LH1, and HH1) each parent
coefficient at level i of the decomposition spatially
correlates to 4 (child) coefficients at level i−1 of the
decomposition which are at the same frequency
orientation. For the LLk band, each parent coefficient
spatially correlates with 3 child coefficients, one each in
the HLk, LHk, and HHk bands. The standard definitions of
ancestors and descendants in the tree follow directly from
these parent- child relationships. A coefficient is part of a
zero-tree if it is zero and if all of its descendants are zero
with respect to the threshold T. It is a zero-tree root if
it is not part of another zero-tree starting at a coarser scale.
Zero-trees are very efficient for coding since by declaring
only one coefficient a zero-tree root, a large number of
descendant coefficients are automatically known to be
zero. The compact representation, coupled with the fact
that zero-trees occur frequently, especially at low bit rates,
make zero-trees efficient for coding position information.
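The parent-child relationship can be sketched as a simple index mapping (the indexing convention is mine, assuming dyadic subband sizes): the coefficient at (r, c) in a level-i detail band has four children at the same orientation in level i−1.

```python
def children(r, c):
    """Positions of the 4 child coefficients, indexed within the child subband."""
    return [(2 * r, 2 * c), (2 * r, 2 * c + 1),
            (2 * r + 1, 2 * c), (2 * r + 1, 2 * c + 1)]

def descendants(r, c, levels):
    """All descendants of (r, c) across the given number of finer levels."""
    frontier, found = [(r, c)], []
    for _ in range(levels):
        frontier = [pos for p in frontier for pos in children(*p)]
        found.extend(frontier)
    return found
```

Declaring (r, c) a zero-tree root therefore marks 4 + 16 + ... descendant coefficients as zero in one symbol.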
EZW implements successive approximation
quantization through a multipass scanning of the wavelet
coefficients using successively decreasing thresholds
T0, T1, T2, .... The initial threshold is set to
T0 = 2^⌊log2 x_max⌋, where x_max is the largest wavelet
coefficient.
Each scan of wavelet coefficients is divided into two
passes: dominant and subordinate. The dominant pass
establishes a significance map of the coefficients relative
to the current threshold Ti. Thus, coefficients which are
significant on the first dominant pass are known to lie in
the interval [T0 ,2T0 ) , and can be represented with the
reconstruction value of (3T0/2). The dominant pass
essentially establishes the most significant bit of binary
representation of the wavelet coefficient, with the binary
weights being relative to the thresholds Ti.
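The threshold choice and the first dominant pass can be sketched as follows (the coefficient values are an illustrative example of mine):

```python
import numpy as np

def initial_threshold(coeffs):
    """T0 = 2^floor(log2 x_max), so every magnitude lies below 2*T0."""
    x_max = np.max(np.abs(coeffs))
    return 2.0 ** np.floor(np.log2(x_max))

def dominant_pass(coeffs, T):
    """Significance map for threshold T; significant coefficients are
    reconstructed at the interval midpoint 3T/2."""
    significant = np.abs(coeffs) >= T
    recon = np.where(significant, np.sign(coeffs) * 1.5 * T, 0.0)
    return significant, recon

coeffs = np.array([63.0, -34.0, 10.0, 7.0])
T0 = initial_threshold(coeffs)         # x_max = 63 -> T0 = 32
sig, recon = dominant_pass(coeffs, T0)
```

Coefficients found significant on this first pass are known to lie in [T0, 2T0) and get the reconstruction value 3T0/2, as described in the text.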
e. Set Partitioning in Hierarchical Trees (SPIHT)
Said and Pearlman [3] offered an alternative
explanation of the principles of operation of the EZW
algorithm to better understand the reasons for its excellent
performance. According to them, partial ordering by
magnitude of the transformed coefficients with a set
partitioning sorting algorithm, ordered bit plane
transmission of refinement bits, and exploitation of self-
similarity of the image wavelet transform across different
scales of an image are the three key concepts in EZW. In
addition, they offer a new and more effective
implementation of the modified EZW algorithm based on
set partitioning in hierarchical trees, and call it the SPIHT
algorithm. They also present a scheme for progressive
transmission of the coefficient values that incorporates the
concepts of ordering the coefficients by magnitude and
transmitting the most significant bits first. SPIHT uses a
uniform scalar quantizer, and the authors claim that the
ordering information makes this simple quantization method
more efficient than expected. An efficient way to code the
ordering information is also proposed. Results from the
SPIHT coding algorithm in most cases surpass those
obtained from the EZW algorithm.
f. Scalable Image Compression with EBCOT
This algorithm is based on independent Embedded
Block Coding with Optimized Truncation of the embedded
bit-streams (EBCOT). EBCOT algorithm [1] uses a
wavelet transform to generate the subband coefficients
which are then quantized and coded. Although the dyadic
wavelet decomposition is typical, other "packet"
decompositions are also supported and occasionally
preferable. Scalable compression refers to the generation
of a bit-stream which contains embedded subsets, each of
which represents an efficient compression of the original
image at a reduced resolution or increased distortion. A
key advantage of scalable compression is that the target
bit-rate or reconstruction resolution need not be known at
the time of compression. Another advantage of practical
significance is that the image need not be compressed
multiple times in order to achieve a target bit-rate, as is
common with the existing JPEG compression standard.
Rather than focusing on generating a single scalable bit-
stream to represent the entire image, EBCOT partitions
each subband into relatively small blocks of samples and
generates a separate highly scalable bit-stream to represent
each so-called code-block. The algorithm exhibits state-of-
the-art compression performance while producing a bit-
stream with an unprecedented feature set, including
resolution and SNR scalability together with a random
access property. The algorithm has modest complexity and
is extremely well suited to applications involving remote
browsing of large compressed images.
IV. PERFORMANCE COMPARISON
Fig.6 (a) PSNR results for LENA
Fig.6 (b) PSNR results for BARBARA
V. CONCLUSION
A number of coding techniques have been proposed
since the introduction of the EZW algorithm. A common
characteristic of these techniques is that they use the basic
ideas found in the EZW algorithm. The wavelet coders are
much closer to the EZW algorithm than to the subband
coding. SPIHT became very popular since it was able to
achieve equal or better performance than EZW without
having to use an arithmetic encoder. The reduction in
complexity from eliminating the arithmetic encoder is
significant. Another technique, EBCOT algorithm, has
been chosen as the basis of the JPEG 2000 standard. The
performance comparison of these techniques has been
discussed in the previous section. Comparison of EZW,
subband coding and the other techniques shows that, owing
to its multiresolution property and its performance, lossy
wavelet image coding has matured significantly and
provides a very strong basis for the new JPEG 2000 coding
standard.
VI. REFERENCES
[1] Taubman, D. „High Performance Scalable Image
Compression with EBCOT', IEEE Trans. IP, Mar. 1999.
[2] Shapiro, J. M. „Embedded Image Coding Using Zerotrees of
Wavelet Coefficients‟, IEEE Trans. SP, vol. 41, no. 12, Dec.
1993, pp. 3445-3462.
[3] Said, A. and Pearlman, W. A. „A New, Fast and Efficient
Image Codec Based on Set Partitioning in Hierarchical
Trees‟, IEEE Trans. CSVT, vol. 6, no. 3, June 1996, pp. 243-
250,
[4] Woods, J. W. and O'Neil, S. D. „Subband Coding of Images‟
IEEE Trans. ASSP, vol. 34, no. 5, October 1986, pp. 1278-
1288.
[5] Lewis, A. S. and Knowles, G. „Image Compression Using
the 2-D Wavelet Transform‟, IEEE Trans. IP, vol. 1, no. 2,
April 1992, pp. 244-250.
[6] Gonzalez, R.C. and Woods, R.E., Digital Image Processing,
2nd edition, Pearson Education, 2004, pp. 409 – 510.
[Fig.6(a) plots PSNR (dB, 0-45) versus bit rate (0.0625, 0.125, 0.25, 0.5 and 1 bpp) for Lena, with curves for SC, WT, EZW, SPIHT and EBCOT; Fig.6(b) plots PSNR (dB, 0-40) versus the same bit rates for Barbara, with curves for EZW, SPIHT and EBCOT.]
Comparative Study of Lifting-Based
Discrete Wavelet Transform Architectures
Vidyadhar Gupta, Krishna Raj
Department of Electronics Engineering
Harcourt Butler Technological Institute, Kanpur
Abstract. In this paper, we provide a comparative
study of different existing architectures for efficient
implementation of the lifting-based Discrete Wavelet
Transform (DWT). The basic principle behind the
lifting-based scheme is to decompose the finite
impulse response filters of the wavelet transform into
a finite sequence of simple filtering steps.
Keywords: Architecture, Discrete wavelet
transform, lifting
1. Introduction The Discrete Wavelet Transform
(DWT) has become a very versatile signal
processing tool over the last decade. It has been
used effectively in signal and image processing
applications ever since Mallat [4] proposed the
multiresolution representation of signals based on
wavelet decomposition. In fact, the lifting-based DWT
is the basis of the new JPEG2000 image
compression standard, which has been shown to
have superior performance compared to the current
JPEG standard [5]. The main feature of the lifting-
based DWT scheme is to break up the high-pass
and low-pass wavelet filters into a sequence of
upper and lower triangular matrices, and to convert
the filter implementation into banded matrix
multiplications [6]. The popularity of the lifting-based
DWT has triggered the development of several
architectures in recent years, ranging from highly
parallel architectures to programmable DSP-based
architectures to folded architectures. In this paper we
present a comparative study of these architectures:
we provide a systematic derivation of each and
compare them on the basis of hardware utilization
and critical path latency. The rest of the paper is
organized as follows. In Section 2, we briefly explain
the mathematical formulation and principles behind the
lifting scheme. In Section 3, we present a number of
one-dimensional lifting-based DWT architectures;
specifically, we describe the direct mapping of the
data dependency diagram of the lifting scheme into a
pipelined architecture, the folded architecture, the MAC-
based programmable architecture, and the flipping
architecture. In Section 4, we compare the
hardware and critical path latency of all the
architectures. We conclude the paper in Section 5.
2. DWT and Lifting Implementation
In the traditional convolution (filtering) based
approach to computing the forward DWT, the
input signal s is filtered separately by a low-pass
filter h̃ and a high-pass filter g̃. The two output
streams are then sub-sampled by simply dropping
the alternate output samples in each stream to
produce the low-pass (y_L) and high-pass (y_H) sub-
band outputs, as shown in Fig. 1.
The two filters (h̃, g̃) form the analysis filter bank.
The original signal can be reconstructed by a
synthesis filter bank (h, g) starting from y_L and y_H,
as shown in Fig. 1. Given a discrete signal s(n), the
output signals y_L(n) and y_H(n) in Fig. 1 can be
computed as:

y_L(n) = Σ_{i=0}^{τ_h − 1} h̃(i) s(2n − i),
y_H(n) = Σ_{i=0}^{τ_g − 1} g̃(i) s(2n − i)    (1)

where τ_h and τ_g are the lengths of the low-pass filter
h̃ and the high-pass filter g̃, respectively. During
the inverse transform computation, both y_L and y_H
are first up-sampled by inserting zeros in between
samples and then filtered by the low-pass (h) and
high-pass (g) filters, respectively. The results are then
added together to obtain the reconstructed signal
s′, as shown in Fig. 1.
Fig. 1. Signal analysis and reconstruction in 1D DWT — the input s is filtered and downsampled (↓2) into two sub-band streams, which are upsampled (↑2), filtered by the synthesis pair (h, g) and summed to give the reconstructed signal s′.
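As a concrete illustration of Eq. (1), the following sketch (hypothetical Python, with unnormalized Haar filters standing in for (h̃, g̃), simple border clamping, and the filter index offset by one so the first output aligns with the first pair of samples) filters the input and keeps every other output sample:

```python
# Convolution-based 1D DWT analysis: filter, then downsample by 2.
# Haar filters are used purely as a stand-in example analysis pair.
h_t = [0.5, 0.5]   # low-pass analysis filter (Haar-like, unnormalized)
g_t = [0.5, -0.5]  # high-pass analysis filter

def analyze(s):
    """Return (lowpass, highpass) sub-bands of signal s."""
    def filt_down(f):
        out = []
        for n in range(len(s) // 2):
            acc = 0.0
            for i, c in enumerate(f):
                idx = 2 * n + 1 - i            # phase-aligned index; clamp at borders
                idx = min(max(idx, 0), len(s) - 1)
                acc += c * s[idx]
            out.append(acc)
        return out
    return filt_down(h_t), filt_down(g_t)

lo, hi = analyze([4, 6, 10, 12])
# lo -> [5.0, 11.0] (pairwise averages), hi -> [1.0, 1.0] (half-differences)
```

The downsampling is done implicitly by advancing the filter window two input samples per output, rather than filtering everything and discarding alternate outputs.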
For multiresolution wavelet decomposition, the low-
pass sub-band y_L is further decomposed in a
similar fashion in order to get the second level of
decomposition, and the process is repeated. The
inverse process follows similar multi-level
synthesis filtering in order to reconstruct the signal.
Since two-dimensional wavelet filters are separable
functions, the 2D DWT can be obtained by first
applying the 1D DWT row-wise (to produce L and
H sub-bands in each row) and then column-wise.
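The row-then-column procedure can be sketched with a toy pairwise average/difference step standing in for the 1D DWT (hypothetical Python; a real codec would use the actual wavelet filters and handle odd dimensions):

```python
def haar_step(v):
    """One 1D analysis step: pairwise averages (L) followed by differences (H)."""
    a = [(v[i] + v[i + 1]) / 2 for i in range(0, len(v), 2)]
    d = [(v[i] - v[i + 1]) / 2 for i in range(0, len(v), 2)]
    return a + d

def dwt2d(img):
    """Separable 2D DWT: 1D pass along each row, then along each column."""
    rows = [haar_step(r) for r in img]        # row-wise pass: each row becomes L | H
    cols = list(zip(*rows))                   # transpose to reach the columns
    out = [haar_step(list(c)) for c in cols]  # column-wise pass
    return [list(r) for r in zip(*out)]       # transpose back: LL HL / LH HH layout
```

On a 4×4 image of constant 2×2 blocks, all detail sub-bands come out zero and the LL quadrant holds the block values, which is an easy sanity check of the separability argument.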
For the filter bank in Fig. 1, the conditions for perfect
reconstruction of a signal [3] are given by

h(z) h̃(z⁻¹) + g(z) g̃(z⁻¹) = 2
                                        (2)
h(z) h̃(−z⁻¹) + g(z) g̃(−z⁻¹) = 0

where h(z) is the Z-transform of the FIR filter h. It
can be expressed as a Laurent polynomial of degree
p as

h(z) = Σ_{i=0}^{p} h_i z⁻ⁱ

This can also be expressed using a polyphase
representation as

h(z) = h_e(z²) + z⁻¹ h_o(z²)    (3)

where h_e contains the even coefficients and h_o
contains the odd coefficients of the FIR filter h.
Similarly,

g(z) = g_e(z²) + z⁻¹ g_o(z²),
h̃(z) = h̃_e(z²) + z⁻¹ h̃_o(z²),    (4)
g̃(z) = g̃_e(z²) + z⁻¹ g̃_o(z²)

Based on the above formulation, we can define the
polyphase matrices (rows separated by semicolons) as

P(z) = [ h_e(z)  g_e(z) ;  h_o(z)  g_o(z) ],
                                        (5)
P̃(z) = [ h̃_e(z)  g̃_e(z) ;  h̃_o(z)  g̃_o(z) ]

Often P̃(z) is called the dual of P(z), and for
perfect reconstruction they are related as
P(z) P̃(z⁻¹)ᵀ = I, where I is the 2×2 identity matrix.
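The polyphase split in Eq. (3) is easy to sanity-check numerically; the snippet below (hypothetical Python, with an arbitrary example filter) verifies h(z) = h_e(z²) + z⁻¹ h_o(z²) at an arbitrary evaluation point:

```python
# Numeric check of the polyphase decomposition for an example filter
# (the coefficients here are arbitrary and purely illustrative).
h = [0.25, 0.5, 0.75, 0.5]      # h(z) = 0.25 + 0.5 z^-1 + 0.75 z^-2 + 0.5 z^-3
h_e, h_o = h[0::2], h[1::2]     # even / odd coefficient subsequences

def evaluate(coeffs, z):
    """Evaluate a causal Laurent polynomial sum_i c_i z^-i at a point z."""
    return sum(c * z ** (-i) for i, c in enumerate(coeffs))

z = 1.7   # any nonzero test point
lhs = evaluate(h, z)
rhs = evaluate(h_e, z ** 2) + z ** -1 * evaluate(h_o, z ** 2)
assert abs(lhs - rhs) < 1e-9    # both sides agree up to rounding
```

The even/odd slicing `h[0::2]` / `h[1::2]` is exactly the "lazy wavelet" split that the lifting scheme applies to the sample stream itself.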
Now the wavelet transform in terms of the
polyphase matrices can be expressed as

[ y_L(z) ;  y_H(z) ] = P̃(z⁻¹)ᵀ [ s_e(z) ;  z⁻¹ s_o(z) ]
[ s_e(z) ;  z⁻¹ s_o(z) ] = P(z) [ y_L(z) ;  y_H(z) ]

for the forward DWT and inverse DWT,
respectively. If the determinant of P(z) is unity, it
can be shown by applying Cramer's rule that

h̃_e(z⁻¹) = g_o(z),   h̃_o(z⁻¹) = −g_e(z),
g̃_e(z⁻¹) = −h_o(z),  g̃_o(z⁻¹) = h_e(z),

and hence

h̃(z) = −z⁻¹ g(−z⁻¹),   g̃(z) = z⁻¹ h(−z⁻¹).

When the determinant of P(z) is unity, the
synthesis filter pair (h, g) and the analysis filter
pair (h̃, g̃) are both complementary. When (h̃, g̃)
= (h, g), the wavelet transform is called orthogonal;
otherwise it is biorthogonal. We can apply the
Euclidean algorithm to factorize P̃(z) into a finite
sequence of alternating upper and lower triangular
matrices as follows:

P̃(z) = { ∏_{i=1}^{m} [ 1  s̃_i(z) ;  0  1 ] [ 1  0 ;  t̃_i(z)  1 ] } [ K  0 ;  0  1/K ]

where K is a constant that acts as a scaling factor (as
does 1/K), and s̃_i(z) and t̃_i(z) (for 1 ≤ i ≤ m) are Laurent
polynomials of lower order. Multiplication by the
upper triangular matrix is known as primal lifting,
and this is referred to in the literature as lifting the
low-pass sub-band with the help of the high-pass
sub-band. Similarly, multiplication by the lower
triangular matrix is called dual lifting, which is
lifting the high-pass sub-band with the help of the
low-pass sub-band. Often these two basic lifting
steps are called update and predict, respectively. The
dual polyphase factorization, which also consists of
predict and update steps, can be represented in the
same form:

P(z) = { ∏_{i=1}^{m} [ 1  s_i(z) ;  0  1 ] [ 1  0 ;  t_i(z)  1 ] } [ K  0 ;  0  1/K ]
Hence the lifting-based forward wavelet transform
essentially first applies the lazy wavelet on the
input stream (splitting it into even and odd samples), then
alternately executes primal and dual lifting steps,
and finally scales the two output streams by 1/K and K,
respectively, to produce the high-pass and low-pass sub-
bands, as shown in Fig. 2(a).
Fig. 2. Lifting-based DWT and IDWT: (a) forward transform — split, predict/update lifting steps, and scaling by K and 1/K; (b) inverse transform — inverse scaling, inverse lifting steps, and merge.
The inverse DWT can be derived by traversing the
above steps in the reverse direction: first scaling the
low-pass and high-pass sub-band inputs by 1/K and
K respectively, then applying the dual and
primal lifting steps after reversing the signs of the
coefficients in s_i(z) and t_i(z), and finally applying the
inverse lazy transform by up-sampling the outputs before
merging them into a single reconstructed stream, as
shown in Fig. 2(b).
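The forward/inverse procedure just described can be made concrete for the two-step (5, 3) filter. The sketch below is a hypothetical Python illustration using the well-known (5, 3) predict/update coefficients (−1/2 and 1/4), with simple border clamping in place of true symmetric extension and an even-length input assumed:

```python
def dwt53_forward(x):
    """One level of the (5, 3) lifting DWT: split, predict (dual), update (primal)."""
    even, odd = x[0::2], x[1::2]          # lazy wavelet: split the stream
    n = len(odd)
    # Predict: high-pass = odd sample minus the average of its even neighbours
    d = [odd[i] - 0.5 * (even[i] + even[min(i + 1, len(even) - 1)])
         for i in range(n)]
    # Update: low-pass = even sample plus a quarter of the neighbouring details
    a = [even[i] + 0.25 * (d[max(i - 1, 0)] + d[min(i, n - 1)])
         for i in range(len(even))]
    return a, d

def dwt53_inverse(a, d):
    """Inverse transform: undo update, undo predict, merge (even-length input)."""
    n = len(d)
    even = [a[i] - 0.25 * (d[max(i - 1, 0)] + d[min(i, n - 1)])
            for i in range(len(a))]
    odd = [d[i] + 0.5 * (even[i] + even[min(i + 1, len(even) - 1)])
           for i in range(n)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
a, d = dwt53_forward(x)
assert dwt53_inverse(a, d) == x   # perfect reconstruction
```

Because each lifting step only adds a function of the other channel, the inverse simply subtracts the same quantity, so reconstruction is exact even with this crude border rule.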
3. Lifting Architectures for 1D DWT
The data dependencies in the lifting scheme can be
explained with the help of an example of DWT
filtering with four factors (four lifting steps).
The four lifting steps correspond to the four stages
shown in Fig. 3. The intermediate results generated
in the first two stages for the first two lifting steps
are subsequently processed to produce the high-
pass (HP) outputs in the third stage, followed by
the low-pass (LP) outputs in the fourth stage. The (9, 7)
filter is an example of a filter that requires four
lifting steps. For DWT filters requiring only
two factors, such as the (5, 3) filter, the
intermediate two stages can simply be bypassed.
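To make the four stages concrete, here is a hypothetical Python sketch of the four lifting steps using the widely published CDF 9/7 coefficients (α, β, γ, δ and the scaling constant K). Border handling is simple clamping rather than the symmetric extension a real codec would use, and the K / 1/K scaling convention is one common choice:

```python
ALPHA, BETA = -1.586134342, -0.052980119   # stage-1 and stage-2 coefficients
GAMMA, DELTA = 0.882911076, 0.443506852    # stage-3 and stage-4 coefficients
K = 1.149604398                            # scaling constant

def _predict(ev, od, c):
    # odd[i] += c * (even[i] + even[i+1]); border clamped
    m = len(ev)
    return [od[i] + c * (ev[i] + ev[min(i + 1, m - 1)]) for i in range(len(od))]

def _update(ev, od, c):
    # even[i] += c * (odd[i-1] + odd[i]); border clamped
    n = len(od)
    return [ev[i] + c * (od[max(i - 1, 0)] + od[min(i, n - 1)]) for i in range(len(ev))]

def dwt97_forward(x):
    ev, od = x[0::2], x[1::2]
    od = _predict(ev, od, ALPHA)   # stage 1: first dual lift
    ev = _update(ev, od, BETA)     # stage 2: first primal lift
    od = _predict(ev, od, GAMMA)   # stage 3: HP samples ready
    ev = _update(ev, od, DELTA)    # stage 4: LP samples ready
    return [e * K for e in ev], [o / K for o in od]

def dwt97_inverse(lp, hp):
    ev, od = [e / K for e in lp], [o * K for o in hp]
    ev = _update(ev, od, -DELTA)   # undo the stages in reverse order
    od = _predict(ev, od, -GAMMA)
    ev = _update(ev, od, -BETA)
    od = _predict(ev, od, -ALPHA)
    out = []
    for e, o in zip(ev, od):
        out.extend([e, o])
    return out
```

Since every stage adds a function of the opposite channel, the inverse recovers the input exactly (up to floating-point rounding) even with this simplified border rule; only the four stage computations matter for the data dependency diagram of Fig. 3.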
3.1 Direct Mapped Architecture
A direct mapping of the data dependency diagram
into a pipelined architecture was proposed by Liu
et al. in [7] and is depicted in Fig. 4. The architecture
is designed with 8 adders (A1-A8), 4 multipliers
(M1-M4), 6 delay elements (D) and 8 pipeline
registers (R). There are two input lines to the
architecture: one that inputs even samples (s2i) and
one that inputs odd samples (s2i+1). There
are four pipeline stages in the architecture. In the
first pipeline stage, adder A1 computes s2i + s2i-2 and
adder A2 computes α(s2i + s2i-2) + s2i-1. The output of
A2 corresponds to the intermediate results
generated in the first stage of Fig. 3. The output of
adder A4 in the second pipeline stage corresponds
to the intermediate results generated in the second
stage of Fig. 3. Continuing in this fashion, adder A6
in the third pipeline stage produces the high-pass
output samples, and adder A8 in the fourth pipeline
stage produces the low-pass output samples. For
lifting schemes that require only two lifting steps,
such as the (5, 3) filter, the last two pipeline stages
need to be bypassed, causing the hardware
utilization to drop to 50% or less. Also, with a
single read-port memory, the odd and even samples
are read serially in alternate clock cycles and
buffered, which slows down the overall pipelined
architecture by 50% as well.
3.2 Folded Architecture
The pipelined architecture in Fig. 4 can be further
improved by carefully folding the last two pipeline
stages into the first two, as shown in Fig. 5.
The architecture proposed by Lian et al. in [2]
consists of two pipeline stages, with three pipeline
registers, R1, R2 and R3. In the (9, 7) type filtering
operation, intermediate data (R3) generated after
the first two lifting steps (Phase 1) are folded back
to R1 (as shown in Fig. 5) for computation of the
last two lifting steps (Phase 2). The architecture can
be reconfigured so that the computation of the two phases
can be interleaved by selection of appropriate data
by the multiplexors. As a result, two delay registers
(D) are needed in each lifting step in order to
properly schedule the data in each phase. Depending on
the phase of the interleaved computation, the
coefficient for multiplier M1 is either α or γ, and
similarly the coefficient for multiplier M2 is β or δ.
The hardware utilization of this architecture is
always 100%. Note that for the (5, 3) type filter
operation, folding is not required.
3.3 MAC-Based Programmable Architecture [3]
A programmable architecture that implements the
data dependencies represented in Fig. 3 using four
MACs (Multiply and ACcumulate units) and nine
registers has been proposed by Chang et al. in [3].
The algorithm is executed in two phases, as shown
in Fig. 6. The data-flow of the proposed architecture
can be explained in terms of the register allocation
of the nodes. The computation and allocation of the
registers in Phase 1 are done in the following order:

R0 ← s2i-1;  R2 ← s2i;
R3 ← R0 + α(R1 + R2);
R4 ← R1 + β(R5 + R3);
R8 ← R5 + γ(R6 + R4);
Output LP ← R6 + δ(R7 + R8);
Output HP ← R8

Similarly, the computation and register allocation
in Phase 2 are done in the following order:

R0 ← s2i+1;  R1 ← s2i+2;
R5 ← R0 + α(R2 + R1);
R6 ← R2 + β(R3 + R5);
Output LP ← R4 + γ(R8 + R7);
Output HP ← R7

As a result, two samples are input per phase and
two samples (LP and HP) are output at the end of
every phase. For a 2D DWT implementation, the
output samples are also stored in a temporary
buffer for filtering in the vertical dimension.
3.4 Flipping Architecture [1]
While conventional lifting-based architectures
require fewer arithmetic operations, they
sometimes have long critical paths. For instance,
the critical path of the lifting-based architecture for
the (9, 7) filter is 4Tm + 8Ta while that of the
convolution implementation is Tm + 4Ta.
[Fig. 3: Data dependency diagram for lifting of filters with four factors — samples s0…s8 pass through four stages weighted by α, β, γ and δ, producing the HP and LP outputs with scaling by 1/K and K.]
[Fig. 4: The direct mapped architecture [7] — adders A1-A8, multipliers M1-M4, delay elements D and pipeline registers R1-R4 arranged in four pipeline stages, with even (s2i) and odd (s2i+1) input lines and HP/LP outputs.]
[Fig. 5: The folded architecture in [2] — two pipeline stages with registers R1-R3; the multiplier coefficients α/γ and β/δ are selected according to the phase, and the outputs are scaled by K and 1/K.]
One way of improving this is pipelining, which
however results in a significant increase in the number of
registers. For instance, to pipeline the lifting-based
(9, 7) filter such that the critical path is Tm + 2Ta, 6
additional registers are required. Huang et al. [1]
proposed a very efficient way of solving this timing
accumulation problem. The basic idea is to remove
the multiplications along the critical path by scaling
the remaining paths by the inverse of the multiplier
coefficients. Fig. 7(a)-(b) describes how scaling at
each level can reduce the multiplications in the
critical path. The critical path is then Tm + 5Ta.
The minimum critical path of Tm can be achieved
with 5 pipelining stages using 11 pipelining registers
(not shown in the figure). Detailed hardware
analysis of the lossy (9, 7), integer (9, 7) and (6, 10)
filters has been included in [1]. Furthermore,
since the flipping transformation changes the
round-off noise considerably, techniques to address
precision and noise problems have also been
addressed in [1].
[Fig. 6: Data-flow and register allocation of the MAC-based architecture — inputs cycle through R0, R1 and R2; first- and second-stage intermediate results occupy R3-R6; the HP and LP outputs are produced from R7 and R8 with scaling by 1/K and K, alternating between Phase 1 and Phase 2.]
[Fig. 7: A flipping architecture [1]: (a) the original lifting architecture with coefficients α, β, γ, δ and output scaling 1/K, K; (b) the coefficients flipped to 1/α, 1/β, 1/γ, 1/δ to reduce the number of multiplications on the critical path.]
3.5 Efficient Folded Architecture [8]
The conventional lifting scheme processes the
intermediate data serially; thus, the critical path
latency is very long. The way the intermediate
data are processed determines the hardware scale
and critical path latency of the implementing
architecture. Since some intermediate data lie on
different paths, they can be calculated in parallel.
With this parallel operation, the critical path latency
is reduced and the number of registers is decreased;
the architecture is therefore called the efficient
folded architecture. The critical path latency is
reduced to Tm + Ta.
4. Comparison of Performances
We can compare the performances of the different
architectures on the basis of hardware requirements
and critical path latency. The hardware complexity
is described in terms of data path components. The
comparison of the different architectures is shown
in Table I.
Table I [8] (for 9/7 lifting-based DWT)

Architecture                 | Multipliers | Adders | Registers | Critical path latency | Control complexity | Throughput rate (per cycle)
Direct                       | 4           | 8      | 6         | 4Tm + 8Ta             | Simple             | 2 inputs/outputs
Direct + full pipeline       | 4           | 8      | 32        | Tm                    | Simple             | 2 inputs/outputs
Folded                       | 2           | 4      | 12        | Tm + 2Ta              | Medium             | 1 input/output
Flipping                     | 4           | 8      | 4         | Tm + 5Ta              | Complex            | 2 inputs/outputs
Flipping + 5-stage pipeline  | 4           | 8      | 11        | Tm                    | Complex            | 2 inputs/outputs
Efficient folded             | 2           | 4      | 10        | Tm + Ta               | Medium             | 1 input/output

(Tm denotes the latency of a multiplier; Ta denotes the latency of an adder.)
5. Conclusion
In this paper, we presented a comparison of the
existing lifting-based implementations of the one-
dimensional Discrete Wavelet Transform. We
briefly described the principles behind the lifting
scheme in order to better understand the different
implementation styles and structures. We provided a
systematic derivation of each architecture and
evaluated them with respect to their hardware and
timing requirements.
References
[1] C.T. Huang, P.C. Tseng, and L.G. Chen,
"Flipping Structure: An Efficient VLSI
Architecture for Lifting-Based Discrete Wavelet
Transform," IEEE Transactions on Signal
Processing, 2004, pp. 1080-1089.
[2] C.J. Lian, K.F. Chen, H.H. Chen, and L.G.
Chen, "Lifting Based Discrete Wavelet Transform
Architecture for JPEG2000," in IEEE International
Symposium on Circuits and Systems, Sydney,
Australia, 2001, pp. 445-448.
[3] W.H. Chang, Y.S. Lee, W.S. Peng, and C.Y.
Lee, "A Line-Based, Memory Efficient and
Programmable Architecture for 2D DWT Using
Lifting Scheme," in IEEE International Symposium
on Circuits and Systems, Sydney, Australia, 2001,
pp. 330-333.
[4] S. Mallat, "A Theory for Multiresolution Signal
Decomposition: The Wavelet Representation,"
IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 11, no. 7, 1989, pp. 674-693.
[5] T. Acharya and P.S. Tsai, JPEG2000 Standard
for Image Compression: Concepts, Algorithms and
VLSI Architectures. John Wiley & Sons, Hoboken,
New Jersey, 2004.
[6] I. Daubechies and W. Sweldens, "Factoring
Wavelet Transforms into Lifting Schemes," The J.
of Fourier Analysis and Applications, vol. 4, 1998,
pp. 247-269.
[7] C.C. Liu, Y.H. Shiau, and J.M. Jou, "Design and
Implementation of a Progressive Image Coding
Chip Based on the Lifted Wavelet Transform," in
Proc. of the 11th VLSI Design/CAD Symposium,
Taiwan, 2000.
[8] Weifeng Liu, Li Zhang, and Fu Li, "An Efficient
Folded Architecture for Lifting-Based Discrete
Wavelet Transform," IEEE Transactions on
Circuits and Systems—II: Express Briefs, vol. 56,
no. 4, April 2009.
[10] Xiaonan Fan, Zhiyong Pang, Dihu Chen, and H.Z.
Tan, "A Pipeline Architecture for 2-D Lifting-
based Discrete Wavelet Transform of JPEG2000,"
IEEE, 2010.
A Novel Approach in Image De-noising for Salt
& Pepper Noise
J S Bhat¹, B N Jagadale² and Lakshminarayan H K²
¹ Dept. of Physics, Karnatak University, Dharwad, India. Email: [email protected]
² Dept. of Electronics, Kuvempu University, Shankaragatta, India. Email: [email protected]; [email protected]
Abstract-The de-noising of an image corrupted by
salt and pepper noise has been a classical problem in
image processing. In the last decade, various
modified median filtering schemes have been
developed, under various signal/noise models, to
deliver improved performance over traditional
methods. In this paper a simple method called the
Interpolate Median Filter (IMF) is proposed to
restore images corrupted by salt and pepper
noise. The proposed method is better at preserving
image details while suppressing noise. The
experimental results show that the proposed
algorithm outperforms the conventional median
filter and other algorithms such as the minimum-
maximum exclusive mean filter (MMEM) and
adaptive median filtering (AMF) in terms of signal-
to-noise ratio.
Key words- Image de-noising, interpolate median
filter, nonlinear filter, salt & pepper noise
I. INTRODUCTION
An image is often corrupted by noise
during its acquisition and transmission.
Image de-noising is used to reduce the noise
while retaining the important features of the
image. There is always a tradeoff between
the noise removed and the blurring
introduced in the image. The intensity of
impulse noise tends to be either relatively
high or relatively low, which degrades
the image quality. Image de-noising is
therefore used as a preprocessing step for
edge detection, image segmentation, object
recognition, etc.
A variety of filtering techniques have been
proposed for enhancing images degraded by
noise. Classical linear digital image
filters, such as averaging lowpass filters,
tend to blur edges and other fine image
details. Nonlinear filters [1, 2] are
therefore preferred over linear filters due to their
improved filtering performance in terms of
noise suppression and edge preservation.
The standard median (SM) filter [3] is
one of the most robust nonlinear filters;
it exploits the rank-order information of
pixel intensities within the filtering window.
This filter is very popular due to its edge-
preserving characteristics and its simplicity
of implementation. Various modifications of
the SM filter have been introduced, such as
the weighted median (WM) filter [4]. By
incorporating a noise detection mechanism
into the conventional median filtering
approach, filters like the switching median
filters [5, 6] have shown significant
performance improvement. The median
filter, as well as its modifications and
generalizations [7], is typically applied
uniformly across an image. Examples
include the minimum-maximum exclusive
mean filter (MMEM) [8], Florencio's filter [9],
and the adaptive median filter (AMF) [10].
These filters have demonstrated excellent
performance, but their main drawbacks are
that they are prone to edge jitter when the
noise density is high, that a large window
size results in blurred images, and that their
computational complexity is significant. To
solve these problems, a modified median
filter algorithm called the Interpolate Median
filter, which employs interpolated search in
determining the desired central pixel value,
is proposed.
The paper is organized as follows: Section
II gives a brief review of mean and median
filtering. The new approach, the Interpolate
Median filter technique, is explained in
Section III. Experimental results are
presented in Section IV. Finally, Section V
gives the conclusion.
II. MEAN & MEDIAN FILTERING
MEAN FILTER
Mean filtering is a simple, easy to
implement method of smoothing images, i.e.
of reducing the amount of intensity variation
between one pixel and the next. It is often
used to reduce noise in images.
The idea of mean filtering is simply to
replace each pixel value in an image with
the mean ('average') value of its neighbors,
including itself, thereby eliminating pixel
values which are unrepresentative of
their surroundings. The drawback of this
algorithm appears with salt and pepper
noise: when the image is smoothed with a 3×3
mean filter, the shot-noise pixel values,
which are often very different from the
surrounding values, significantly distort the
pixel average calculated by the mean filter.
MEDIAN FILTER
The median filter is normally used to
reduce noise in an image, like the mean
filter; however, it does better at preserving
useful details in the image. Like the mean
filter, the median filter considers each pixel
in the image, but instead of simply replacing
the pixel value with the mean of the neighboring
pixel values, it replaces it with their median.
The median is calculated by
first sorting all the pixel values from the
surrounding neighborhood into numerical
order and then replacing the pixel being
considered with the middle pixel value. The
median filter, especially with a larger window
size, destroys fine image details due to its
rank-ordering process. Figure 1 illustrates an
example calculation.

Neighborhood values: 115, 119, 120, 123,
124, 125, 126, 127, 150
Median value: 124

Fig. 1. Calculating the median value of a 3×3 pixel
neighborhood. The central pixel value of 150 is rather
unrepresentative of the surrounding pixels and is
replaced with the median value, 124.
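The mean/median contrast above can be checked directly; this hypothetical Python snippet applies both operations to the 3×3 neighborhood of Fig. 1:

```python
# The 3x3 neighborhood from Fig. 1, centred on the impulse value 150
window = [115, 119, 120, 123, 124, 125, 126, 127, 150]

mean_val = sum(window) / len(window)           # the impulse drags the mean upward
median_val = sorted(window)[len(window) // 2]  # rank ordering ignores the impulse

print(round(mean_val, 2))   # about 125.44
print(median_val)           # 124
```

The single outlier shifts the mean by more than a full gray level, while the median output is unaffected, which is exactly the edge- and detail-preserving behavior the section describes.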
III. INTERPOLATE MEDIAN FILTER
The Interpolate Median filter method
considers each pixel in the image in turn and
looks at its neighbors to decide whether or
not it is representative of its surroundings.
Instead of replacing the pixel value with the
median of the neighboring pixel values, it
replaces it with an interpolation of those
values.
The interpolation is calculated by first
sorting all pixel values from the surrounding
neighborhood into numerical order and then
replacing the pixel being considered with
the interpolated pixel value. The calculation
of the interpolated value is derived from the
[Fig. 1 source data — the 5×5 image patch whose central 3×3 window is used in the example:
110 125 125 130 140
123 124 126 127 136
114 120 150 125 134
118 115 119 123 134
111 116 111 120 131]
interpolation search technique used for
searching for elements in a sorted array. We
can also call it a nonlinear filter or order-
statistic filter, because its response is based
on the ordering (ranking) of the pixels
contained within the mask. The advantage of
this filter over the mean and median filters
is that it gives a more robust average than
both methods: for some pixels in the
neighborhood it creates new pixel values,
like the mean filter, and for others it does
not, like the median filter; it has the
characteristics of both filters.
The algorithm uses the following formula:

Key = (a[l] + a[h]) / 2        (1)

where Key is an intelligent guess at the mid
value of the sorted window array a, and a[l]
and a[h] are the values of its bottom and top
elements.

Mid = l + (h − l) × (Key − a[l]) / (a[h] − a[l])        (2)

Here Mid gives the optimal mid-point of the
array, and a[Mid] gives the interpolated
value. This interpolated value becomes the
new value of the pixel.
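Equations (1) and (2) can be combined into a small routine. The sketch below (hypothetical Python; the guard against a constant window, where a[h] = a[l], is an added assumption to avoid division by zero) returns the interpolated replacement value for one filter window:

```python
def interpolate_median(window):
    """Interpolated replacement value for a filter window, per Eqs. (1)-(2)."""
    a = sorted(window)
    l, h = 0, len(a) - 1
    if a[h] == a[l]:
        return a[l]                       # constant window: nothing to interpolate
    key = (a[l] + a[h]) / 2.0             # Eq. (1): guessed mid value
    mid = int(l + (h - l) * (key - a[l]) / (a[h] - a[l]))   # Eq. (2)
    return a[mid]

# On the Fig. 1 neighborhood the interpolated value coincides with the median:
print(interpolate_median([115, 119, 120, 123, 124, 125, 126, 127, 150]))  # 124
```

Because the probe position depends on the actual spread of values rather than only on their ranks, the returned pixel can differ from the plain median when the window is skewed by impulses.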
IV. EXPERIMENTAL RESULTS
To validate the proposed method,
experiments are conducted on natural
grayscale test images such as Lena, Barbara
and Goldhill, of size 512×512, at different
noise levels. Table 1 lists the PSNRs of the
six de-noising methods. The peak signal-to-
noise ratio (PSNR), in decibels (dB), is
defined as

PSNR = 10 log₁₀ (255² / MSE) dB        (3)

with

MSE = (1/mn) Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} [I(i, j) − K(i, j)]²        (4)

where I and K are the original image and the
de-noised image, respectively. Figure 2
shows the original test images used for the
experiments, and Figure 3 shows the Lena
image corrupted by 20% salt and pepper
noise.
Fig. 2. The original test images, 512×512 pixels: (a) Lena; (b) Barbara; (c) Goldhill.
Fig. 3. Lena image corrupted by salt & pepper noise (20%).
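Equations (3)-(4) translate directly into code. This hypothetical Python helper computes the PSNR between an original image I and a de-noised image K, both given as equally-sized 2-D lists of 8-bit values:

```python
import math

def psnr(I, K):
    """PSNR in dB between two equally-sized 8-bit grayscale images, Eqs. (3)-(4)."""
    m, n = len(I), len(I[0])
    mse = sum((I[i][j] - K[i][j]) ** 2
              for i in range(m) for j in range(n)) / (m * n)
    return float('inf') if mse == 0 else 10.0 * math.log10(255.0 ** 2 / mse)

# A single off-by-one pixel in a 2x2 image gives MSE = 0.25:
print(round(psnr([[10, 10], [10, 10]], [[10, 10], [10, 11]]), 2))  # about 54.15
```

Identical images give an MSE of zero, which is returned here as infinite PSNR rather than raising a division error.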
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0407-4
Table 1. PSNR (dB) performance of different algorithms for the Lena image corrupted with salt and pepper noise

Algorithm        | 10%   | 20%   | 30%
MF (3×3)         | 31.19 | 28.48 | 25.45
MF (5×5)         | 29.45 | 28.91 | 28.43
MMEM [8]         | 30.28 | 29.63 | 29.05
Florencio's [9]  | 33.69 | 32.20 | 30.95
AMF (5×5) [10]   | 30.11 | 28.72 | 27.84
IMF (proposed)   | 33.86 | 30.59 | 25.75
V. CONCLUSION
In this paper, the proposed algorithm,
called the Interpolate Median filter, employs
interpolated search in determining the
desired central pixel value. Interpolate
median filtering is simple and easy to
implement for image de-noising. The
simulation results show that the proposed
method performs significantly better than
many other existing methods.
REFERENCES
[1] R. Boyle and R. Thomas, Computer Vision: A
First Course, Blackwell Scientific Publications,
1988, pp. 32-34.
[2] E. Davies, Machine Vision: Theory, Algorithms
and Practicalities, Academic Press, 1990, Chap. 3.
[3] I. Pitas and A. N. Venetsanopoulos, "Order
statistics in digital image processing," Proc.
IEEE, vol. 80, no. 12, pp. 1893-1921, Dec. 1992.
[4] D. R. K. Brownrigg, "The weighted median
filter," Commun. ACM, vol. 27, no. 8, pp. 807-
818, Aug. 1984.
[5] H. Hwang and R. A. Haddad, "Adaptive
median filters: New algorithms and results," IEEE
Trans. Image Process., vol. 4, no. 4, pp. 499-502,
Apr. 1995.
[6] A. Bovik, Handbook of Image & Video
Processing, 1st Ed. New York: Academic, 2000.
[7] http://homepages.inf.ed.ac.uk
[8] W. Y. Han and J. C. Lin, "Minimum-maximum
exclusive mean (MMEM) filter to remove impulse
noise from highly corrupted images," Electron.
Lett., vol. 33, no. 2, pp. 124-125, 1997.
[9] T. Sun and Y. Neuvo, "Detail-preserving
median based filters in image processing," Pattern
Recognit. Lett., vol. 15, no. 4, pp. 341-347, Apr.
1994.
[10] A. Sawant, H. Zeman, D. Muratore, S. Samant,
and F. DiBianka, "An adaptive median filter
algorithm to remove impulse noise in X-ray and
CT images and speckle in ultrasound images,"
Proc. SPIE, vol. 3661, pp. 1263-1274, Feb. 1999.
Content Based Image Retrieval System for Medical
Images
Prof. K. Narayanan¹, Shaista Khan²
¹ Asst. Professor, Fr. Agnel College of Engg., University of Mumbai, India, [email protected]
² Fr. Agnel College of Engg., University of Mumbai, India, [email protected]
Abstract: The rapid development of technologies and the
steadily growing amounts of digital information highlight the
need for systems to access that information. Content-based
image indexing and retrieval has been an emerging research
area over the last few decades. This project approaches content
based image retrieval using low level features such as color,
shape and texture to investigate samples of blood cells through
their images, to aid in diagnosing disease by identifying similar
cases in a medical database. Medical images are classified in
terms of diseases, and for a given query image the relevant
images are retrieved along with the classification of the disease.
The histograms of the red, green, and blue color components
are analyzed. Wavelet decomposition is used to analyze
texture. In addition, morphological operations such as opening
and closing are applied to analyze object shape. Lastly, color,
texture, and shape are integrated in the retrieval in order to
increase the retrieval accuracy.
Keywords: Text Based Image Retrieval (TBIR), Content Based
Image Retrieval (CBIR)
I. INTRODUCTION
In today's world, knowledge is increasingly equated with
information, and information with data. In addition, the
rapid development of digital technologies and computing
hardware has made the digital acquisition of information
ever more popular and in demand.
Consequently, many digital images are being captured and
stored, such as medical images, architectural and engineering
images, advertising, design and fashion images, etc., and as a
result large image databases are being created and used in
many applications. The focus of our study in this work,
however, is on medical images. A large number of medical
images in digital format are generated by hospitals and
medical institutions every day, so how to make effective use
of this huge number of images becomes a challenging
problem.
The most common approach previously used for image
retrieval from a database was Text Based Image Retrieval
(TBIR). Later, retrieval based on image content, known as
Content Based Image Retrieval (CBIR), was introduced. In
TBIR, all medical images are labeled with text; the labels are
man-made and may differ between individuals for similar
images. Another drawback of TBIR is that images,
and medical images in particular, are difficult to describe
with text. These drawbacks of TBIR can be overcome by
CBIR.
In CBIR, features are extracted from the images using
different methods. The features include color, texture and
shape. The color histogram is the main method of representing
the color information of an image. A method called the
pyramid-structured wavelet transform is used for texture
classification. The number of oval objects in the query image
is calculated using a simple metric, and the images are
compared with one another based on the extracted features.
These three features are integrated into one method to improve
the retrieval efficiency; images which have similar features
should have similar content as well. The focus of this project
is on medical diagnosis, in which CBIR can be used to detect
disease by identifying similar cases in a medical database.
II. PROPOSED METHOD
Content-based Image Retrieval (CBIR) consists of
retrieving the most visually similar images to a given query
image from a database of images. CBIR from medical image
databases does not aim to replace the physician by predicting
the disease of a particular case but to assist him/her in
diagnosis. The visual characteristics of a disease carry
diagnostic information and oftentimes visually similar images
correspond to the same disease category. By consulting the
output of a CBIR system, the physician can gain more
confidence in his/her decision or even consider other
possibilities.
However, due to the existence of a large number of
medical image acquisition devices, medical images are
distinct and require a specific design of CBIR systems. The
goals of medical information systems have been defined as delivering the needed information at the right time and place to the right person, in order to improve the quality and efficiency of care processes. In the medical domain, images
from the same disease class as the query image must be
retrieved in order to help the doctor in diagnosis. The images
in the medical database are labeled by a specialist to ensure
that they are less subjective than those of the generic CBIR.
Figure 1 represents the framework of the CBIR system. This level of retrieval is based on the following primitive features:
Color
Texture
Shape or the spatial location of image elements.
A. COLOR ANALYSIS
Color is one of the most important features that make image recognition possible for humans. It is a property that depends on the reflection of light to the eye and the processing of that information in the brain. Color is used every day to differentiate objects, places, etc. Colors are defined in three-dimensional color spaces such as RGB (Red, Green, and Blue), HSV (Hue, Saturation, and Value) or HSB (Hue, Saturation, and Brightness). Most image formats, such as JPEG, BMP and GIF, use the RGB color space to store information.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0427-2
Figure 1: Proposed CBIR System
The RGB color space is defined as a unit cube with red, green, and blue axes. Thus, a vector with three coordinates represents a color in this space: black when all three coordinates are set to 0, and white when all three are set to 1.
1) Algorithm for Color Analysis:
i. Calculate the color histograms of the query image and of each image in the database, and store them as two vectors.
ii. Use these vectors to calculate the Bhattacharyya coefficient of the query image with each image in the database.
iii. The Bhattacharyya coefficient ranges from 0 to 1: it is 1 for completely similar images, and 0 indicates that there is no similarity between the two images.
In CBIR, the color histogram is the main method of representing the color information of an image. A color histogram is a type of bar graph where each bar represents a particular color of the color space being used. A histogram approximates a probability density function: it represents a discrete frequency distribution for a grouped dataset, whose distinct discrete values are grouped into a number of intervals [12]. An image histogram refers to the probability density function of the image intensities; for color images this is extended to capture the intensities of the three color channels.
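As an illustrative sketch (not code from the paper), a per-channel histogram of an 8-bit RGB image can be built with NumPy as follows; the bin count of 16 is an arbitrary choice:

```python
import numpy as np

def color_histogram(image, bins=16):
    # image: H x W x 3 uint8 RGB array; returns one normalized histogram
    # per channel (R, G, B), concatenated into a single feature vector
    hists = []
    for ch in range(3):
        h, _ = np.histogram(image[:, :, ch], bins=bins, range=(0, 256))
        hists.append(h / h.sum())   # each channel's bins sum to 1
    return np.concatenate(hists)

# a 4x4 all-black image: every pixel falls in the first bin of each channel
feature = color_histogram(np.zeros((4, 4, 3), dtype=np.uint8))
print(feature.shape)  # (48,)
```

The resulting vector is what gets compared between the query image and each database image.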
In this project, the color histograms of the query image and of the database images are calculated, stored in two vectors, and compared using the Bhattacharyya coefficient. The Bhattacharyya coefficient is an approximate measurement of the amount of overlap between two statistical samples, and can be used to determine the relative closeness of the two samples being considered:

    Bhattacharyya coefficient = Σ_{i=1}^{n} √(a_i · b_i)    (1)

where, considering the samples a and b, n is the number of partitions, and a_i, b_i are the numbers of members of samples a and b in the i-th partition. The Bhattacharyya coefficient ranges from 0 to 1, where 1 represents completely similar images and 0 indicates that there is no similarity between the two images [9].
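Equation (1) can be sketched directly; the example below assumes the two histograms have already been normalized so that each sums to 1:

```python
import numpy as np

def bhattacharyya(a, b):
    # Eq. (1): sum over partitions of sqrt(a_i * b_i); assumes both
    # histograms are normalized so that each sums to 1
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(np.sqrt(a * b)))

print(bhattacharyya([0.5, 0.5, 0.0], [0.5, 0.5, 0.0]))  # 1.0 (identical)
print(bhattacharyya([1.0, 0.0], [0.0, 1.0]))            # 0.0 (no overlap)
```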
B) TEXTURE ANALYSIS:
Texture is a measure of the variation of the intensity of a surface, quantifying properties such as smoothness, coarseness and regularity. The most popular representation of texture is the wavelet transform. A method called the pyramid-structured wavelet transform is used for texture classification. It recursively decomposes the sub-signals in the low-frequency channels, and is therefore best suited to signals whose information is concentrated in the lower frequency channels. Since, owing to the properties of natural images, most of the information lies in the lower sub-bands, the pyramid-structured wavelet transform is highly suitable. Using the pyramid-structured wavelet transform [6], the texture image is decomposed into four sub-images: the low-low, low-high, high-low and high-high sub-bands. At this point the energy level of each sub-band is calculated; this is the first-level decomposition. In this study, a fifth-level decomposition is obtained by repeatedly decomposing the low-low sub-band, based on the assumption that the energy of an image is concentrated in the low-low band. The wavelet function used is the Daubechies wavelet.
1) Algorithm for Texture Analysis:
i. Decompose the image using the pyramid-structured wavelet transform (up to the fifth-level decomposition).
ii. Build a histogram of the transformed image coefficients in each sub-band.
iii. Calculate a signature vector for each image by concatenating these histograms.
iv. Compute the L1-distance (equation 2) of the query image with all images in the database.
In order to characterize the image texture at different scales, the distribution of the wavelet coefficients in each sub-band of the decomposition is characterized by an image signature. An image signature is defined by building a histogram of the transformed image coefficients in each sub-band. As images are decomposed with a pyramidal scheme on Nl levels, they consist of 3·Nl + 1 sub-bands: there are 3 sub-bands of details at each scale l ≤ Nl (lHH, lHL and lLH) plus an approximation (NlLL); 3·Nl + 1 histograms are thus built. The signature is the vector formed by concatenating these histograms. The distance used to compare two images Im1 and Im2 is based on the L1-distance between their histograms, i.e. between their two signatures. The distance measure is given by

    d(Im1, Im2) = Σ_{t=1}^{3·Nl+1} λ_t · |H_t^1 − H_t^2|    (2)

with

    |H_t^1 − H_t^2| = Σ_{j=1}^{NB} |H_t^1(j) − H_t^2(j)|

where H_t^n(j) is the value of the j-th bin of the t-th normalized histogram of image n, NB is the number of bins, and (λ_t), t = 1 … 3·Nl + 1, is a set of tunable weights.
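The signature and distance computation can be sketched as follows. This is an illustration only: it uses a hand-rolled Haar transform (db1, the simplest Daubechies wavelet) rather than the paper's exact filter, uniform weights λ_t = 1, and per-band histogram ranges, and it assumes the image dimensions are divisible by 2^levels:

```python
import numpy as np

def haar_dwt2(x):
    # One level of the 2-D Haar transform: returns LL, LH, HL, HH sub-bands
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row-pair averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row-pair differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def signature(img, levels=5, bins=8):
    # Pyramid-structured decomposition: only the LL band is split further,
    # giving 3*levels detail bands plus one final approximation (3*Nl + 1)
    bands = []
    ll = np.asarray(img, dtype=float)
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(ll)
        bands += [lh, hl, hh]
    bands.append(ll)
    hists = []
    for b in bands:
        h, _ = np.histogram(b, bins=bins)   # per-band range; fixed edges in practice
        hists.append(h / h.sum())           # normalized histogram of the sub-band
    return np.concatenate(hists)

def l1_distance(sig1, sig2):
    # Eq. (2) with all tunable weights lambda_t set to 1
    return float(np.abs(sig1 - sig2).sum())
```

With levels = 5 and bins = 8 each signature has (3·5 + 1)·8 = 128 entries, and the distance of an image to itself is 0.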
C) SHAPE ANALYSIS
Shape may be defined as the characteristic surface
configuration of an object; an outline or contour. It permits an
object to be distinguished from its surroundings by its outline.
1) Algorithm for cell geometry analysis:
i. Convert the image to black and white and threshold it, in order to prepare for boundary tracing with bwboundaries.
ii. Remove the noise.
iii. Find the boundaries.
iv. Determine the number of oval objects in the query image and in all the images in the database.
Since the domain of this project is blood cell images, the number of round objects in each image needs to be determined. To achieve this, the image is converted to black and white in preparation for boundary tracing with the bwboundaries function in MATLAB.
Then a morphological operator such as opening is used to remove small connected objects that do not belong to the objects of interest. The area and perimeter of each object inside the image are then used to form a simple metric indicating the roundness of an object:

    Metric = 4π · area / perimeter²    (3)

This metric is equal to one only for a circle and is less than one for any other shape. The discrimination process can be controlled by setting an appropriate threshold; here the threshold is taken as 0.7.
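Equation (3) is a one-liner; in practice the area and perimeter values would come from the traced boundaries (e.g. MATLAB's regionprops), which are assumed to be given here:

```python
import math

def roundness(area, perimeter):
    # Eq. (3): equals 1 for a perfect circle and is < 1 for any other shape
    return 4 * math.pi * area / perimeter ** 2

def is_round(area, perimeter, threshold=0.7):
    return roundness(area, perimeter) >= threshold

# circle of radius 5: area = pi*r^2, perimeter = 2*pi*r -> metric close to 1.0
print(roundness(math.pi * 25.0, 10.0 * math.pi))
```

A unit square (area 1, perimeter 4) scores π/4 ≈ 0.785, which illustrates why the threshold must sit well below 1.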
Shape is an important feature because diseases are classified depending on the shape of the cell. For example, sickle-cell disease (sickle-cell anaemia) is an autosomal co-dominant genetic blood disorder characterized by red blood cells that assume an abnormal, rigid, sickle shape.
For cell geometry analysis, once the number of oval objects in the query image has been calculated, its value is compared with the number of oval objects in each image in the database, and the images closest to the query image are displayed.
The results of all three algorithms are then combined and sorted to give the best search result along with the disease.
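This combination step can be sketched as an intersection of the three result sets; the normalized-similarity dictionaries and the summed-score ranking below are illustrative assumptions, not details from the paper:

```python
def combine_results(color, texture, shape):
    # Each argument maps image id -> normalized similarity in [0, 1]
    # (a distance d can be converted with s = 1 - d / d_max).
    # Only images retrieved by all three features are kept, best first.
    common = set(color) & set(texture) & set(shape)
    return sorted(common,
                  key=lambda i: color[i] + texture[i] + shape[i],
                  reverse=True)

ranked = combine_results({"a": 0.9, "b": 0.4},
                         {"a": 0.8, "b": 0.7, "c": 0.2},
                         {"a": 0.7, "b": 0.9})
print(ranked)  # ['a', 'b'] -- 'c' is dropped: not retrieved by all three
```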
III. RESULT
In our classification system, the ground-truth database is made of 25 blood cell images with two different classifications, based on the type of disease: sickle cell disease and cancer.
Sickle cell disease is a hereditary blood disease, a form of anemia, resulting from a single amino-acid mutation of the red blood cells. People with sickle cell disease have red blood cells that contain mostly hemoglobin S, an abnormal type of hemoglobin. Sometimes these red blood cells become crescent ("sickle") shaped.
Cancer of the myeloid line of blood cells is characterized by the rapid growth of abnormal white blood cells.
In order to increase the accuracy of the retrieval result, the proposed system combines the results of the color, texture and cell-geometry analyses, so that only images common to all three feature extractions are shown as the final result. The advantages of this system are high accuracy and precision as well as the simplicity of the algorithm.
The query image is a blood cell sample image of a patient, used for diagnosis of disease. The search result shows the type of disease the patient is suffering from. If the patient is not suffering from either of these two diseases, the result states that the patient is not suffering from them.
Figure 2: Result showing whether the patient is suffering from a disease or not.
IV. CONCLUSION
The rapid growth in the sizes of image databases highlights the need to develop an effective and efficient retrieval system. This development started with retrieving images using textual annotation (TBIR); image retrieval based on content (CBIR) was introduced later and overcomes the drawbacks of TBIR.
Our focus is on medical diagnosis, in which CBIR can be used to aid diagnosis by identifying similar past cases in a database of medical images, mainly blood cell images.
These images are classified in terms of diseases, and images from the same disease class as the query image must be retrieved in order to help the doctor in diagnosis.
This work investigates approaches to CBIR based on low-level features: color, shape and texture analysis. In order to increase accuracy, the retrieval results of color, texture and shape are combined. To diagnose disease, 25 blood cell images, classified by type of disease (sickle cell disease and cancer), are considered in the database. For a given query image, the images retrieved from the database show which type of disease the patient is suffering from.
REFERENCES
[1] I. Ahmad and Taek-Sueng Jang, "Old fashion text-based image retrieval uses FCA," Proc. 2003 International Conference on Image Processing (ICIP 2003).
[2] Aliaa A. A. Youssif, A. A. Darwish and R. A. Mohamed, "Content based medical image retrieval based on pyramid structure wavelet," IJCSNS International Journal of Computer Science and Network Security, Vol. 10, No. 3, March 2010.
[3] A. Kak and C. Pavlopoulou, "Content-based image retrieval from large medical databases," Proc. First International Symposium on 3D Data Processing Visualization and Transmission, 2002.
[4] Yimo Tao and Xiang Sean Zhou, "An Adaptive, Knowledge-Driven Medical Image Search for Interactive Diffuse Parenchymal Lung Disease Quantification."
[5] Ivica Dimitrovski, Dejan Gorgevik and Suzana Loskovska, "Web-Based Medical Image Retrieval System."
[6] G. Quellec, M. Lamard, G. Cazuguel, B. Cochener and C. Roux, "Wavelet Optimization for Content-Based Image Retrieval in Medical Database."
[7] M. Sifuzzaman, M. R. Islam and M. Z. Ali, "Application of Wavelet Transform and its Advantage Compared to Fourier Transform," Journal of Physical Sciences, Vol. 13, 2009, pp. 121-134.
[8] S. H. Rezatofighi, A. Roodaki, R. A. Zoroofi, R. Sharifian and H. Soltanian-Zadeh, "Automatic Detection of Red Blood Cells in Hematological Images Using Polar Transformation and Run-length Matrix," Proc. ICSP 2008.
[9] Mohammad Reza Zare, Raja Noor Ainon and Woo Chaw Seng, "Content-based Image Retrieval for Blood Cells," Proc. 2009 Third Asia International Conference on Modelling & Simulation.
[10] H. B. Kekre and Dhirendra Mishra, "Digital Image Search & Retrieval using FFT Sectors of Color Images," International Journal on Computer Science and Engineering.
[11] Ch. Srinivasa Rao, S. Srinivas Kumar and B. N. Chatterji, "Content Based Image Retrieval using Contourlet Transform," ICGST-GVIP Journal, Vol. 7, Issue 3, November 2007.
[12] Tim Edwards, "Discrete Wavelet Transforms: Theory and Implementation."
[13] Woo Chaw Seng and Seyed Hadi Mirisaee, "A Content-Based Retrieval System for Blood Cells Images," Proc. 2009 International Conference on Future Computer and Communication.
[14] Zhang Lei, Lin Fuzong and Zhang Bo, "A CBIR Method Based on Color-Spatial Feature."
AUDIO+

Abhay Kumar, Research Scholar, Associated Electronics Research Foundation, Phase-II, Noida (U.P.)
Abstract--AUDIO+ is an electronic device that alters how a musical instrument or other audio source sounds; it can best be termed a “Digital Effect Processor”. Some effects subtly "colour" a sound, while others transform it dramatically. Effects can be used during live performances (typically with keyboard, electric guitar or bass) or in the studio, i.e. faithful reproduction of the sound signal is heard when AUDIO+ is used in the audio line.
AUDIO+ has a unique ability to modify sound signals and make them soothing to the human ear. The device is provided with a control panel of “Volume”, “Bass”, “Treble” and “Balance” to make it suitable for ears sensitive to high- and low-frequency sound. AUDIO+ is an easy-to-use portable device with a single signal input/output port and an internal battery power supply.
Keywords: Digital audio players, Digital signal processors, Mixed analog digital integrated circuits, Digital filters, Equalizers, Digital controls.
I. INTRODUCTION
AUDIO+ is all about the musical sound box, which can take raw mp3 or mpeg data and process it digitally. What is interesting is that it can sample and play many sound formats, with sampling rates from 8 kHz to 96 kHz, which is more than enough to play any sound format. It improves sound quality, with significant reduction of noise, and provides Dolby sound effects.
II. SYSTEM DESCRIPTION
AUDIO+ is built around a combination of ICs from Texas Instruments and National Semiconductor. The DRV134 and INA2134 from Texas Instruments are used to design a circuit which enhances sound performance.
This project is supported by Associated Electronics Research Foundation.
Mr. Abhay Kumar is with Associated Electronics Research Foundation, C-53, Phase-II, Noida (U.P.) as a Research Scholar(Phone No.-+919650109759, [email protected])
Very low distortion, low noise, and wide bandwidth provide superior performance in high quality audio applications.
The LM1036 from National Semiconductor is a DC-controlled tone (bass/treble), volume and balance circuit for stereo applications in car radio, TV and audio systems. An additional control input allows loudness compensation to be simply effected.
III. DRV134
The DRV134 is a differential output amplifier that converts a single-ended input to a balanced output pair. These balanced audio drivers consist of high-performance op amps with on-chip precision resistors. They are fully specified for high-performance audio applications, including low distortion (0.0005% at 1 kHz). Wide output voltage swing and high output drive capability allow use in a wide variety of demanding applications. They easily drive the large capacitive loads associated with long audio cables. Laser-trimmed matched resistors provide optimum output common-mode rejection (typically 68 dB), especially when compared to circuits implemented with op amps and discrete precision resistors. In addition, a high slew rate (15 V/μs) and fast settling time (2.5 μs to 0.01%) ensure excellent dynamic response. The DRV134 has excellent distortion characteristics; noise is below 0.003% throughout the audio frequency range under various output conditions. A gain of 6 dB is seen at the output of the differential amplifier.
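As a quick check of the 6 dB figure (a balanced driver delivers twice the single-ended input amplitude, and 20·log10(2) is approximately 6 dB):

```python
import math

def gain_db(voltage_ratio):
    # voltage gain expressed in decibels
    return 20 * math.log10(voltage_ratio)

print(round(gain_db(2.0), 2))  # 6.02
```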
Fig 1: Gain Vs Frequency graph for DRV134
IV. INA2134
The INA2134 is a dual differential line receiver consisting of high-performance op amps with on-chip precision resistors. It is fully specified for high-performance audio applications and has excellent AC specifications, including low distortion (0.0005% at 1 kHz) and a high slew rate (14 V/μs), assuring good dynamic response. In addition, wide output voltage swing and high output drive capability allow use in a wide variety of demanding applications. The dual version features completely independent circuitry for lowest crosstalk and freedom from interaction, even when overdriven or overloaded. The INA2134 on-chip resistors are laser trimmed for accurate gain and optimum common-mode rejection. It has unity gain.

Fig 2: Gain Vs Frequency graph for INA2134

V. LM1036

The LM1036 has four control inputs that provide control of the bass, treble, balance and volume functions through the application of DC voltages from a remote control system or, alternatively, from four potentiometers, which may be biased from a zener-regulated power supply. The LM1036 gives the user the ability to control the components of the sound with the help of multi-turn potentiometers; the graphs given below illustrate the different control operations. The LM1036 has the following features:

- Large volume control range, 75 dB typical
- Tone control, ±15 dB typical
- Channel separation, 75 dB typical
- Low distortion, 0.06% typical for an input level of 0.3 Vrms
- High signal-to-noise ratio, 80 dB typical for an input level of 0.3 Vrms

Fig 3: Volume control LM1036

Fig 4: Tone control LM1036

Fig 5: Balance control LM1036
VI. DRV 134 SIMULATION
Fig 8: DC analysis of DRV 134 (output, V, vs input voltage, V)
Fig 8 shows how the input to the DRV134 can be balanced and how the input line can be modulated.

Fig 6: TINA-TI simulation window for DRV134

The above result shows how a circuit around the DRV134 can be built in the TINA-TI software. The input to the circuit has to be in the range of 8 kHz to 96 kHz, and the input voltage should be 200 mVrms to 2 Vrms. The result can be judged by taking the voltages at VM1 and VM2. The output is balanced because the DRV134 acts as a balanced line driver.
Fig 7: Noise analysis of DRV 134 (output noise, V/√Hz, vs frequency, 1 Hz to 1 MHz)
The above figure shows the noise analysis of the DRV134 circuit. The noise decreases significantly as the frequency increases.
VII. SIMULATION OF DRV 134 WITH INA 137

Fig 9: TINA-TI simulation window for DRV134 and INA 137

The above diagram shows how the balanced output can be amplified and two channels can be made using the INA137 (gain = 1/2) and INA134 (gain = 1).
Fig 10: Noise analysis of DRV 134 with INA 137 (output noise, V/√Hz, vs frequency, 1 Hz to 1 MHz)

The above graph shows how the noise is significantly reduced after the introduction of the INA137 or INA134, i.e. how the input signal can be balanced and amplified to reduce the noise effect to the desired level.
Fig 11: DC analysis of DRV 134 with INA 2137 (voltage, V, vs input voltage, V)

Fig 11 shows that the output voltage ranges between 200 mVrms and 2 Vrms and the sampling frequency between 8 kHz and 96 kHz.
VIII. CONCLUSION
AUDIO+ maintains the originality of five major components of sound signals:
a. Pitch: the frequency of sound signals. Low frequencies (bass) make the sound powerful; midrange frequencies give sound its energy (human beings are most sensitive to midrange frequencies); high frequencies (treble) give sound its presence and lifelike quality, and let us feel that we are close to the sound source.
b. Timbre: the unique combination of fundamental frequency, harmonics, and overtones that gives each voice, musical instrument, and sound effect its unique colouring and character.
c. Harmonics: when an object vibrates it propagates sound waves of a certain frequency. This frequency, in turn, sets in motion frequency waves called harmonics.
d. Loudness: the loudness of a sound depends on the intensity of the sound stimulus.
e. Rhythm: a recurring sound that alternates between strong and weak elements.
In combination with all the components of sound listed above, AUDIO+ concentrates on the high frequencies with 6 dB overall gain and preserves the presence of the original reproduction of sound; it is thus most useful for high-quality audio systems and long-distance telephone calls.
IX. FUTURE WORK
AUDIO+ has great advantages for audio systems and audio communication, which opens an opportunity to use it in digital communication and VoIP phones.
X. REFERENCES
1) Software support and information about the digital speakers from Texas Instruments (www.TI.com)
2) Audio: www.ti.com/audio
3) Data converters: dataconverter.ti.com
4) DSP: dsp.ti.com
5) Digital control: www.ti.com/digitalcontrol
6) Clocks and timers: www.ti.com/clocks
7) Logic: logic.ti.com
8) Power management: power.ti.com
9) Microcontrollers: microcontroller.ti.com
10) Hardware support from Farnell India (http://in.farnell.com/)
11) Audio codec: www.ti.com/tlv320aic3101.pdf
12) Audio digital processor: www.ti.com/tas3103.pdf
13) Audio line driver: www.ti.com/drv134.pdf
14) Input amplifier: www.ti.com/ina2134.pdf
15) Voltage regulators: www.ti.com/tps62007.pdf, www.ti.com/tps74801.pdf, www.ti.com/tps74701.pdf
16) Control IC: www.national.com
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0502-1
Speaker Identification
Prerana & Aditi Choudhary
Abstract: Humans use voice recognition every day to distinguish between speakers and genders; other animals use voice recognition to differentiate among sound sources. Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This
speaker's voice to verify their identity and
control access to services such as voice
dialing, banking by telephone, telephone
shopping, database access services,
information services, voice mail, security
control for confidential information areas, and remote access to computers.
Speaker identification has been a wide and attractive area of research, and many works based on speech features have been proposed. In
a speaker recognition system there are three
important components; the feature extraction
component, the speaker models and the
matching algorithm.
The speech signal conveys information
about the identity of the speaker. The area of
speaker identification is concerned with
extracting the identity of the person
speaking the utterance. As speech interaction with computers becomes more pervasive in activities such as the telephone, financial transactions and information retrieval from speech databases, the utility of automatically identifying a speaker based solely on vocal characteristics grows.
FEATURES OF SPEECH
One might wonder what information is
needed to classify between genders or to
classify the speech of multiple speakers. In
fact, speech contains a great deal of
information that allows a listener to
determine both gender and speaker identity.
In addition, speech can reveal much about
the emotional state and age of the speaker.
For example, an Israeli engineer created a signal-processing lie detector system that outperforms the traditional polygraph test.
PITCH
Pitch is the most distinctive difference
between male and female speakers. A
person’s pitch originates in the vocal
cords/folds, and the rate at which the vocal
folds vibrate is the frequency of the pitch.
So, when the vocal folds oscillate at 300
times per second, they are said to be
producing a pitch of 300 Hz. When the air
passing through the vocal folds vibrates at
the frequency of the pitch, harmonics are
also created. The harmonics occur at integer multiples of the pitch and decrease in amplitude at a rate of 12 dB per octave (an octave being a doubling of frequency).
The reason pitch differs between sexes is the
size, mass, and tension of the laryngeal tract
which includes the vocal folds and the
glottis (the spaces between and behind the
vocal folds). Just before puberty, the
fundamental frequency, or pitch, of the
human voice is about 250 Hz, and the vocal
fold length is about 10.4 mm. After puberty
the human body grows to its full adult size,
changing the dimensions of the larynx area.
The vocal fold length in males increases to
about 15-25 mm while female’s vocal fold
length increases to about 13-15 mm. These
increases in size correlate to decreased
frequencies coming from the vocal folds. In
males, the average pitch falls between 60
and 120 Hz, and the range of a female’s
pitch can be found between 120 and 200 Hz.
Females have a higher pitch range than
males because the size of their larynx is
smaller. However, these are not the only differences between male and female speech patterns.
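These pitch ranges can be estimated directly from a waveform. The autocorrelation-based sketch below is one common method, not one taken from this paper:

```python
import numpy as np

def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0):
    # Find the autocorrelation peak within the plausible pitch-lag range
    frame = np.asarray(frame, dtype=float) - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin) + 1
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag   # fundamental frequency in Hz

fs = 8000
t = np.arange(800) / fs                    # a 100 ms frame
male_like = np.sin(2 * np.pi * 100 * t)    # 100 Hz tone, within the male range
print(round(estimate_pitch(male_like, fs)))  # 100
```

A 200 Hz tone, within the typical female range, would be resolved the same way, since its lag of 40 samples also falls inside the search window.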
FORMANT FREQUENCIES
When sound is emitted from the human
mouth, it passes through two different
systems before it takes its final form. The
first system is the pitch generator, and the
next system modulates the pitch harmonics
created by the first system. Scientists call the
first system the laryngeal tract and the
second system the supralaryngeal/vocal
tract. The supralaryngeal tract consists of
structures such as the oral cavity, nasal
cavity, velum, epiglottis, tongue, etc.
When air flows through the laryngeal tract,
the air vibrates at the pitch frequency
formed by the laryngeal tract as mentioned
above. Then the air flows through the
supralaryngeal tract, which begins to
reverberate at particular frequencies
determined by the diameter and length of the
cavities in the supralaryngeal tract. These
reverberations are called “resonances” or
“formant frequencies”. In speech,
resonances are called formants. So, those
harmonics of the pitch that are closest to the
formant frequencies of the vocal tract will
become amplified while the others are attenuated.
INTRODUCTION

Most signal processing involves processing a signal without concern for the quality or information content of that signal. In speech processing, speech is processed on a frame-by-frame basis, usually with no concern other than whether the frame is speech or silence.

[Figure 1(a): Speaker identification. Input speech passes through feature extraction, is compared for similarity against reference models (Speaker #1 ... Speaker #N), and maximum selection yields the identification result (Speaker ID).]
However, knowing how reliable the
information is in a frame of speech can be
very important and useful.
This is where usable speech detection and
extraction can play a very important role.
The usable speech frames can be defined as
frames of speech that contain higher
information content compared to unusable
frames with reference to a particular
application. We have been investigating a
speaker identification system to identify
usable speech frames. We then determine a
method for identifying those frames as
usable using a different approach.
PARADIGMS OF SPEECH RECOGNITION
1. Speaker Recognition - Recognize which
of the population of subjects spoke a given
utterance.
2. Speaker verification - Verify that a given speaker is who he claims to be. The system prompts the user, who claims to be the speaker, to provide an ID, and verifies the user by comparing the codebook of the given speech utterance with that enrolled for the claimed identity. If it matches within the set threshold, the identity claim of the user is accepted; otherwise it is rejected.
3. Speaker identification - Detects a particular speaker from a known population. The system prompts the user to provide a speech utterance, identifies the user by comparing the codebook of the utterance with those stored in the database, and lists the most likely speakers that could have given that speech utterance.
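Schematically, the identification step reduces to a nearest-model search; the sketch below uses a plain Euclidean distance as a stand-in for the codebook comparison:

```python
import numpy as np

def identify_speaker(query, reference_models):
    # reference_models: dict speaker_id -> feature vector (e.g. a codebook
    # centroid); return the id whose model is closest to the query features
    return min(reference_models,
               key=lambda sid: np.linalg.norm(query - reference_models[sid]))

models = {"alice": np.array([1.0, 0.0]), "bob": np.array([0.0, 1.0])}
print(identify_speaker(np.array([0.9, 0.1]), models))  # alice
```

Verification differs only in that the distance to a single claimed model is compared against a threshold rather than minimized over all speakers.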
At the highest level, all speaker recognition
systems contain two main modules (refer to
Figure 1): feature extraction and feature
matching. Feature extraction is the process
that extracts a small amount of data from the
voice signal that can later be used to
represent each speaker. Feature matching
involves the actual procedure to identify the
unknown speaker by comparing extracted
features from his/her voice input with the
ones from a set of known speakers.
[Figure 1(b): Speaker verification. Input speech passes through feature extraction, is compared for similarity against the reference model of the claimed speaker (#M), and a threshold decision yields the verification result (Accept/Reject).]

Figure 1. Basic structures of speaker recognition systems
Figure 1 shows the basic structures of speaker identification and verification systems. The system that we will describe is classified as a text-independent speaker identification system, since its task is to identify the person who speaks regardless of what is being said.
Concepts of speaker identification
systems:
Speaker identification systems may be
classified into two categories based on their
principle of operation.
Text-dependent systems, which make use of a fixed utterance for testing and training, and rely on specific features of the test utterance in order to effect a match.
Text-independent systems, which make use
of different utterances for test and training
and rely on long term statistical
characteristics of speech for making a
successful identification.
Text-dependent systems require less training
than text-independent systems and are
capable of producing good results with a
fraction of the test speech sample required
by a text-independent system. The pitch
period or fundamental frequency of speech
varies from one individual to another; pitch
frequency is high for female voices and low
for male voices. This suggests that pitch
might be a suitable parameter to distinguish
one speaker from another, or at least to
narrow down the set of probable matches.
The analysis of the frequency spectrum of
the test utterance provides valuable
information about speaker identification.
The spectrum contains both pitch harmonics
and vocal-tract resonant peaks, making it
possible to identify the speaker with a high
probability of being correct. The vocal-tract
filter parameters (filter coefficients) can also
be used to good effect for speaker
identification. This is due to the fact that
different speakers have different vocal-tract
configurations for the same utterance,
depending on their physical and emotional
conditions, as well as whether the speaker is
a native or non-native speaker
In any text-dependent speaker identification
system, an important decision is the choice
of test utterance. The source-filter model is
most accurate at representing voiced sounds,
such as the vowels. Vowels have a definite,
consistent pitch period. The vocal-tract
configuration for vowel-utterances exhibits a
clear formant (resonant) structure. The
frequency spectrum corresponding to vowel-
utterances therefore contains a wealth of
information that can be used for speaker identification.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
In general, it is difficult to
guarantee a hundred percent recognition
even with the best speaker identification
approaches.
Generally speaking, two parameters may be
used to describe the overall performance of
a speaker identification system.
A false acceptance: Which occurs when the
system incorrectly identifies an unregistered
individual as an enrolled one, or when one
registered individual is mistaken for another.
The FAR (False Acceptance Ratio) is the
ratio of the number of false acceptances to
the total number of trials. The value of FAR
can be reduced by setting a strict low
threshold.
A false rejection: Which occurs when the
system incorrectly refuses to identify an
individual who is registered with the system.
The FRR (False Rejection Ratio) is the ratio
of the number of false rejections to the total
number of trials. Setting the threshold to a
liberal high value can minimize the value of
FRR. The requirements for low FAR and
FRR are seen to be conflicting and both
parameters cannot be simultaneously
lowered. However, a low FAR is vital for
good speaker identification systems and
most systems are biased for good FAR
performance at the expense of FRR.
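The FAR/FRR trade-off described above can be sketched with a toy distance-based matcher. The distances and thresholds below are invented purely for illustration; acceptance here means the match distance falls at or below the threshold, consistent with the "strict low threshold" wording above.

```python
def far_frr(impostor_dists, genuine_dists, threshold):
    """A claimed identity is accepted when its match distance <= threshold.
    FAR counts accepted impostors; FRR counts rejected genuine speakers."""
    fa = sum(1 for d in impostor_dists if d <= threshold)
    fr = sum(1 for d in genuine_dists if d > threshold)
    return fa / len(impostor_dists), fr / len(genuine_dists)

impostors = [0.9, 0.7, 0.6, 0.45, 0.4]   # distances from impostor trials
genuine = [0.5, 0.3, 0.2, 0.1, 0.05]     # distances from genuine trials

# A strict (low) threshold suppresses FAR at the cost of FRR, and vice versa.
print(far_frr(impostors, genuine, 0.35))  # (0.0, 0.2)
print(far_frr(impostors, genuine, 0.55))  # (0.4, 0.0)
```

No single threshold drives both ratios to zero, which is the conflict the text describes.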
APPROACHES TO SPEECH
RECOGNITION
1. The Acoustic Phonetic approach
2. The Pattern Recognition approach
3. The Artificial Intelligence approach
A. The Acoustic Phonetic Approach
The acoustic phonetic approach is based
upon the theory of acoustic phonetics, which
postulates that there exists a finite set of
distinctive phonetic units in spoken
language and that these phonetic units are
broadly characterized by a set of properties
that can be seen in the speech signal, or its
spectrum, over time. Even though the
acoustic properties of phonetic units are
highly variable, both with the speaker and
with the neighboring phonetic units, it is
assumed that the rules governing the
variability are straightforward and can
readily be learned and applied in practical
situations. Hence the first step in this
approach is called the segmentation and labeling
phase. It involves segmenting the speech
signal into discrete (in time) regions where
the acoustic properties of the signal are
representative of one of several phonetic
units or classes, and then attaching one or
more phonetic labels to each segmented
region according to its acoustic properties.
For speech recognition, a second step is
required. This second step attempts to
determine a valid word (or a string of words)
from the sequence of phonetic labels
produced in the first step, which is
consistent with the constraints of the speech
recognition task.
B. The Pattern Recognition Approach
The Pattern Recognition approach to speech
is basically one in which the speech patterns
are used directly without explicit feature
determination (in the acoustic – phonetic
sense) and segmentation. As in most pattern
recognition approaches, the method has two
steps – namely, training of speech patterns,
and recognition of patterns via pattern
comparison. Speech is brought into the system
via a training procedure. The concept is that
if enough versions of a pattern to be
recognized (be it a sound, a word, a phrase,
etc.) are included in the training set provided to
the algorithm, the training procedure should
be able to adequately characterize the
acoustic properties of the pattern (with no
regard for, or knowledge of, any other pattern
presented to the training procedure). This
type of characterization of speech via
training is called pattern classification.
Here the machine learns which acoustic
properties of the speech class are reliable
and repeatable across all training tokens of
the pattern. The core of this method is the
pattern-comparison stage, in which the
unknown speech is compared with each possible
pattern learned in the training phase and
classified according to the accuracy of the match.
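The pattern-comparison stage can be sketched with dynamic time warping (DTW), a classic choice for template matching; the paper does not name a specific comparison algorithm, so this is one plausible instance. The feature sequences below are toy one-dimensional stand-ins for real spectral features.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences:
    the minimum accumulated mismatch over all monotone alignments."""
    INF = float("inf")
    n, m = len(a), len(b)
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

def classify(unknown, templates):
    """Return the label of the training template closest to the unknown."""
    return min(templates, key=lambda label: dtw_distance(unknown, templates[label]))

templates = {"yes": [1, 3, 4, 3, 1], "no": [4, 2, 1, 2, 4]}
print(classify([1, 1, 3, 4, 4, 3, 1], templates))  # yes
```

The stretched test pattern still matches the "yes" template because the warping absorbs the tempo change, which is why the approach is robust to speaking-rate variation.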
Advantages of Pattern Recognition
Approach
• Simplicity of use. The method is relatively
easy to understand. It is rich in mathematical
and communication theory justification for
individual procedures used in training and
decoding. It is widely used and best
understood.
• Robustness and invariance to different
speech vocabularies, users, feature sets,
pattern-comparison algorithms and decision
rules. This property makes the algorithm
appropriate for a wide range of speech units,
word vocabularies, speaker populations,
background environments, transmission
conditions etc.
• Proven high performance. The pattern
recognition approach to speech recognition
consistently provides a high performance on
any task that is reasonable for technology
and provides a clear path for extending the
technology in a wide range of directions.
C. The Artificial Intelligence Approach
The artificial intelligence approach to
speech is a hybrid of the acoustic phonetic
approach and the pattern recognition
approach, exploiting ideas and
concepts of both methods. The artificial
intelligence approach attempts to mechanize
the recognition procedure according to the
way a person applies intelligence in
visualizing, analyzing and finally making a
decision on the measured acoustic features.
In particular, among the techniques used
within this class of methods is the use of an
expert system for segmentation and labeling.
The use of neural networks could represent a
separate structural approach to speech
recognition or could be regarded
as an implementational architecture that may
be incorporated in any of the above classical
approaches.
FUTURE SCOPE
A range of future improvements is possible:
• Speech-independent speaker identification
• The number of users can be increased
• Identification of male, female, child and adult voices
Modeling of FBAR Resonator and Simulation using APLAC
Deepak Kumar, Navaid Z. Rizvi, Rajesh Mishra
Gautam Buddha University, Greater Noida
Abstract
This paper focuses on the analysis of the
Film Bulk Acoustic Wave Resonator
(FBAR), comprising a Zinc Oxide (ZnO)
piezoelectric thin film sandwiched
between two metal electrodes of gold (Au)
and located on a silicon substrate with a
low-stress silicon nitride (Si3N4)
supporting membrane, for high frequency
wireless applications. Film bulk
acoustic wave technology is a promising
technology for manufacturing miniaturized
high performance filters for the gigahertz
range.
Keywords: FBAR, Quartz crystal, APLAC.
Quartz Crystal
Crystal quartz is the most important
resonator material presently available. It
has been used for 50 years, and thus
growth, characterization, and fabrication
techniques are quite mature. Its low
coupling is usually not a disadvantage
when it is used for frequency control
applications. For reasonable values of
transducer area, the resistance falls in the
10-20 ohm range at 5 to 20 MHz. This
range is ideal for oscillator circuits. Its Q is
somewhat lower than that of ferroelectric
materials, but at lower frequencies it is
more than adequate; because the
stoichiometry of crystal quartz is simple
and its growth technology well
established, there are few crystal defects
and the attenuation has a frequency-squared
dependence. Only when very high
frequencies or wide inductive regions are
required do designers look beyond quartz.
So at higher frequencies, e.g. in the GHz
range, quartz cannot be used, and FBAR and SAW
devices, which are much smaller
in size, are used instead. Quartz also has the
disadvantage that it limits integration with
mechanical structures and integrated circuits
as compared to silicon, and furthermore the
cost of quartz wafers is significantly higher
than that of silicon. [1-7]
FBAR Devices
FBAR stands for Film Bulk Acoustic
Resonator. FBAR is a breakthrough
resonator technology being developed by
Agilent Technologies. The technology
can be used to create the essential
frequency-selective elements found in
modern wireless systems, including filters,
duplexers and resonators for oscillators.
[1-3]
Why FBAR
The rapid growth of wireless mobile
telecommunication systems leads to
increased demand for high frequency
oscillators, filters and duplexers capable of
operating in the GHz frequency band.
Conventionally, crystal resonators, microwave
ceramic resonators, transmission lines and
SAW devices have been used as high
frequency band devices. Although they
provide high performance at a reasonable
price, they are too large to
integrate in wireless applications. SAW
devices have better electrical performance and
smaller size, but relatively poor
temperature stability, high insertion
loss and limited power handling.
To cope with these limitations, FBAR
devices have been developed and can
easily replace these devices at higher
frequencies for wireless communication
applications. A thin film bulk acoustic
wave resonator consists basically of a thin
piezoelectric layer sandwiched between
two electrodes. In such a resonator a
mechanical wave is piezoelectrically
excited in response to an electric field
applied between the electrodes. The
propagation direction of this acoustic wave
is perpendicular to the surface of the
resonator. For a standing wave situation to
prevail, the acoustic energy has to be
reflected back at the boundaries of the
resonator. This reflectivity can be achieved
by two means, either an air-interface or an
acoustic mirror. Piezoelectric thin films
convert electrical energy into mechanical
energy and vice versa. Film Bulk Acoustic
Resonator (FBAR) consists of a
piezoelectric thin film sandwiched by two
metal layers. A resonance condition occurs
if the thickness of the piezoelectric thin film
(d) is equal to an integer multiple of half
the wavelength (λres). The fundamental
resonant frequency (Fres = Va/λres) is then
inversely proportional to the thickness of
the piezoelectric material used, and is
equal to Va/2d, where Va is the acoustic velocity at the resonant frequency (Fig. 1).
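As a quick numerical check of Fres = Va/2d, a short sketch follows. The longitudinal acoustic velocity used for ZnO (about 6350 m/s) is an assumed, typical literature value, not taken from this paper; the 1.2 µm thickness matches the simulation section.

```python
def fbar_fundamental(v_a, d):
    """Fundamental FBAR resonance: the film thickness d is half a
    wavelength, so f_res = Va / (2 * d)."""
    return v_a / (2.0 * d)

# Assumed ZnO longitudinal velocity (~6350 m/s) and 1.2 um film thickness.
f = fbar_fundamental(6350.0, 1.2e-6)
print(f / 1e9)  # about 2.65 GHz, close to the simulated resonance
```

The result lands near the 2.6 GHz resonance reported later, which is the agreement between the analytical and simulated values that the paper claims.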
Figure.1
Figure.2
Figure.3
A bulk-micromachined FBAR with TFE
(Thickness Field Excitation) uses a z-
directed electric field to generate a z-
propagating longitudinal or compressive
wave. [3-8]
In an LFE-FBAR, the applied electric
field is in the y-direction, and the shear
acoustic wave (excited by the lateral
electric field) propagates in the z-direction.
One-Dimensional Acoustic-Wave
Equation:
The fundamental wave equation governing
longitudinal acoustic-wave generation and
propagation in the one-dimensional case is

∂²T/∂z² = (m0/c)·∂²T/∂t² (1)

where T, S, c and m0 are the mechanical
stress, the mechanical strain, the stiffness
elastic constant and the mass density of the
material, respectively.
From Hooke's law,

T = c·S (2)

Solutions of the wave equation for the stress
contain (as common factors) e^(−j(ωt ± kz)),
where ω = 2πf is the wave frequency and
k is the propagation constant (wave number):

k = ω·(m0/c)^(1/2) = ω/Va (3)

where Va = (c/m0)^(1/2) is the acoustic
velocity. The acoustic impedance is

Z = −T/v (4)

where v = −Va·T/c is the particle velocity.
Hence, the characteristic acoustic impedance Z0 is

Z0 = c/Va = (m0·c)^(1/2) = Va·m0 (5)
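Equations (3)-(5) can be exercised numerically. The ZnO stiffness and density below are typical literature values assumed for illustration only, not parameters quoted by the paper.

```python
import math

def acoustic_velocity(c, rho):
    """Va = sqrt(c / rho), consistent with k = omega / Va in equation (3)."""
    return math.sqrt(c / rho)

def char_impedance(c, rho):
    """Z0 = sqrt(rho * c) = rho * Va, as in equation (5)."""
    return math.sqrt(rho * c)

# Assumed illustrative ZnO values: stiffness ~2.1e11 Pa, density ~5680 kg/m^3.
c33, rho = 2.1e11, 5680.0
va = acoustic_velocity(c33, rho)
z0 = char_impedance(c33, rho)
print(va, z0)  # velocity around 6.1 km/s, impedance around 3.5e7 kg/(m^2*s)
```

Both forms of Z0 agree, which is just the identity sqrt(rho*c) = rho*sqrt(c/rho) stated by equation (5).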
Three-port equivalent circuit model:
Consider that the two lateral dimensions (in
the x and y directions) of the uniform
resonator are very large compared with
the thickness, and that the metallic
electrodes are assumed to be very thin,
providing no mass loading on the opposite
surfaces normal to the z direction.
The following equivalent circuit models
are used widely for FBAR electrical
modeling.
1. Mason equivalent circuit model
2. Redwood equivalent circuit model
3. KLM equivalent circuit model
In this paper the Mason three-port
equivalent circuit model has been used.

Mason equivalent circuit model
Mason's model is the most widely accepted
model for analyzing the vertical structure
of piezoelectric materials. It is based
on a physical model and uses as its inputs
the dielectric constants, mass densities,
stiffness coefficients from the piezoelectric
stress tensor, and the thicknesses of the
physical layers. The model is used for
calculating the fundamental frequency of
the resonator as well as the effective kt²
of the device. The vibration characteristics of
the piezoelectric structure can be modeled
as a three-port network with one electrical
input port and two acoustic output ports,
owing to the coupling of electric potential
and mechanical stress that drives the
piezoelectric transducer.
The forces (F) and the particle velocities
(v) at the boundary surfaces of the
resonator are:
F1 = −A·T(−d/2) (6)
F2 = −A·T(d/2) (7)
v1 = v(−d/2) (8)
v2 = v(d/2) (9)

The minus (−) sign reflects the chosen axis
directions; v1 and v2 are the particle
velocities at the material surfaces, and
A, d and T are the area, thickness and
internal stress of the resonator, respectively.

k = ω·(ρm/c^D)^(1/2) = ω/Va (10)
Z0 = (ρm·c^D)^(1/2) = ρm·Va (11)
TF = −Z0·vF (12)
TB = Z0·vB (13)
Va = (c^D/ρm)^(1/2) (14)

where Va is the acoustic wave velocity.
Using the boundary conditions:

v(z) = [−v2·sin(k(z + d/2)) + v1·sin(k(d/2 − z))]/sin(kd) (15)
By evaluating the above equations, the Mason
model of a piezoelectric transducer
(resonator) is obtained, where C0, the
clamped (zero-strain) or static capacitance
of the transducer, is

C0 = ε^S·A/d (16)

and Zc, the acoustic impedance of a
transducer with area A, is

Zc = A·Z0 = A·(ρm·c^D)^(1/2) (17)

The resulting matrix equation can be
represented by the Mason model
equivalent circuit.
Figure.4
As shown in Fig.4, in this equivalent
circuit the electric port of the transformer
represents the conversion of electrical
energy to acoustic energy (or vice versa).
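As a rough numerical check of equations (16) and (17), a short sketch follows. The ZnO relative permittivity, stiffness and density are assumed illustrative values; the 45 µm² area and 1.2 µm thickness come from the simulation section.

```python
def clamped_capacitance(eps_s, area, d):
    """C0 = eps^S * A / d, equation (16)."""
    return eps_s * area / d

def acoustic_port_impedance(rho, c_d, area):
    """Zc = A * Z0 = A * sqrt(rho * c^D), equation (17)."""
    return area * (rho * c_d) ** 0.5

# Assumed values: ZnO relative permittivity ~8.8, stiffness ~2.1e11 Pa,
# density ~5680 kg/m^3; area 45 um^2 and thickness 1.2 um as in the paper.
eps0 = 8.854e-12
c0 = clamped_capacitance(8.8 * eps0, 45e-12, 1.2e-6)
zc = acoustic_port_impedance(5680.0, 2.1e11, 45e-12)
print(c0, zc)  # C0 on the order of a few femtofarads
```

The femtofarad-scale static capacitance is what makes such a small resonator practical at GHz frequencies.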
Why APLAC
With the APLAC circuit
simulation and design tool, any RF or
analog circuit can easily be simulated with
a wide range of analysis methods.
Moreover, optimization, tuning and a
Monte Carlo statistical feature (for design
yield) are available with every analysis
method. With APLAC it is possible to
simulate miniaturized structures and
complex systems easily, which matters because
device models developed for large devices
are inapplicable when nano-scale physical
phenomena come into play.
Simulation Results
First, a ZnO FBAR structure was simulated
in APLAC version 8.1. The FBAR has
top and bottom electrodes of Au and a
membrane layer of Si3N4 for support.
The resonance frequency was then calculated
analytically and compared with the
simulated result; the two agree
approximately.
Simulation of ZnO FBAR
The one-dimensional Mason model and the
basic transmission line theory were used here
to simulate the FBAR, which has
ZnO as the piezoelectric material, Au as
the top and bottom electrodes, and Si3N4
as the membrane material. The circuit
diagram is shown in Fig.5. For the top and
bottom electrodes and the membrane layer
the transmission line model was used, but
for the piezoelectric layer the
one-dimensional Mason model was used. The
results of the simulation are shown in Fig.6
and Fig.7: Fig.7 shows S21 (both
magnitude and phase) and Fig.6 shows S11
(both magnitude and phase). Analyzing
Fig.7, the resonance is clearly visible at
the expected frequency.
The values obtained and used in the
simulation are given in Table.1:

Table.1
Area (FBAR): 45 µm²
Thickness of ZnO: 1.2 µm
S21 Min: -61 dB
S11 Min: -0.3 dB
fp: 2.593 GHz
fs: 2.621 GHz
keff²: 0.026
Q: 15000
FOM: 390
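From the two resonances and Q in Table.1, keff² and FOM can be estimated. The formula below is one common approximation (keff² = (fp² − fs²)/fp², with fs taken as the lower and fp as the higher of the two tabulated frequencies); the paper's exact definition may differ, which would explain the gap between this estimate and the tabulated 0.026.

```python
def keff_squared(f_s, f_p):
    """One common approximation: keff^2 = (fp^2 - fs^2) / fp^2."""
    return (f_p**2 - f_s**2) / f_p**2

def figure_of_merit(keff2, q):
    """FOM = keff^2 * Q."""
    return keff2 * q

# Resonances and Q from Table.1, with fs < fp by convention.
f_s, f_p, q = 2.593e9, 2.621e9, 15000
k2 = keff_squared(f_s, f_p)
print(k2, figure_of_merit(k2, q))  # roughly 0.021 and 320 with this definition
```

The tabulated FOM of 390 is consistent with FOM = keff²·Q using the tabulated keff² of 0.026.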
Figure.5 Circuit Simulated in Aplac
Figure.6 .FBAR Resonator S (1, 1)
Figure.7 .FBAR Resonator S (2, 1)
Figure.8 Smith Chart showing S (2, 1) and
S (1,1)
The influence of different
piezoelectric films and electrode materials
on the characteristics of a thin film bulk
acoustic resonator (FBAR) was also analyzed.
The results confirm that the material
properties and thickness of the piezoelectric
film play a significant role in determining
the performance of the FBAR, and influence
characteristics such as the resonance
frequency, the bandwidth and the insertion
loss. Since the results demonstrate that the
FBAR characteristics are determined by the
thicknesses of each of the layers within the
acoustic wave path and by the resonance
area, the potential exists to tune the
characteristics of the FBAR by specifying
appropriate geometric parameters during
the FBAR design stage.
Effect of using different piezoelectric
material:
For example, using AlN as the piezoelectric
material increases the resonance frequency
from around 2.62 GHz to 4.7 GHz for the
same area and thickness. As depicted
below, the Q factor and FOM of the ZnO FBAR
are higher than those of the AlN FBAR, so
the ZnO device is the better FBAR in terms
of performance. The comparisons are shown
in Table.2. The results of the simulation
are shown in Fig.9, Fig.10 and Fig.11:
Fig.9 shows S21 (both magnitude and phase)
and Fig.10 shows S11 (both magnitude and
phase). Analyzing Fig.9, the resonance is
clearly visible at the expected frequency.
Table.2
[Plot data for Figures 6-8: ZnO FBAR, area 45 µm², d = 1.2 µm, simulated in APLAC 8.10 over 1.5-3.0 GHz; magnitude (dB) and phase of S(1,1) and S(2,1), plus a Smith chart of Im(S(1,1)) and Im(S(2,1)).]
Figure.9 AlN FBAR Resonator S21
Figure.10 AlN FBAR Resonator S11
Figure.11 Smith Chart showing S (2, 1)
and S(1,1)
Conclusion
The results show that the resonant frequency
of the FBAR depends upon the particular
choice of piezoelectric material. It was also
demonstrated that the FBAR performance
is influenced by the physical dimensions
of the device, including the thicknesses of
the piezoelectric film, electrodes and
membrane layer, and by the resonance area
size. It is possible to calculate the effective
coupling coefficient, Q factor and figure of
merit, and in this way to specify
suitable parameter values which
optimize the design of the FBAR and
which can be used in designing FBAR
devices that will operate within a specified
frequency range.
References
(1) K.M. Lakin, G.R. Kline and K.T. McCarron, "High-Q microwave acoustic resonators and filters," IEEE Transactions on Microwave Theory and Techniques, vol. 41.
(2) S.V. Krishnaswamy, J. Rosenbaum, S. Horwitz, C. Vale and R.A. Moore, "Film bulk acoustic wave resonator technology," Proceedings of the IEEE Ultrasonics Symposium, Honolulu, HI, USA, 1990.
(3) P.J. Yoon, G.W., "Fabrication of ZnO-based film bulk acoustic resonator devices using W/SiO2 multilayer reflector," Electronics Letters, vol. 36 (16).
(4) K.M. Lakin and J.S. Wang, "UHF composite bulk wave resonator," Ultrasonics Symposium, 1990.
(5) W.P. Mason, Physical Acoustics: Principles and Methods, Vol. 1A, Academic Press, New York.
(6) G.G. Fattinger, J. Kaitila, R. Aigner and W. Nessler, "Single-to-balanced filters for mobile phones using coupled resonator BAW technology," IEEE International Ultrasonics, Ferroelectrics and Frequency Control Symposium, 2004.
(7) K.M. Lakin, "Thin film resonator technologies," IEEE Trans. UFFC, vol. 52, pp. 707-716, May 2005.
(8) F. Constantinescu, M. Nitescu and A.G. Gheorghe, "New circuit models for power BAW resonators," in Proc. ICCSC, Shanghai, China, pp. 176-179, 2008.
[Plot data for Figures 9-11: AlN FBAR, area 45 µm², d = 1.2 µm, simulated in APLAC 8.10 over 3.5-5.0 GHz; magnitude (dB) and phase of S(2,1) and S(1,1), plus a Smith chart of Im(S(1,1)) and Im(S(2,1)).]
Role of Speech Scrambling and Encryption in Secure Voice Communication
Himanshu Gupta Faculty Member, Amity Institute of Information
Technology, Amity University Campus,
Sector – 125, Noida (Uttar Pradesh), India.
E-mail: [email protected]
Prof. (Dr.) Vinod Kumar Sharma
Professor & Dean, Faculty of Technology,
Gurukula Kangri Vishwavidyalaya,
Haridwar, India
E-mail: [email protected]
Abstract— Security of speech is a challenging
issue in voice communications today, requiring
speech scrambling and encryption techniques. With
the rapid development of information technology, the
demand for secure transmission of voice over
wireless communication channels is increasing day
by day. The conventional methods of voice
communication cannot provide adequate security
against intruders, and voice data may be accessed by
unauthorized users for malicious purposes.
Therefore, it is necessary to apply effective
scrambling and encryption techniques to enhance
voice security; speech scrambling and
encryption can provide sufficient
security over wireless media. In this research
paper, various effective speech scrambling and
encryption techniques are proposed, in which the
original speech is inverted and encrypted with
different strong scrambling and encryption methods.
These techniques enhance the security of voice over
insecure communication channels to a large extent.

Keywords—Speech Scrambling; Speech
Encryption; Secure Voice; Communication
Channel.
I. INTRODUCTION
A secure voice communication is a process that
allows for the secure transmission of voice
communications between a sending and a
receiving node over wireless communication
channel. This process uses various scrambling and
encryption techniques which are capable of
inversion and encryption of speech in effective
manner.
When two entities are communicating with each
other, and they do not want a third party to listen
to their communication, then they want to pass on
their message in such a way that nobody else
could understand their message. This is known as
communicating in a secure manner or Secure
Communication.
Secure communication includes the means by which
people can share information with varying
degrees of certainty that third parties cannot know
what was said. Other than communication spoken
face to face, out of any possibility of
eavesdropping, it is
probably safe to say that no communication is
guaranteed secure in this sense, although practical
limitations such as legislation, resources, technical
issues such as interception, and the sheer volume
of communication are limiting factors to
surveillance.
II. BACKGROUND
The implementation of voice encryption dates
back to World War II when secure
communication was paramount to the US armed
forces. During that time, noise was simply added
to a voice signal to prevent enemies from listening
to the conversations. Noise was added by playing
a record of noise in synch with the voice signal
and when the voice signal reached the receiver,
the noise signal was subtracted out, leaving the
original voice signal. In order to subtract out the
noise, the receiver needed to have the exact same
noise signal, and the noise records were only made
in pairs; one for the transmitter and one for the
receiver. Having only two copies of records made
it impossible for the wrong receiver to decrypt the
signal. To implement the system, the army
contracted Bell Laboratories and they developed a
system called SIGSALY. With SIGSALY, ten
channels were used to sample the frequency
spectrum from 250 Hz to 3 kHz and two channels
were allocated to sample voice pitch and
background hiss. In the time of SIGSALY, the
transistor had not been developed and the digital
sampling was done by circuits using the model
2051 Thyratron vacuum tube. Each SIGSALY
terminal used 40 racks of equipment weighing 55
tons and filled a large room. This equipment
included radio transmitters and receivers and large
phonograph turntables. The voice was keyed to
two 16-inch vinyl phonograph records that
contained a Frequency Shift Keying (FSK) audio
tone. The records were played on large precise
turntables in synch with the voice transmission[1].
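The add-then-subtract scheme described above can be illustrated with a toy digital sketch. This is purely illustrative (SIGSALY itself was an analog system, and the sample values here are invented): both ends hold an identical noise record, the transmitter adds it sample by sample, and the receiver subtracts it to recover the voice.

```python
import random

def make_noise_record(length, seed):
    """Both ends hold an identical noise record (the paired discs)."""
    rng = random.Random(seed)
    return [rng.randint(-1000, 1000) for _ in range(length)]

voice = [12, -40, 533, -7, 250]                          # toy voice samples
noise = make_noise_record(len(voice), seed=42)

transmitted = [v + n for v, n in zip(voice, noise)]      # transmitter adds noise
recovered = [t - n for t, n in zip(transmitted, noise)]  # receiver subtracts it
print(recovered == voice)  # True
```

Without the matching noise record, the transmitted samples carry no usable structure, which is why making only two copies of each record was the system's security guarantee.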
From the introduction of voice encryption to
today, encryption techniques have evolved
drastically. Digital technology has effectively
replaced old analog methods of voice encryption
and by using complex algorithms; voice
encryption has become much more secure and
efficient. One relatively modern voice encryption
method is Sub-band coding. With Sub-band
Coding, the voice signal is split into multiple
frequency bands, using multiple bandpass filters
that cover specific frequency ranges of interest.
The output signals from the bandpass filters are
then lowpass translated to reduce the bandwidth,
which reduces the sampling rate. The lowpass
signals are then quantized and encoded using
special techniques like Pulse Code
Modulation (PCM). After the encoding stage, the
signals are multiplexed and sent out along the
communication network. When the signal reaches
the receiver, the inverse operations are applied to
the signal to get it back to its original state.
Motorola developed a voice encryption system
called Digital Voice Protection (DVP) as part of
their first generation of voice encryption
techniques. "DVP uses a self-synchronizing
encryption technique known as cipher feedback
(CFB). The basic DVP algorithm is capable of
2.36 x 1021
different "keys" based on a key length
of 32 bits." The extremely high amount of
possible keys associated with the early DVP
algorithm, makes the algorithm very robust and
gives the user a high level of security. As with any
voice encryption system, the encryption key is
required to decrypt the signal with a special
decryption algorithm[2].
III. OVERVIEW OF THE PROPOSED SPEECH
SCRAMBLING TECHNIQUE
Speech inversion is a very common method of
speech scrambling, probably because it is the
cheapest. It works by taking a
signal and turning it 'inside out', reversing the
signal around a pre-set frequency. Speech
inversion can be broken down into three types:
base-band inversion (also called 'phase
inversion'), variable-band inversion (or 'rolling
phase inversion') and split-band inversion. Images
are used to help clarify what the different
inversion systems do.
Fig 1: The non-scrambled sound wave
Base-band inversion inverts the signal around
a pre-set frequency that never changes, and for
that reason it offers essentially no security:
because the inverting frequency never changes,
running the signal through another inverter set
to the same frequency unscrambles it. Descrambling
base-band inversion is simple: take the scrambled
input and re-invert it around the same inversion
point used to scramble it.
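In the digital domain, a simple stand-in for base-band inversion is spectral inversion: multiplying alternate samples by −1 mirrors the spectrum about half the sampling rate, and applying the same fixed inversion twice restores the signal, exactly as in the descrambling just described. A minimal sketch (the tone and sampling rate are illustrative):

```python
import math

def invert_spectrum(samples):
    """Multiply alternate samples by -1: mirrors the spectrum about
    half the sampling rate (a digital stand-in for phase inversion)."""
    return [s if i % 2 == 0 else -s for i, s in enumerate(samples)]

fs = 8000
tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(64)]
scrambled = invert_spectrum(tone)         # the 1 kHz tone now sits at 3 kHz
descrambled = invert_spectrum(scrambled)  # the same fixed inversion undoes it
print(descrambled == tone)  # True
```

The fact that one fixed, public operation both scrambles and descrambles is precisely why base-band inversion provides no real security.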
Fig 2: Base Band Inversion of Sound wave
Variable-band inversion inverts the signal around
a constantly varying frequency, making
unauthorized decryption possible but unlikely.
Variable-band inversion can be identified by the
burst of modem noise at the beginning of the
transmission (a 1200 bps carrier) and the
repeated clicking sounds as the inverting
frequency changes. Descrambling variable-band
inversion would be a chore for the amateur
eavesdropper, as the inversion point changes
every fraction of a second; professionals, however,
would likely have little trouble extracting clear
speech.
Split-band inversion is another method for making
inversion more secure. It divides the signal into
two frequency bands and inverts them (usually
base-band) separately. Some split-band systems
provide enhanced security by randomly changing,
at given intervals, the frequency at which the
signal is split.
Fig 3: Split Base Band Inversion of Sound Wave
IV. OVERVIEW OF THE PROPOSED SPEECH
ENCRYPTION TECHNIQUE
Encryption is a much stronger method of
protecting speech communications than any form
of scrambling. Voice encryption works by
digitizing the conversation at the telephone and
applying a cryptographic technique to the
resulting bit-stream. In order to decrypt the
speech, the correct encryption method and key
must be used [3]. For speech or voice encryption,
any one of the following encryption
methods can be used.
(A) Hardware Based Encryption Systems
Hard encryption systems are voice encryption
schemes that utilize hardware to encrypt
conversations. Hardware encryption devices are useful
because they do not need a computer to work
(allowing them to be built into things like radios
and cellular phones), are usually more secure, and
are simpler to use. On the downside, hardware
encryption systems are very expensive and can be
hard to acquire.
(B) Software Based Encryption Systems
Soft encryption systems are exactly what they
sound like, software based encryption. While the
inconvenience of having to use a computer is the
primary drawback to soft voice encryption, most
of the available programs use good crypto and are
free.
(C) Digital Voice Protection
Digital Voice Protection (DVP) is a proprietary
speech encryption technique used by Motorola for
their higher-end secure communications products.
DVP is considered to be very secure.
(D) PGPFone
PGPfone is another offering from Pretty Good
Privacy Inc., a secure voice program for the PC.
The interface is pleasantly intuitive, and there are
options for different encoders and decoders (for
either cellphone or landline use). PGPfone offers
a selection of encryption schemes: 128 bit CAST
key (a DES-like crypto system), 168 bit Triple-
DES key (estimated key strength is 112 bits) or
192 bit Blowfish key (unknown estimated key
strength).
(E) Nautilus
Nautilus is a free secure communications
program. It lacks many of the features of other
communications programs, and its interface is
best described as user-hostile. Unlike most other
voice encryption programs, Nautilus uses a
proprietary algorithm with a session key negotiated via the
Diffie-Hellman key exchange.
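Diffie-Hellman key negotiation, as cited above for Nautilus, lets two parties derive a shared secret without ever transmitting it. The sketch below uses a small demonstration prime for readability; it is an illustrative assumption, not Nautilus's actual group parameters, and real deployments use much larger standardized groups.

```python
import secrets

# Public parameters: prime modulus p and generator g.
# p = 2**32 - 5 is prime but far too small for real use (demo only).
p = 0xFFFFFFFB
g = 5

# Each party picks a private exponent and publishes g^x mod p.
a = secrets.randbelow(p - 2) + 1    # Alice's private value
b = secrets.randbelow(p - 2) + 1    # Bob's private value
A = pow(g, a, p)                    # sent over the insecure channel
B = pow(g, b, p)                    # sent over the insecure channel

# Both sides compute the same shared secret from the other's public value.
shared_alice = pow(B, a, p)
shared_bob = pow(A, b, p)
assert shared_alice == shared_bob
```

An eavesdropper who sees only `A`, `B`, `p`, and `g` must solve the discrete logarithm problem to recover the session key.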
(F) Speak Freely
Speak Freely is a versatile, simple voice
encryption system. Speak Freely offers a selection
of voice encryption techniques (IDEA or DES).
Speak Freely also permits conferencing and
contains several other useful functions. Unlike
most voice encryption platforms, Speak Freely
includes options that allow it to connect to other
encrypting and non-encrypting internet
telephones.
(G) SEU-8201 Cipher system
The SEU-8201 is a high-security voice ciphering
system intended mainly for authorities,
governmental agencies, police, and military or
paramilitary users. Its ciphering algorithm is a new
approach that provides the high security
required by such user groups. From a practical standpoint,
it is not susceptible to attack by eavesdroppers
using current cryptanalytic methods [4].
Fig 4: SEU-8201 Voice Encryption System
V. CONCLUSION
Speech scrambling and encryption are important techniques for voice security and play a significant role in the field of voice communication. They enhance the security of voice communication through a large number of complex operations that convert the original sound wave into a scrambled format, which is very difficult for any unauthorized third party to convert back to the original. The advantage of speech scrambling and encryption is that even if the transmitted wave is intercepted by an intruder, the confidentiality of the original wave can still be maintained. The study of speech scrambling and encryption techniques aims to enhance the potential of upcoming communication technologies and their implications for defense and government users. The implementation of voice scrambling and encryption is a strong and positive move toward defining a standard for secure voice communication. However, as the amount of confidential voice traffic over insecure wireless channels increases, speech scrambling and encryption must also be continually reviewed from a security perspective.
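The scrambling side of the comparison drawn in the conclusion, rearranging the wave itself rather than enciphering a bit-stream, can be illustrated with a keyed time-segment permutation. The fixed frame size and seeded shuffle below are illustrative assumptions; practical scramblers also manipulate the frequency domain.

```python
import random

def scramble(samples: list, frame: int, seed: int) -> list:
    """Permute fixed-size frames of the signal with a keyed shuffle."""
    frames = [samples[i:i + frame] for i in range(0, len(samples), frame)]
    order = list(range(len(frames)))
    random.Random(seed).shuffle(order)   # the seed acts as the key
    return [x for idx in order for x in frames[idx]]

def descramble(samples: list, frame: int, seed: int) -> list:
    """Invert the permutation by regenerating the same keyed order."""
    frames = [samples[i:i + frame] for i in range(0, len(samples), frame)]
    order = list(range(len(frames)))
    random.Random(seed).shuffle(order)
    out = [None] * len(frames)
    for pos, idx in enumerate(order):
        out[idx] = frames[pos]           # frame at pos came from index idx
    return [x for f in out for x in f]

signal = list(range(12))                 # toy audio samples
mixed = scramble(signal, frame=3, seed=7)
assert descramble(mixed, frame=3, seed=7) == signal
```

Unlike encryption, such a scrambled wave still contains the original segments, which is why the conclusion treats encryption of the digitized stream as the stronger protection.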
VI. REFERENCES
1. "SIGSALY," http://history.sandiego.edu/gen/recording/sigsaly.html
2. Owens, F. J. (1993). Signal Processing of Speech. Houndmills: MacMillan Press. ISBN 0333519221.
3. http://seussbeta.tripod.com/crypt.html#SCRAMBLE
4. SEU-8201 product page, http://vhf-encryption.at-communication.com/en/secure/seu_8201.html
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011