MEMS Based MicroResonator Design & Simulation Based on Comb-Drive Structure

Mr. Prashant Gupta
[email protected]
Ideal Institute of Technology, Ghaziabad
Abstract:- Resonators serve as essential components in radio-frequency (RF) electronics, forming the backbone of filters and tuned amplifiers. However, traditional solid-state or mechanical implementations of resonators and filters tend to be bulky and power-hungry, limiting the versatility of communications, guidance, and avionics systems. Micro-Electro-Mechanical Systems (MEMS) are promising replacements for traditional RF circuit components. In this paper we discuss the MEMS resonator, one of the most versatile components in RF circuits, based on a promising architecture known as the comb-drive structure.

Introduction: A resonator is a device or system that exhibits resonance or resonant behavior; that is, it naturally oscillates at some frequencies, called its resonant frequencies, with greater amplitude than at others. The oscillations in a resonator can be either electromagnetic or mechanical (including acoustic). Resonators are used either to generate waves of specific frequencies or to select specific frequencies from a signal. A physical system can have as many resonant frequencies as it has degrees of freedom; each degree of freedom can vibrate as a harmonic oscillator. Systems with one degree of freedom, such as a mass on a spring, pendulums, balance wheels, and LC tuned circuits, have one resonant frequency. Systems with two degrees of freedom, such as coupled pendulums and resonant transformers, can have two resonant frequencies. The vibrations in them travel through the coupled harmonic oscillators in waves, from one oscillator to the next. Resonators can be viewed as being made of millions of coupled moving parts (such as atoms); therefore they can have millions of resonant frequencies, although only a few may be used in practical resonators. The vibrations inside them travel as waves, at an approximately constant velocity, bouncing back and forth between the sides of the resonator. The oppositely moving waves interfere with each other to create a pattern of standing waves in the resonator. If the distance between the sides is L, the length of a round trip is 2L. In order to cause resonance, the phase of a sinusoidal wave after a round trip has to be equal to the initial phase, so the waves reinforce. The condition for resonance in a resonator is therefore that the round-trip distance, 2L, be equal to an integral number N of wavelengths λ of the wave:

2L = Nλ,  N = 1, 2, 3, …

If the velocity of the wave is v, the frequency is f = v/λ, so the resonance frequencies are:

f = Nv / 2L

So the resonant frequencies of resonators, called normal modes, are equally spaced multiples (harmonics) of a lowest frequency called the fundamental frequency. The above analysis assumes the medium inside the resonator is homogeneous, so the waves travel at a constant speed, and that the shape of the resonator is rectilinear. If the resonator is inhomogeneous or has a non-rectilinear shape, like a circular drumhead or a cylindrical microwave cavity, the resonant frequencies may not occur at equally spaced multiples of the fundamental frequency; they are then called overtones instead of harmonics. There may be several such series of resonant frequencies in a single resonator, corresponding to different modes of vibration.
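Since the normal modes of a uniform resonator are integer multiples of the fundamental, f_n = n·v/2L, they are easy to tabulate. The following is a minimal sketch; the wave speed and cavity length are illustrative values, not taken from this paper:

```python
# Normal-mode frequencies of a 1-D resonator of length L_cav:
# f_n = n * v / (2 * L_cav). Values below are illustrative only.
def mode_frequencies(v, L_cav, n_modes):
    """Return the first n_modes resonant frequencies (Hz) for wave
    speed v (m/s) and resonator length L_cav (m)."""
    return [n * v / (2.0 * L_cav) for n in range(1, n_modes + 1)]

# Example: a 0.5 m air column (v ~ 343 m/s); harmonics are equally
# spaced multiples of the fundamental.
freqs = mode_frequencies(v=343.0, L_cav=0.5, n_modes=3)
```
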
MEMS Resonators:- Mechanical resonators are highly sensitive probes for physical or chemical parameters which alter their potential or kinetic energy [1, 2]. Silicon resonant microsensors for measurement of pressure, acceleration, and vapor concentration have been demonstrated. Recently, polysilicon micromechanical structures have been resonated electrostatically parallel to the plane of the substrate by means of one or more interdigitated capacitors (electrostatic combs). Some advantages of this approach are (1) less damping on the structure, leading to higher quality factors, (2) linearity of the electrostatic comb drive, and (3) flexibility in the design of the suspension for the resonator. For example, folded-beam suspensions can be fabricated without increased process complexity, which is attractive for releasing residual strain and for achieving large-amplitude vibrations. There are different types of resonators; here we focus only on vibrating resonators:
• Lateral movement - parallel to the substrate - e.g., the folded-beam comb structure.
• Vertical movement - perpendicular to the substrate - e.g., the clamped-clamped beam (c-c beam) and the free-free beam (f-f beam).

Example of a simple resonator: mass and spring. This resonator is used by many physicists as the elemental simple mechanical resonator, to explain the properties of more complex resonances and resonators. The governing homogeneous differential equation is

m·d²y/dt² + R·dy/dt + k·y = 0

for vertical displacement y from the equilibrium position, mass m, spring constant k = f/y, and damping coefficient R. The angular resonant frequency is given by

ω₀ = sqrt(k/m)
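The mass-spring resonator above can be evaluated numerically. This sketch assumes the standard second-order model (undamped ω₀ = sqrt(k/m); with damping R the ring-down frequency shifts slightly); the mass and stiffness values are illustrative only:

```python
import math

# Resonant frequencies of the mass-spring-damper model above.
# Undamped: w0 = sqrt(k/m).  With damping coefficient R, the damped
# oscillation frequency is wd = sqrt(k/m - (R/(2m))**2).
def resonant_frequencies(m, k, R=0.0):
    w0 = math.sqrt(k / m)
    wd = math.sqrt(k / m - (R / (2.0 * m)) ** 2)
    return w0, wd

# Illustrative values: 1 ng proof mass, 1 N/m spring, no damping.
w0, wd = resonant_frequencies(m=1e-9, k=1.0)
f0 = w0 / (2.0 * math.pi)  # resonant frequency in Hz
```
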
Folded-Flexure Comb-Drive Microresonator:- In the design of a resonator, the spring constant plays a vital role. Different types of spring designs have been applied in comb-drive actuators: (1) clamped-clamped beams, (2) the crab-leg flexure, and (3) the folded-beam flexure. Among these, the folded-beam structure is the most widely used for microresonator design. The folded-flexure electrostatic comb-drive micromechanical resonator shown in Figure 1 was first introduced by Tang [4, 5, 6]. This device has been well-researched and is commonly used for MEMS process characterization. The microresonator consists of a movable central shuttle mass which is suspended by folded-flexure springs on either side; the other ends of the folded-flexure springs are fixed to the lower layer. The microresonator can be thought of as a spring-mass-damper system, the damping being provided by the air below and above the movable part. By applying a voltage across the fixed and movable comb fingers, an electrostatic force is produced which sets the mass into motion in the x-direction. The microresonator has been used in building filters, oscillators, and resonant positioning systems. Figure 1 shows the overhead view of a µresonator which utilizes interdigitated-comb finger transduction in a typical bias and excitation configuration. The resonator consists of a finger-supporting shuttle mass suspended above the substrate by folded flexures, which are anchored to the substrate at two central points. The shuttle mass is free to move in the direction
indicated, parallel to the plane of the silicon substrate. Folding the suspending beams as shown provides two main advantages: first, post-fabrication residual stress is relieved if all beams expand or contract by the same amount; and second, spring stiffening nonlinearity in the suspension is reduced, since the folding truss is free to move in a direction perpendicular to the resonator motion. The black areas are the places where the polysilicon structure is anchored to the bottom layer.
Fig.1 Layout of the lateral folded-flexure comb-drive microresonator

Modeling the Oscillation Modes of the Microresonator:- The preferred direction of motion of the microresonator is the x-direction. However, the microresonator structure can vibrate in other modes: there are three translational modes along x, y and z, three rotational modes about x, y and z, and oscillation modes due to the movement of the folded-flexure beams and the comb drive. Each oscillation mode is described by a lumped second-order equation of motion. For any generalized displacement ζ, we can write:

m_ζ·d²ζ/dt² + B_ζ·dζ/dt + k_ζ·ζ = F_e,ζ

where F_e,ζ is the external force (in the x-mode this force is generated by the comb drives), m_ζ is the effective mass, B_ζ is the damping coefficient, and k_ζ is the spring constant. The fundamental frequency of the structure can be obtained from Rayleigh's quotient. The fundamental resonance frequency of this mechanical resonator is, again, determined largely by material properties and by geometry, and is given by the expression

f_r = (1/2π)·sqrt[ 2Eh(W/L)³ / (M_p + (1/4)M_t + (12/35)M_b) ]
where M_p is the shuttle mass, M_t is the mass of the folding trusses, M_b is the total mass of the suspending beams, E is the Young's modulus, W and h are the cross-sectional width and thickness, respectively, of the suspending beams, and L is indicated in Fig. 1. The expression for the damping coefficient, modeling Couette flow beneath the structure, Stokes flow above it, and Couette flow between the comb fingers, is approximately

B ≈ µ·[ (A_s + A_t + A_b)(1/d + 1/δ) + A_c/g ]

where µ is the viscosity of air, d is the fixed spacer gap between the ground plane and the bottom surface of the comb fingers, δ is the penetration depth of airflow above the structure, g is the gap between comb fingers, and A_s, A_t, A_b, and A_c are layout areas for the shuttle, truss beams, flexure beams, and comb-finger sidewalls, respectively.
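As a rough numerical check of the fundamental-frequency expression above, the following sketch evaluates f_r for an assumed polysilicon flexure; all material and geometry values are illustrative, not the paper's design values:

```python
import math

# Fundamental frequency of the folded-flexure resonator, following the
# Rayleigh-quotient expression in the text:
#   f_r = (1/2pi) * sqrt(2*E*h*(W/L)**3 / (Mp + Mt/4 + (12/35)*Mb))
def folded_flexure_fr(E, h, W, L, Mp, Mt, Mb):
    k_sys = 2.0 * E * h * (W / L) ** 3            # system spring constant (N/m)
    m_eff = Mp + 0.25 * Mt + (12.0 / 35.0) * Mb   # effective (dynamic) mass (kg)
    return math.sqrt(k_sys / m_eff) / (2.0 * math.pi)

# Illustrative numbers: polysilicon E ~ 150 GPa, 2 um film,
# 2 um wide / 100 um long beams, tens-of-picogram masses.
fr = folded_flexure_fr(E=150e9, h=2e-6, W=2e-6, L=100e-6,
                       Mp=5e-11, Mt=1e-11, Mb=2e-11)
```

With these assumed values the result lands in the tens of kilohertz, the same order as the measured frequencies reported in Table 2.
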
Working Principle:- To bias and excite the device, a dc-bias voltage V_P is applied to the resonator and its underlying ground plane, while an ac excitation voltage is applied to one (or more) drive electrodes. A specific resonance mode may be emphasized by using multiple drive electrodes, placing them at the displacement maxima of the desired mode, and applying properly phased drive signals to the electrodes. To avoid unnecessary notational complexity, however, we focus on the case of fundamental-mode resonance in the present discussion. We also assume that the electrodes are concentrated at the center of the beam and that the beam length is much greater than the electrode lengths. This allows us to neglect beam displacement variations across the lengths of the electrodes due to the beam's mode shape (i.e., we may assume that x(y) ≈ x for y near the center of the beam). A more rigorous analysis which accounts for all of these effects is certainly possible, but obscures the main points. When an ac excitation with frequency close to the fundamental resonance frequency of the µresonator is applied, the µresonator begins to oscillate, creating a time-varying capacitance between the µresonator and the electrodes. Since the dc bias V_Pn = V_P - V_n is effectively applied across the time-varying capacitance at port n, a motional output current arises at port n. For this resonator design, the transducer capacitors consist of overlap capacitance between the interdigitated shuttle and electrode fingers. As the shuttle moves, these capacitors vary linearly with displacement. Thus, ∂C_n/∂x is a constant, given approximately by the expression

∂C_n/∂x = α·N_g·ε₀·h / d
where N_g is the number of finger gaps, h is the film thickness, d is the gap between electrode and resonator fingers, and α is a constant that models additional capacitance due to fringing electric fields; for comb geometries, α = 1.2. Note that, again, ∂C_n/∂x is inversely proportional to the gap distance.

Linear equations for the spring constants are derived using energy methods. A force (or moment) is applied to the free end(s) of the spring in the direction of interest, and the displacement is calculated symbolically (as a function of the design variables and the applied force). In these calculations, different boundary conditions are applied for the different modes of deformation of the spring. When forces (moments) are applied at the end-points of the flexure, the total energy of deformation, U, is calculated as:

U = Σ_i ∫₀^{L_i} [ M_i(ξ)²/(2·E·I_i) + T_i(ξ)²/(2·G·J_i) ] dξ

where L_i is the length of the i'th beam in the flexure, M_i is the bending moment transmitted through beam i, E is the Young's modulus of the material of the beam (polysilicon, in our case), I_i is the moment of inertia of beam i about the relevant axis, T_i is the torsion transmitted through beam i, G is the shear modulus, J_i is the torsion constant of beam i, and ξ is the variable along the length of the beam. The bending moment and the torsion are linear functions of the forces and moments applied to the end-points of the flexure. By Castigliano's second theorem, the displacement of an end-point of the flexure in any direction ζ is given as:

δ_ζ = ∂U/∂F_ζ

where F_ζ is the force applied in that direction at that end-point. Similarly, angular displacements can be related to applied moments. Our aim here is to obtain the displacement in the direction of interest as a function of the applied force in that direction. Applying the boundary conditions, we obtain a set of linear
equations in terms of the applied forces and moments and the unknown displacement. Solving the set of equations yields a linear relationship between the displacement and the applied force in the direction of interest. The constant of proportionality gives the spring constant as a function of the physical dimensions of the flexure. The effect of spring mass on the resonance frequency is incorporated in effective masses for each lateral mode. The effective mass for each mode of interest is calculated by normalizing the total maximum kinetic energy of the spring by the maximum shuttle velocity, V_max:

m_eff = Σ_i (m_i/L_i) ∫₀^{L_i} [v_i(ξ)/V_max]² dξ

where m_i and L_i are the mass and length of the i'th beam in the flexure. Analytic expressions for the velocities, v_i, along the flexure's beams are approximated from static deformation shapes, and are found from the spring-constant derivations.

Design Variables:- Fifteen design variables are identified for the µresonator. The design variables are listed in Table I and shown in Fig. 2. These include 13 geometrical parameters (shown in Fig. 2), the number of fingers in the comb drive, N, and the effective voltage, V, applied to the comb drive.
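The comb-drive transduction described earlier (a displacement-independent ∂C/∂x, hence a drive force F = ½·V²·∂C/∂x that does not depend on shuttle position) can be sketched numerically. The geometry below is illustrative, not the fabricated design:

```python
# Comb-drive transduction sketch: dC/dx = alpha * Ng * eps0 * h / d is
# constant in x, so the electrostatic force is F = 0.5 * V**2 * dC/dx.
EPS0 = 8.854e-12  # permittivity of free space, F/m

def comb_dCdx(Ng, h, d, alpha=1.2):
    """Capacitance gradient (F/m) of an interdigitated comb with Ng
    finger gaps, film thickness h, finger gap d; alpha models fringing."""
    return alpha * Ng * EPS0 * h / d

def comb_force(V, Ng, h, d):
    """Electrostatic drive force (N) for effective voltage V."""
    return 0.5 * V ** 2 * comb_dCdx(Ng, h, d)

# Illustrative values: 24 finger gaps, 2 um film, 2 um gaps, 20 V drive.
F = comb_force(V=20.0, Ng=24, h=2e-6, d=2e-6)
```

Because the force does not depend on x, the drive is linear, which is one of the advantages of the comb structure noted earlier.
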
Fig.2 Dimensions of the microresonator elements: (a) shuttle mass, (b) folded flexure, (c) comb drive with N movable 'rotor' fingers, (d) close-up view of comb fingers.
The displacement as a function of the driving voltage was measured while applying a dc voltage between the rotor (movable set) and the stator (stationary set).
Table 1: Design and style variables for the microresonator. Upper and lower bounds are in units of µm, except for N and V.

Quality Factor (Q):- The quality factor describes how underdamped an oscillator or resonator is; a higher Q indicates a lower rate of energy loss relative to the stored energy. For the x-direction mode of the second-order model above,

Q = sqrt(k·m) / B

where m is the mass, k is the spring constant, and B is the damping coefficient.
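A small numerical sketch of the Q expression above; the parameter values are illustrative, not measured values from this device:

```python
import math

# Quality factor of a second-order resonator: Q = sqrt(k*m) / B.
# Higher Q means lower energy loss per cycle relative to stored energy.
def quality_factor(m, k, B):
    return math.sqrt(k * m) / B

# Illustrative values consistent in order of magnitude with a surface-
# micromachined resonator operating in air.
Q = quality_factor(m=5.9e-11, k=4.8, B=1e-7)
```
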
Simulation Process:- Steps for the IntelliSuite simulator:
1. Design the appropriate mask or masks for your design in IntelliMask.
2. Fabricate the device using IntelliFab and visualize it.
3. Perform different types of analysis (static or frequency) with the help of TEM.
4. Get the results.
Fig.3: MEMS microresonator mask structure using IntelliMask
Fig.4: MEMS microresonator process flow using IntelliFab
Fig.5: MEMS microresonator TEM structure using TEM Analysis
Fig.6: MEMS microresonator pressure distribution
Fig.7: MEMS microresonator charge distribution
*Capacitance Report
Number of conductors: 2
Capacitance matrix (1e-6 nanofarads × 1e-6):
C11 = 9.334000   C12 = -1.037000
C21 = -1.037000  C22 = 2.767000

*Natural Frequency Report (unit: Hz; 6 modes requested)
Mode 1: 23347.1 (natural/resonant frequency)
Mode 2: 39248.8
Mode 3: 40138
Mode 4: 51.6151
Mode 5: 70.8529

Resonator Simulation Results:- With the help of the simulation process we obtain the resonant frequency for different parameter sets. We can also find the displacement, pressure distribution, charge distribution, stress, linear motion, etc. The pressure and charge distributions are shown in the figures above.

Table 2: Calculated and measured resonant frequencies of a set of comb-drive structures

S.No. | No. of Fingers | Finger Length (µm) | Finger Width (µm) | Gap (µm) | Calculated (kHz) | Measured (kHz)
1 | 12 | 20 | 2 | 2 | 23.4 | 22.8
2 | 12 | 30 | 2 | 2 | 22.6 | 22.1
3 | 12 | 40 | 2 | 2 | 21.9 | 22
4 | 12 | 50 | 2 | 2 | 21.3 | 21.2
5 | 12 | 40 | 3 | 2 | 20.4 | 20.3
6 | 12 | 40 | 4 | 2 | 19.1 | 19.1
Conclusion and Future Work:- In this project we designed and simulated a microresonator based on the comb-drive structure introduced by Tang, and calculated its resonance frequency for different geometry parameters. There are two types of constraints in the comb-drive structure (1 - geometric and 2 - functional) which we have not discussed here; they are left for future work. The project can be extended in a number of directions. Manufacturing variations need to be incorporated for accurate synthesis results. Fabrication of the MEMS resonator is also a significant issue which we have not discussed in this work and leave for future work. The spring can also be designed in different styles, which is likewise left for future work. After designing and calculating the resonance frequency for different shapes, we carried out the simulation process and obtained the results shown in the table. From all this work, I would like to conclude the following points. To achieve a high resonance frequency:
• the total spring constant should increase, or the dynamic mass should decrease (difficult, since a given number of fingers is needed for electrostatic actuation);
• k and m depend on material choice, layout, and dimensions;
• the frequency can be increased by using a material with a larger stiffness-to-mass ratio than Si.
Acknowledgements: This research work was carried out at CARE, IIT Delhi under the supervision of Prof. Sudhir Chandra, CARE, IIT Delhi. I am also grateful to my college Director, Dr. G. P. Govil, and my Head of the Department, Mr. N. P. Gupta, for their kind-hearted support and motivation during the research work.

References:
1. S. M. Sze, Semiconductor Sensors, John Wiley & Sons Inc., New York, 1994
2. Ljubisa Ristic, “Sensor Technology and Devices”, Artech House ISBN 0-89006-532-2, 1994
3. G.K. Fedder and T. Mukherjee, "Automated Optimal Synthesis of Microresonators," Proc 9th Intl. Conf on Solid-State Sensors and Actuators (Transducers ’97), Chicago, IL, June 16-19, 1997.
4. W.C. Tang, T.-C. H. Nguyen, M. W. Judy, and R. T. Howe, "Electrostatic Comb Drive of Lateral Polysilicon Resonators," Sensors and Actuators A, 21 (1990) 328-31.
5. X. Zhang and W. C. Tang, "Viscous Air Damping in Laterally Driven Microresonators," Sensors and Materials, v. 7, no. 6, 1995, pp.415-430.
6. W. C. Tang, T.-C. H. Nguyen, and R. T. Howe, "Laterally driven polysilicon resonant microstructures," IEEE Micro Electro Mechanical Systems Workshop, Salt Lake City, UT, USA, Feb. 20-22, 1989, pp. 53-59.
7. C.T.C. Nguyen, MTT-S 1999 (http://www.eecs.umich.edu/~ctnguyen/mtt99.pdf)
8. Andrew Potter, “Fabrication and Modeling of Piezoelectric RF MEMS Resonators”, Department of Physics and Division Engineering – Brown University
9. Roger T. Howe, "Applications of Silicon Micromachining to Resonator Fabrication," 1994 IEEE International Frequency Control Symposium.
10. Clark T. C. Nguyen, "Frequency-Selective MEMS for Miniaturized Communication Devices," 1998 IEEE Aerospace Conference, vol. 1, Snowmass, Colorado.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0103-4
Different Look-Ahead Algorithm for Pipelined Implementation of Recursive Digital Filters
Krishna Raj, Vivekanand Yadav

Abstract—Look-ahead techniques can pipeline IIR digital filters to attain high sampling rates. The existing look-ahead schemes, such as the CLA and SLA schemes, are special cases of the proposed DLA scheme (a new LA scheme) for pipelined implementation of recursive digital filters. DLA can also be used to provide equivalent and stable pipelined implementations with reduced pipeline delay and hardware compared with the existing look-ahead schemes; a comparison between the DLA and SLA schemes is presented.

Index Terms-- Clustered look-ahead (CLA), scattered look-ahead (SLA), look-ahead (LA), and distributed look-ahead (DLA).
I INTRODUCTION

Look-ahead techniques have been highly effective in attaining high sampling rates and computation speed for low-cost VLSI implementation of recursive digital filters [1, 4, 6, 7, 9]. There are several LA approaches. One is referred to as the CLA algorithm or time-domain approach [4, 6, 7], which clusters the past output data to achieve pipelined IIR filters; CLA cannot guarantee stability. The SLA algorithm, or z-domain approach [1, 8], uses equally separated past output data and yields stable pipelined IIR filters with a linear increase in hardware. The distributed look-ahead (DLA) algorithm combines [2] the two schemes above to reach a stable design with reduced pipeline delay and hardware complexity.

An M-stage LA pipelined recursive filter can be obtained by multiplying the numerator and the denominator of the transfer function by an augmented polynomial, D(z). By choosing the proper order and coefficients of D(z), we obtain either the M-stage CLA pipelined filter or the M-stage SLA pipelined filter.
II EXISTING LOOK-AHEAD ALGORITHMS

The transfer function of an Nth-order recursive filter is described by

H(z) = B(z)/A(z) = [ Σ_{i=0}^{L} b_i z^{-i} ] / [ 1 - Σ_{i=1}^{N} a_i z^{-i} ]   (1)

The LA algorithm finds the augmented polynomial D(z), where

D(z) = 1 + Σ_{i≥1} d_i z^{-i}   (2)

Then the pipelined filter is attained by multiplying both the denominator and numerator of H(z) by D(z) [10]:

H_p(z) = [ B(z)·D(z) ] / [ A(z)·D(z) ]   (3)

For different LA algorithms, the pipelined IIR filter transfer functions H_p(z) take different forms. Three existing LA algorithms are summarized here.

(Krishna Raj is with the Deptt. of Electronics Engg., HBTI, Kanpur-208002, India; Email: [email protected]. Vivekanand Yadav, M.Tech., is from the Deptt. of Electronics Engg., HBTI, Kanpur; Email: [email protected].)
A. Clustered Look-Ahead Algorithm

For the M-stage CLA pipelined IIR filter, the denominator of the transfer function can be expressed in the form

A(z)·D(z) = 1 - Σ_{i=0}^{N-1} q_i z^{-(M+i)}   (4)

where M is the number of pipeline stages and the q_i are the coefficients of the pipelined filter. The output data y(n) can be described by the cluster of N past data y(n-M), y(n-M-1), …, y(n-M-N+1) [2]. The augmented polynomial coefficients can be found by iterative calculation as follows:

d_0 = 1,  d_i = Σ_{j=1}^{min(i,N)} a_j d_{i-j},  i = 1, …, M-1   (5)

The M-stage pipelined version of the Nth-order recursive filter is then obtained as

H(z) = [ B(z)·D(z) ] / [ 1 - Σ_{i=0}^{N-1} q_i z^{-(M+i)} ]   (6)

The total multiplication complexity is (2N+M) and the latch complexity is linear in M; the extra delay in producing the output is M [11].
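The CLA construction can be verified numerically: building D(z) by the recursion in (5) and multiplying it into A(z) leaves no feedback taps between z^-1 and z^-(M-1), so the first nonzero feedback tap is at z^-M. The filter coefficients below are arbitrary example values, not from the paper:

```python
# Sketch of clustered look-ahead (CLA) augmentation.
def cla_augment(a, M):
    """a = [a1, ..., aN] from a denominator 1 - sum a_i z^-i.
    Returns (d, p): d are the D(z) coefficients (length M) from the
    recursion d_0 = 1, d_i = sum_j a_j * d_{i-j}; p are the
    coefficients of the product A(z)*D(z)."""
    N = len(a)
    d = [1.0]
    for i in range(1, M):
        d.append(sum(a[j - 1] * d[i - j] for j in range(1, min(i, N) + 1)))
    A = [1.0] + [-ai for ai in a]        # A(z) as a polynomial in z^-1
    p = [0.0] * (len(A) + M - 1)         # convolution A(z) * D(z)
    for i, Ai in enumerate(A):
        for k, dk in enumerate(d):
            p[i + k] += Ai * dk
    return d, p

# Example: 2nd-order filter, 4 pipeline stages.
d, p = cla_augment(a=[1.0, -0.5], M=4)
# p[1] .. p[M-1] are zero: the first feedback tap appears at z^-M.
```
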
B. Scattered Look-Ahead Algorithm

For the M-stage SLA pipelined IIR filter, the denominator of the transfer function is obtained as

A(z)·D(z) = 1 - Σ_{i=1}^{N} a'_i z^{-iM}   (7)
The denominator of the resulting transfer function contains only the N scattered terms z^{-M}, z^{-2M}, …, z^{-NM} [3]. The coefficients can be obtained by solving N(M-1) simultaneous equations, requiring the coefficient of z^{-i} in A(z)·D(z) to vanish for every i that is not a multiple of M, i.e., i = 1, …, M-1, M+1, …, 2M-1, …, NM-1.

An equivalent M-stage pipelined version of the same-order recursive filter can then be obtained as [1, 8]

H(z) = [ B(z)·D(z) ] / [ 1 - Σ_{i=1}^{N} a'_i z^{-iM} ]   (8)

The total multiplication complexity is (NM+N+1) and the latch complexity is quadratic in M. The extra delay in producing the output is (NM-N) [11]. If M is a power of 2, then by using the decomposition technique, the total multiplication and latch complexity can be further reduced [1]. The architecture is shown in Fig. 1(b).
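For intuition, the SLA transform of a first-order section can be worked out in a few lines: multiplying 1/(1 - a·z^-1) by D(z) = 1 + a·z^-1 + … + a^(M-1)·z^-(M-1) leaves the denominator 1 - a^M·z^-M, and since |a|^M < |a| < 1 the pipelined filter remains stable. The pole value a = 0.9 below is an arbitrary example:

```python
# Sketch of scattered look-ahead (SLA) for a first-order section
# H(z) = 1 / (1 - a z^-1).
def sla_first_order(a, M):
    """Return (D_coeffs, pipelined_feedback_tap): the augmenting
    polynomial D(z) coefficients and the single remaining feedback
    coefficient a**M of the transformed denominator 1 - a^M z^-M."""
    D = [a ** i for i in range(M)]
    return D, a ** M

# Pole magnitude shrinks from 0.9 to 0.9**4, so stability is preserved.
D, tap = sla_first_order(a=0.9, M=4)
```
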
C. Distributed Look-Ahead Pipelining

Consider pipelining of the filter transfer function H(z) in (1). Since the pipelined transfer function must equal the original H(z), it can also be obtained by multiplying the original filter by an augmentation polynomial D(z) in both the numerator and the denominator, i.e.,

H_p(z) = [ B(z)·D(z) ] / [ A(z)·D(z) ]

where D(z) = 1 + d_1 z^{-1} + … + d_{M-1} z^{-(M-1)}. Initialize d_1 = a_1, then iterate for i = 2 to (M-1).

According to the distributed look-ahead (DLA) transformation, the M-stage pipelined filter transfer function would have the following general form.

(9)

The coefficients of the non-recursive portion of the pipelined filter are unequally distributed, and this portion can be implemented with a reduced number of multiplications; the recursive portion requires (L+1) multiplications, and the latch complexity is linear in M. The CLA and SLA schemes are special cases of the DLA scheme: an M-stage CLA pipelined version of an Nth-order recursive filter is obtained by an appropriate substitution in (1) [2, 4, 6, 7], and an M-stage SLA pipelined version of the same-order recursive filter can be produced by a corresponding substitution in (1) [1, 2, 8]. DLA can be used for high-speed modular implementation of stable 2-D denominator-separable IIR filters.

Fig: (1) LA pipelined IIR filters: (a) CLA realization, (b) SLA realization, (c) DLA realization
III COMPARATIVE ANALYSIS

Table-1

Pipelining Method | Multiplication Complexity | Delay in First Output | Extra Delay in Output
CLA | L+M+N-1 | M | M
SLA | NM+L | NM | NM-N
DLA | M + … | M |
IV CONCLUSIONS

The denominator order using DLA is less than the order NM obtained with SLA, and the DLA-transformed filter is stable, so the proposed scheme offers considerable hardware savings over SLA. The multiplication and latch complexities are lower than with SLA. The pipeline delay and hardware
Table-2

Method | M=3 SLA | M=3 DLA | M=4 SLA | M=4 DLA | M=6 SLA | M=6 DLA | M=8 SLA | M=8 DLA
No. of MUL/adder | 6 | 5 | 6 | 5 | 8 | 6 | 8 | 7
No. of Latch | 10 | 8 | 14 | 10 | 22 | 14 | 30 | 18
Delay in 1st o/p | 6 | 5 | 8 | 6 | 12 | 8 | 16 | 10
complexity are reduced compared with SLA. From the pole-zero plots below, we see that the filter becomes more stable as the number of stages increases. From the plotted comparisons we conclude that the number of multipliers/adders is smaller in DLA than in SLA, because SLA attains a greater value at each stage than DLA. The number of latches is also smaller in DLA than in SLA, because the values attained in SLA are very large compared with DLA; similarly, the delay in producing the first output is smaller in DLA than in SLA.
Examples

H (z) =

Fig: 5 (a) DLA (b) SLA (using Table 1 and Table 2)
V REFERENCES

[1] K. K. Parhi and D. G. Messerschmitt, "Pipeline interleaving and parallelism in recursive digital filters-Part I: Pipelining using scattered look-ahead and decomposition," IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 1099-1117, July 1989.
[2] A. K. Shaw and M. Imtiaz, "A general Look-Ahead algorithm for pipelining IIR filters," in Proc. IEEE ISCAS, 1996, pp. 237-240.
[3] Y. C. Lim, "A new approach for deriving scattered coefficients of pipelined IIR filters," IEEE Trans. Signal Processing, vol. 43, pp. 2405-2406, 1995.
[4] H. H. Loomis and B. Sinha, "High-speed Recursive Digital Filter Realization," Circuits, Systems and Signal Processing, vol. 3, pp. 267-294, Sept. 1984.
[5] A. P. Chand, "Low Power CMOS Digital Design," IEEE J. of Solid-State Circuits, vol. 27, pp. 473-484, Apr. 1992.
[6] P. M. Kogge, The Architecture of Pipelined Computers, New York: Hemisphere Publishing Corporation, 1981.
[7] Y. C. Lim and B. Liu, "Pipelined Recursive Filter with Minimum Order Augmentation," IEEE Transactions on Signal Processing, vol. 40, no. 7, pp. 1643-1651, July 1992.
[8] M. A. Soderstrand, K. Chopper and B. Sinha, "Comparison of three new techniques for pipelining IIR digital filters," 23rd ASILOMAR Conference on Signals, Systems & Computers, Pacific Grove, CA, pp. 439-443, Nov. 1984.
[9] H. B. Voelcker and E. E. Hartquist, "Digital Filtering via Block Recursion," IEEE Trans. Audio Electroacoust., vol. AU-18, pp. 169-176, June 1970.
[10] Yen-Liang Chen, Chun-Yu Chen, Kai-Yuan Jheng and An-Yen (Andy) Wu, "A Universal Look-Ahead Algorithm for Pipelining IIR Filters," IEEE Trans., 2008.
[11] A. K. Shaw and M. Imtiaz, "New Look-Ahead Algorithm for Pipelined Implementation of Recursive Digital Filters," in Proc. IEEE ISCAS, 1996, pp. 3229-323.
Fig: (2) Pole-zero plots for CLA: (a) M=3, (b) M=4, (c) M=5, (d) M=6 [only (d) stable]
Fig: (3) Pole-zero plots for SLA: (a) M=3, (b) M=4 [both stable]
Fig: (4) Pole-zero plots for DLA: (a) M=3, (b) M=4 [both stable]
Study of MC-EZBC and H.264 Video Codec

Agha Asim Husain(1) and Agha Imran Husain(2)
(1) Deptt of Electronics & Comm. Engg, ITS Engg College, 201301, India
(2) Deptt of Computer Science & Engg, MRCE, 121004, India
Email: [email protected], [email protected]

Abstract: This paper proposes a new aspect of comparing two video codecs on a rate-distortion basis. Scalable coding provides a straightforward solution for video coding that can serve a broad range of applications without the need for transcoding. Even though the latest international video-coding standards do not provide fully scalable methods, H.264 provides the best rate-distortion performance. We therefore compare the rate-distortion performance of H.264 against the Motion Compensated Embedded Zero Block Context (MC-EZBC) coder, which is fully scalable.

Keywords—MC-EZBC, ME/MC sub-pixel accuracy, temporal-level subband coding, YSNR.
I. INTRODUCTION

THE MODERN VIDEO compression coding technologies have improved significantly over the last few years and have enabled broadcasting of digital video signals over various networks [1]. Motion-compensated wavelet-based video coding has also emerged as an important research topic because of its ability to provide better quality. MC-EZBC [2] [3] is one of the codecs that encodes the motion information in a non-scalable manner, which results in reduced coding-efficiency performance at low bit rates. H.264 [4] is a non-scalable coding technique that provides good-quality video at substantially lower bit rates than previous standards such as MPEG-2, H.263, or MPEG-4 Part 2, without increasing the complexity of design and cost.

In this paper we perform an analysis of the joint region of applicability between the MC-EZBC and H.264 video codecs. In MC-EZBC, by using three and four levels of temporal decomposition of the input video sequence, thereby obtaining GOP structures of 8 and 16 frames, and by studying the effect of sub-pixel-accurate motion estimation and compensation, a good comparison with H.264 is achieved in terms of coding efficiency [5].

The outline of the paper is as follows. After introducing the examined compression schemes in Section II, an overview of the applied methodology is provided in Section III. The obtained results are described in Section IV, while the conclusions are drawn in Section V.
II. VIDEO CODEC OVERVIEW

The two video codecs that were used in the tests are summed up in this section. Due to space constraints, the reader is referred to the references for further information on these codecs. The first one is a scalable wavelet-based video codec developed by J. Woods et al. (motion-compensated embedded zero block coding, MC-EZBC) [6] [7]. The second video codec is the Ad Hoc Model 2.0 (AHM 2.0) implementation of the H.264 standard [4] [8], which extends the JM 6.1 implementation [9] with a rate-control algorithm [10].
III. MATERIALS AND METHODS
A. Encoding Process
This section describes how the two codecs were configured and used in order to obtain the bit streams necessary for performing the various measurements.

TABLE I: Sequences Used in Our Experiment

Name | No. of frames | Abbreviation
Akiyo | 300 | AK
Foreman | 300 | FO
Hall | 300 | HA

As input, three progressive video sequences were used in raw Y Cb Cr 4:2:0 format. These were downloaded from the Hannover FTP server. An overview of the sequences is given in Table I. The resolution used is the Common Intermediate Format (CIF, 352 x 288), thus resulting in 3 input video sequences. These sequences were encoded using constant-bit-rate coding (CBR). Ten different target bit rates were used, covering both very low and very high bit rates: 100, 200, 300, …, 1000 kbps. At each bit rate, encoding was performed at 30 frames per second. The detailed settings for the different encoding parameters can be found in Table II and Table III.

The code of MC-EZBC was downloaded from the MPEG CVS server. Each input video sequence was encoded once and then pulled several times in order to get decodable bit streams for all target bit rates. The H.264 bitstreams conform to the Baseline and Main Profiles. The GOP structure is IBBBP and the GOP length is 16.
TABLE II
Parameter Settings for the MC-EZBC Compressor

Parameter        Value (CIF)                    Comment
-inname          akiyo.yuv                      Name of the input file containing a 4:2:0 sequence
-statname        akiyo_tpyrlev3_cif_mv0.stat    Name of the output file containing statistical information generated during encoding
-start           0                              Index number of the first frame (0 means first frame in file)
-last            299                            Index number of the last frame
-size            352 288 176 144                Size of each input frame: pixel width and height of the luminance component, then pixel width and height of the chrominance component
-frame rate      30                             Number of input frames per second
-tPyrLev         3                              Levels of temporal subband decomposition
-searchrange     16                             Maximum search range (in pixels) in the first temporal decomposition level; the search range is doubled with each decomposition level
-maxsearchrange  64                             Upper limit for the search range
TABLE III
Parameter Settings for the H.264 AHM 2.0 Encoder

Parameter                  Value (CIF)
Input File                 "../Akiyo300_cif.yuv"
Frames To Be Encoded       300
Source Width               352
Source Height              288
Trace File                 "trace_enc.txt"
Recon File                 "trace_rec.yuv"
Output File                "test.264"
Search Range               16
Number Reference Frames    1
Restrict Search Range      2
RD Optimization            1
Context Init Method        1
Rate Control Enable        1
Rate Control Type          0
Bit rate                   100 Kbps
B. Quality Measurement
The PSNR-Y is calculated as defined in [11]. In order to obtain a PSNR value for an entire sequence, the average of the PSNR-Y values of the individual frames is calculated. This is not the only way to obtain a value for an entire sequence: another method could be, for instance, to take the minimum of the individual PSNR-Y values (because a video sequence may be evaluated based on its worst part). PSNR is based on a distance between two images [derived from the mean square error (MSE)] and does not take into account any property of the human visual system (HVS).
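The PSNR computation and the two pooling strategies described above can be sketched in Python (an illustrative model only; the function and variable names are assumptions for the sketch, and real luma planes would be 2-D arrays rather than flat lists):

```python
import math

def psnr_y(ref, dist, peak=255.0):
    """PSNR between two equal-length sample lists: 10*log10(peak^2 / MSE)."""
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def sequence_psnr_avg(ref_frames, dist_frames):
    """Pooling used in this paper: average of the per-frame PSNR-Y values."""
    vals = [psnr_y(r, d) for r, d in zip(ref_frames, dist_frames)]
    return sum(vals) / len(vals)

def sequence_psnr_min(ref_frames, dist_frames):
    """Alternative pooling: the worst (minimum) per-frame PSNR-Y."""
    return min(psnr_y(r, d) for r, d in zip(ref_frames, dist_frames))
```

The average pooling rewards overall fidelity, while the minimum pooling penalizes a single badly coded frame; neither models the HVS, as noted above.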
IV. EXPERIMENTAL RESULTS
In the experiment, the performance of the codecs is evaluated on a rate-distortion basis. Due to the size of the experiments and space constraints, not all results can be presented; a subset of the results is given in Table IV and Table V.
The coding efficiency of MC-EZBC is compared with that of H.264 for different sequences at different bit rates. MC-EZBC is a fully scalable coding architecture which utilizes MCTF and wavelet filtering; the software available for download at the website of CIPR, RPI [7] is used for testing the video material. H.264, on the other hand, has a non-scalable coding structure. All tests were performed on a Linux-based personal computer (AMD Turion 64 X2 processor, 1.9 GHz, 1 GB RAM) running Ubuntu 9.04 with no other software running in the background.
The measurement results of both codecs provide an assessment of the coding efficiency of current wavelet-based codecs compared to state-of-the-art single-layered codecs. A first general remark is that, for certain bit rates, there are no measurement points for MC-EZBC: MC-EZBC is not able to encode that particular video sequence at such low target bit rates. At low bit rates, a codec may also decide to skip some frames.
TABLE IV
Average Coding Gain of MC-EZBC and H.264 between 500-1000 Kbps

Video Codec   Foreman (YSNR, dB)
MC-EZBC       37.90
H.264         38.06
For video sequences with a higher amount of movement (FO), the results indicate that, on average, H.264 JM 6.0 performs significantly better than MC-EZBC in terms of PSNR-Y at almost all bit rates. It is also observed that H.264 performs well throughout the bit-rate range for high-complexity content.
TABLE V
Subset of Quality Measurements for the CIF Foreman Sequence

Bit Rate (Kbps)   MC-EZBC   H.264
100               27.86     30.33
400               34.88     35.73
1000              39.12     39.30
V. CONCLUSION
In this paper, an overview was given of the rate-distortion performance of two state-of-the-art video codec technologies in terms of YSNR. From the above results it is clear that the tools incorporated in the H.264 standard outperform MC-EZBC, although at around 1000 Kbps the performance of MC-EZBC is comparable with that of H.264 for high-complexity sequences.
REFERENCES
[1] M. Ghanbari, Standard Codecs: Image Compression to Advanced Video Coding, IEE Telecommunications Series, 2003.
[2] S. S. Tsai, Motion Information Scalability for Interframe Wavelet Video Coding, MS Thesis, National Chiao Tung University, Hsinchu, Taiwan, R.O.C., Jun. 2003.
[4] J. W. Woods and P. S. Chen, "Improved MC-EZBC with Quarter-pixel Motion Vectors," ISO/IEC JTC1/SC29/WG11, doc. no. m8366, Fairfax, May 2002.
[5] T. Wiegand, G. Sullivan, and A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Trans. on CSVT, vol. 13, pp. 560-576, July 2003.
[6] I. E. G. Richardson, H.264 and MPEG-4 Video Compression, Hoboken, NJ: Wiley, 2003.
[7] S.-T. Hsiang and J. W. Woods, "Embedded Video Coding Using Invertible Motion Compensated 3-D Subband/Wavelet Filter Bank," Signal Process.: Image Communication, vol. 16, pp. 705-724, May 2001.
[8] MC-EZBC Software: www.cipr.rpi.edu/~golwea/mc_ezbc.htm
[8] T. Wiegand, H. Schwarz, A. Joch, F. Kossentini, and G. J. Sullivan, "Rate-Constrained Coder Control and Comparison of Video Coding Standards," IEEE Trans. Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 688-703, July 2003.
[9] H.264/AVC Reference Software [Online]. Available: http://iphome.hhi.de/suehring/tml/download/
[10] Proposed Draft Description of Rate Control on JVT Standard, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16/Q.6, JVT document JVT-F086, Dec. 2002.
[11] P. Chen, Fully Scalable Subband/Wavelet Coding, PhD Thesis, Rensselaer Polytechnic Institute, Troy, New York, May 2003.
Abstract: Digital filtering techniques are commonly implemented on general-purpose digital signal processing chips for audio applications, and on special-purpose ASICs for higher bit rates. This paper describes the implementation of IIR filter algorithms on field programmable gate arrays (FPGAs). An IIR filter design shows a significant reduction in the computational complexity required to achieve a given frequency response compared to an FIR filter for the same response. An FPGA-based implementation offers higher sampling rates than those available from traditional DSP chips, and lower cost along with greater design flexibility in comparison to an ASIC. It follows a pipelined architecture, which gives the advantages of parallel processing. We have observed and compared the filtering characteristics of an IIR filter in direct form-2 realization using MATLAB by altering the bit length and also the filter order, and we have implemented the digital filter on a Xilinx Spartan 3E kit using VHDL. Because FPGA architectures are in-system programmable, the configuration of the device may be changed to implement different functionality as required. Our work illustrates that the FPGA approach is both flexible and superior to traditional approaches.
Keywords: ASIC, FPGA, IIR, FIR, VHDL, Pipeline
Architecture, Xilinx Spartan 3E
I. INTRODUCTION
A filter is used to modify an input signal in order to facilitate further processing. A digital filter works on a digital input (a sequence of numbers, resulting from sampling and quantizing an analog signal) and produces a digital output. According to Dr. U. Meyer-Baese [1], "the most common digital filter is the Linear Time-Invariant (LTI) filter". Designing an LTI filter involves arriving at the filter coefficients which, in turn, represent the impulse response of the filter. These coefficients, in linear convolution with the input sequence, produce the desired output [2]:

y[n] = Σk h[k]·x[n−k]

The most common implementations of digital filtering algorithms are digital signal processing chips for audio applications and application-specific integrated circuits (ASICs) for higher rates.
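As a concrete illustration, the linear convolution of a coefficient (impulse response) sequence with an input sequence can be modeled in a few lines (a plain software sketch, not the hardware datapath discussed later):

```python
def linear_convolution(h, x):
    """Linear convolution y[n] = sum_k h[k] * x[n-k] of the filter
    impulse response h with the input sequence x."""
    y = [0.0] * (len(h) + len(x) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):       # only valid input samples contribute
                y[n] += h[k] * x[n - k]
    return y
```

For example, convolving h = [1, 1] with x = [1, 2, 3] yields the running pairwise sums of the input.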
This paper describes the implementation of an IIR digital filtering algorithm on field programmable gate arrays (FPGAs). Recent advancements in FPGA technology have enabled these devices to be applied to a variety of applications traditionally reserved for ASICs. FPGAs are well suited for datapath designs, such as those encountered in digital filtering applications. The advantages of the FPGA approach to digital filter implementation include higher sampling rates than those available from traditional DSP chips [2], lower cost than an ASIC for moderate-volume applications, and more software flexibility than the alternative approaches. In particular, multiple multiply-accumulate (MAC) units may be implemented on a single FPGA, which provides performance comparable to general-purpose architectures that have a single MAC unit. In comparison to an FIR filter [3], an IIR filter uses fewer MAC units to achieve the same frequency response, resulting in a smaller memory requirement and lower computational complexity. The configuration of the FPGA device may be changed to implement alternate filtering operations, such as lattice filters and gradient-based adaptive filters, or entirely different functions, simply by altering the software. In our project we have implemented a digital IIR filter using an FPGA. IIR systems have an impulse response that is non-zero over an infinite length of time, in contrast to finite impulse response (FIR) filters [4], which have fixed-duration impulse responses. To obtain a similar frequency response, an IIR filter requires a lower order than an FIR filter. The IIR filter is one of the digital filters most used in audio signal processing; one good application of IIR filter technology is the generation and recovery of dual-tone multi-frequency (DTMF) signals used by Touch-Tone telephones.
The rest of the paper is organized as follows: Section II
describes related works and Section III deals with proposed
architecture. Our scheme is evaluated by results obtained from
extensive simulation in Section IV. Finally, we conclude in
Section V.
FPGA BASED IMPLEMENTATION OF IIR FILTERS

Anup Saha, Saikat Karak, Surajit Kangsabanik, and Joyita RoyChowdhury
Department of Electronics and Communication Engineering
4th Year, MCKV Institute of Engineering, [email protected]
II. RELATED WORKS
Most early research on digital filter implementation was influenced by customized VLSI chips. The architecture of these filters is largely determined by the target application. Typical DSP chips such as Texas Instruments' TMS320, Freescale's MSC81xx, Motorola's 56000, and Analog Devices' ADSP-2100 family efficiently perform filtering operations in the audio range. For higher frequencies, CMOS and Bi-CMOS technologies are used. There are, however, drawbacks to customized chips. The biggest shortcoming is low flexibility, as they are application specific; the lack of adaptability in these chips is also severe. Typical custom approaches do not allow the function of a device to be modified after deployment, for example for fault correction. The FPGA approach is therefore attractive for providing design freedom: many of the popular FPGAs are in-system programmable, which allows modification of the operation by simple reprogramming. For filtering purposes, FIR filters [3] have been commonly used. In this particular work, IIR filters are implemented instead, as they require fewer calculations and less memory. IIR filters also outperform FIRs [5] for narrow transition bands, and they can provide a better approximation of traditionally analog systems in digital applications than competing filter types. IIR filters are mainly used in audio applications such as speakers and sound-processing functions. In this work, the Xilinx Spartan 3E series is used for implementing the digital filtering algorithms. The Xilinx Spartan 3E consists of reconfigurable combinational logic blocks with multiple inputs and outputs, a router or switching matrix for interconnection, and buffers.
III. PROPOSED ARCHITECTURE

IIR filter implementations on an FPGA board illustrate that the FPGA approach is both flexible and provides performance superior to traditional approaches. Because of the programmability of this technology, the examples in this paper can be extended to provide a variety of other high-performance IIR filter realizations. Using powerful computer-based software tools to perform the repetitive calculations in the filter design process enables a designer to achieve the best design within the shortest time. While implementing a filter in hardware, the biggest challenge is to achieve the specified system performance at minimum hardware cost. In this paper we achieve this goal by designing a digital filter, which also gives a better noise margin and less component ageing than an analog filter. One of the hurdles is to understand, estimate and, where possible, overcome the effects of using a finite word length to represent the infinite-precision coefficients. Selecting a non-optimized word length [6] can result in a filter transfer function different from what is expected. The effects of finite word length representation can be minimized by analytical or qualitative methods, or simply by choosing to implement higher-order filters in cascaded or parallel form.
Digital filters [7] are often described and implemented in terms of the difference equation that defines how the output signal is related to the input signal. We have modeled the equation as
y[n] = (1/a0) · ( b0·x[n] + b1·x[n−1] + … + bP·x[n−P]
                 − a1·y[n−1] − a2·y[n−2] − … − aQ·y[n−Q] )      (1)
Where:
• P is the feed-forward filter order
• b0 … bP are the feed-forward filter coefficients
• Q is the feedback filter order
• a1 … aQ are the feedback filter coefficients
• x[n] is the input signal
• y[n] is the output signal.
Now from the above equation we modeled the transfer function of the IIR filter as

H(z) = Y(z)/X(z) = (b0 + b1·z⁻¹ + b2·z⁻²) / (1 + a1·z⁻¹ + a2·z⁻²)      (2)

For a hardware representation of the digital filter we have modeled the transfer function using adder, multiplier and delay units.
Figure 1: Direct Form-2 Structure of Digital Filter
A basic IIR filter consists of 3 main blocks:
(i) Adder  (ii) Multiplier  (iii) Delay unit
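A software model of the direct form-2 second-order section of Figure 1 may look like the following (a behavioral sketch with floating-point arithmetic; the actual hardware uses fixed-point adders, multipliers and delay registers):

```python
def direct_form_2(x, b, a):
    """Direct form-2 IIR section (cf. Figure 1):
       w(n) = x(n) - a1*w(n-1) - a2*w(n-2)
       y(n) = b0*w(n) + b1*w(n-1) + b2*w(n-2)
    b = (b0, b1, b2) feed-forward, a = (a1, a2) feedback coefficients."""
    w1 = w2 = 0.0                                   # delay-line state
    y = []
    for xn in x:
        wn = xn - a[0] * w1 - a[1] * w2             # feedback path
        y.append(b[0] * wn + b[1] * w1 + b[2] * w2)  # feed-forward path
        w2, w1 = w1, wn                             # shift the delay line
    return y
```

Note that direct form-2 needs only one shared delay line w(n), which is why the hardware requires only two delay units per section.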
A. Implementation of Adder
We have implemented this system using a serial adder. A serial adder is a binary adder that adds two numbers bit-pair-
wise. Each bit pair is added in a single clock pulse, and the carry from each pair is propagated to the next pair.
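The bit-pair-wise addition with carry propagation can be modeled as follows (a software sketch of the serial adder's behavior; the 8-bit word width is an illustrative assumption):

```python
def serial_add(x, y, width=8):
    """Bit-serial addition: one bit pair per clock cycle, LSB first,
    with the carry propagated to the next cycle."""
    carry, out = 0, 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s = a ^ b ^ carry                     # full-adder sum bit
        carry = (a & b) | (carry & (a ^ b))   # full-adder carry out
        out |= s << i
    return out & ((1 << width) - 1)           # truncate to the word width
```

A final carry out of the most significant pair is discarded, so the result wraps modulo 2^width, just as a fixed-width hardware register would.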
B. Implementation of Multiplier
The multiplier has been configured to perform multiplication of signed numbers in two's complement notation. We have used signed multiplication, where an n-bit by n-bit multiplication takes place and results in a 2n-bit value.
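The two's-complement n × n → 2n-bit multiplication can be modeled with plain integers as follows (a behavioral sketch, not the hardware netlist; the helper name is an illustration):

```python
def signed_multiply(a, b, n=8):
    """Two's-complement n-bit x n-bit signed multiply, returning the
    2n-bit two's-complement encoding of the product."""
    def to_signed(v, bits):
        # Reinterpret a raw bit pattern as a signed two's-complement value.
        v &= (1 << bits) - 1
        return v - (1 << bits) if v & (1 << (bits - 1)) else v
    p = to_signed(a, n) * to_signed(b, n)
    return p & ((1 << 2 * n) - 1)             # re-encode into 2n bits
```

For example, with n = 8 the pattern 0xFF is −1, so multiplying it by 2 yields the 16-bit pattern 0xFFFE (−2).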
C. Implementation of Delay Unit
We have used a shift register for the purpose of delay. A shift register is a group of flip-flops set up in a linear fashion with their inputs and outputs connected in such a way that the data is shifted from one device to another when the circuit is active: (i) a shift register provides the data movement function; (ii) a shift register "shifts" its output once every clock cycle.
IV. SIMULATION RESULTS
To check the response of the proposed filter we have used the Filter Design and Analysis Tool (FDA Tool), a graphical user interface (GUI) available in the Signal Processing Toolbox of MATLAB for designing and analyzing filters. It takes the filter specifications as inputs. Table 1 shows the specifications of an IIR low-pass elliptic filter of order 6.
Table 1: IIR filter specifications

Filter performance parameter   Value
Pass band ripple               0.5 dB
Pass band frequency            11000 Hz
Stop band frequency            12000 Hz
Stop band attenuation          35 dB
Sampling frequency             48000 Hz
A. Software Simulation
The sampling frequency is chosen as 4 times the stop band frequency, and the filter has a steep transition band with a width of 1000 Hz. These specifications are fed as inputs to the FDA tool in MATLAB R2009a. The tool performs the filter design calculations using double-precision floating-point representation and displays the response of an IIR elliptic low-pass filter of order 6. Figure 2 shows the filter design window of the FDA tool after completion of the design process.
Figure 2 Filter design using MATLAB FDA tool
We have designed the IIR filter in direct form-2, simulated it using VHDL, and downloaded it to the Xilinx Spartan 3E kit. The response obtained by simulating the VHDL code is shown below.
Figure 3 The simulation output of IIR filter
in Xilinx ISE 7.01
The coding scheme that we are using is VHDL (Very High Speed Integrated Circuit Hardware Description Language). Since the filter is designed in the digital domain, accommodating it in an existing analog system requires adding an A/D converter before the system and a D/A converter after it.
B. Hardware Implementation
We have implemented digital IIR filter using FPGA based
Xilinx Spartan 3E kit which consists of an interior array of 64-
bit CLBs, surrounded by a ring of 64 input-output interface
blocks. The FPGA architecture is shown below.
Figure 4: Internal Block Diagram of FPGA Architecture
V. CONCLUSION
We have implemented the IIR filter on an FPGA, and our results show an improvement over existing filter design architectures. In future work we will implement our scheme for real-time applications.
REFERENCES
[1] U. Meyer-Baese, Digital Signal Processing with Field Programmable Gate
Arrays Second Edition , Springer, p.109.
[2] U. Meyer-Baese, Digital Signal Processing with Field Programmable Gate
Arrays Second Edition , Springer, p.110.
[3] D. M. Kodek, 1980, "Design of Optimal Finite Word Length FIR Digital Filters Using Integer Programming Techniques," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-28, no. 3, June 1980.
[4] Wonyong Sung and Ki-Il Kum, 1995, “Simulation-Based Word-Length
Optimization Method for Fixed-point Digital Signal Processing Systems”,
IEEE Transactions on Signal Processing, Vol. 43, No.12, December 1995.
[5] X. Hu, L. S. DeBrunner, and V. DeBrunner, 2002, “An efficient design for
FIR filters with Variable precision”, Proc. 2002 IEEE Int. Symp. on Circuits
and Systems, pp.365-368, vol. 4,May 2002.
[6] Y. C. Lim, R. Yang, D. Li, and J. Song, 1999. “Signed-power-of-two term
allocation scheme for the design of digital filters,” IEEE Transactions on
Circuits and Systems II, vol. 46, pp.577- 584, May 1999.
[7] S. C. Chan, W. Liu, and K. L. Ho, 2001, "Multiplierless perfect reconstruction modulated filter banks with sum-of-powers-of-two coefficients," IEEE Signal Processing Letters, vol. 8, no. 6, pp. 163-166, June 2001.
Abstract—GSM, an acronym for Global System for Mobile communications, uses various encryption algorithms such as A5/1, A5/2 and A5/3. These are used to encrypt information transmitted from the mobile station to the base station during communication. A5/1 is considered a strong algorithm, but it exhibits weaknesses, as shown by the attacks mounted on it: A5/1 has been attacked through its linear complexity, its clocking taps, etc. This paper proposes a concept to improve the A5/1 encryption algorithm by modifying the clocking mechanism of the registers present in A5/1; the modified version of A5/1 is fast and easy to implement, which makes it well suited for future use.

Index Terms—GSM, encryption, A5/1 stream cipher, clock controlling unit, correlation
I. INTRODUCTION
In wireless communication technology, communication is effective and convenient for sharing information [7], and GSM is a very good example of such wireless communication. This information should be secure, meaning that nobody, such as an eavesdropper, can interfere; cryptography plays a vital role in protecting it. When information is sent from the mobile station to the base station, the air interface presents a serious security threat between the communicating parties [10]. The question then arises of how to protect the communication. For this, GSM uses the A5/x series of encryption algorithms to encrypt voice and data over the GSM link. The various implementations are: A5/0, with no encryption; A5/1, the strong version; A5/2, a weaker version targeting markets outside Europe; and A5/3, a strong version based on block ciphering, created as part of the 3rd Generation Partnership Project (3GPP) [5].

In this paper we explore A5/1, which is the strong version but exhibits weaknesses due to the attacks mounted on it. A5/1 is based on stream ciphering [1], which is very fast, performing a bit-by-bit XOR to produce the result. In the simplest form of such encryption, a plaintext bit is XORed with a secret key bit; the result is called the ciphertext, and the reverse process is called decryption.

A5/1 is built from linear feedback shift registers (LFSRs). The initial value of an LFSR is called the seed; because the operation of the register is deterministic, the stream of values it produces is completely determined by its current (or previous) state. However, an LFSR with a well-chosen feedback function can produce a sequence of bits which appears random and which has a long cycle [2]. In cryptography, correlation attacks are a class of known-plaintext attacks for breaking stream ciphers whose key stream is generated by combining the output of several LFSRs using a Boolean function. Correlation attacks [6] exploit a statistical weakness that arises from a poor choice of the Boolean function; since it is possible to select a function which avoids correlation attacks, this type of cipher is not inherently insecure, but it is essential to consider susceptibility to correlation attacks when designing stream ciphers of any type. In this paper a new clocking mechanism is proposed to avoid correlation attacks, in place of the m-rule (majority rule) used by the A5/1 stream cipher. The paper is organized as follows: Section II gives a description of the A5/1 stream cipher; Section III analyzes the correlation attack; Section IV proposes the modified structure of the A5/1 key stream generator; finally, the conclusion is given.
II. DESCRIPTION OF A5/1
A5/1 is a stream cipher [11] provide key stream so called key stream generator. Made up of three linear feedback shift register of length 19, 22, 23 used to generate sequence of binary bits. GSM conversations are in form of frames as length of 228 bit i.e. 114 for each direction for encrypt/ decrypt data[4]. A5/1 initialize 64 bit key together with 22 frame number publicly known. It used linear feedback shift registers as R1, R2 and R3 to correspondence tap as (13, 16, 17, 18) contained by R1, (20, 21) by R2 and (7, 20, 21, 22) respectively. Each clocked using rule called as majority rule. Clocking tap considered as A, B, C to correspondence registers R1, R2 and R3 as R1 (8), R2 (10) and R310). Before register is clocked feedback is calculated by using linear operator i.e. XOR. The one bit shift to right (discarding the rightmost) bit produced by feedback location store leftmost locations of linear feedback shift registers. This cycle goes up to 64 times. This done on basis of clocking rule that register clocked irregularly according to majority rule. Majority rule uses on three clocking bits of LFSR’s A, B, C. Among clocking bit if one or more is 0, then m=0 whose value match with m that register will clock. Similarly, if one or more
Enhanced Clocking Rule for A5/1 Encryption Algorithm
Rosepreet Kaur Bhogal, ECE Dept., [email protected] , Nikesh Bajaj, Asst. Prof., ECE Dept., [email protected], Lovely Professional University -India
clocking bits are 1, then m = 1 and the registers whose clocking bits match m are clocked. At each clocking step each selected LFSR generates one bit, and the bits are combined by a linear function. In A5/1, the probability of an individual LFSR being clocked is 3/4. The clocking unit generates the bit m defined using Boolean algebra as m = A·B ⊕ B·C ⊕ A·C, as shown in Figure 1; for the possible cases refer to Table 1.
Figure 1: Structure of A5/1 stream cipher
Table 1: Possible cases of A5/1 registers to be clocked

Clocking bits (A,B,C)   Bit m (m-rule)   Register(s) clocked
(0,0,0)                 0                R1, R2, R3
(0,0,1)                 0                R1, R2
(0,1,0)                 0                R1, R3
(0,1,1)                 1                R2, R3
(1,0,0)                 0                R2, R3
(1,0,1)                 1                R1, R3
(1,1,0)                 1                R1, R2
(1,1,1)                 1                R1, R2, R3
Table 1 lists the possible clocking cases according to the m-rule. Each register is clocked with probability 3/4 [8], i.e. each output bit yields some information about the state of the LFSRs [3]. Because of this, the whole scheme falls to a correlation attack and the state bits can be determined.
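The m-rule and the contents of Table 1 can be checked with a small script (an illustrative model of the clocking decision only, not the cipher implementation):

```python
from itertools import product

def majority(a, b, c):
    """m = A*B xor B*C xor A*C, the m-rule of A5/1."""
    return (a & b) ^ (b & c) ^ (a & c)

def clocked_registers(a, b, c):
    """Registers whose clocking bit equals m are clocked (cf. Table 1)."""
    m = majority(a, b, c)
    return [name for name, bit in (("R1", a), ("R2", b), ("R3", c)) if bit == m]

# Each register is clocked in 6 of the 8 equally likely cases, i.e. with
# probability 3/4, matching the statement above.
cases = list(product((0, 1), repeat=3))
r1_clocked = sum("R1" in clocked_registers(*bits) for bits in cases)
```

Enumerating all eight bit combinations reproduces Table 1 row by row.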
III. CORRELATION ATTACK ANALYSIS

Analyzing a stream cipher is easier than analyzing a block cipher. There are two main factors to consider while designing any stream cipher: correlation and linear complexity. Linear complexity is important because the Berlekamp-Massey algorithm can recover the state of the LFSRs when some of the LFSR bits are related to the generated output sequence. The linear complexity should be large for more security, although a large value alone does not indicate a secure cipher. Correlation immunity and higher linear complexity are obtained by combining the output sequences in a more non-linear manner. Insecurity arises when the output of the combining function is correlated with the output of an individual LFSR; this is what makes a correlation attack possible. By observing the output sequence, an attacker obtains information about the internal state, and from that can determine other internal states, breaking the entire stream cipher generator. Coming to the main point: the A5/1 stream cipher also uses three LFSRs, and its clocking taps look strong but are cryptographically weak, as shown by the attacks. The output of the generator equals the output of LFSR2 75% of the time; if the feedback is known, we can guess the initial bits of LFSR2, generate its output sequence, and count the number of times the LFSR2 output agrees with the output of the generator. If the two sequences agree about 50% of the time, the guess is wrong; if they agree 75% of the time, the guess is right. Similarly, the output sequence agrees 75% of the time with LFSR3, so using correlation the cipher can easily be cracked by a known-plaintext attack. The basic idea behind A5/1 is good, and it passes statistical tests such as the NIST test suite [12], but it still has the weakness that the LFSR lengths are short enough to make cryptanalysis feasible; A5/1 should be made as long as possible for more security.
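The agreement count at the heart of the attack can be sketched as follows (a hypothetical helper illustrating the 75%-versus-50% decision rule described above, not an implementation of the full attack):

```python
def correlation_agreement(keystream, lfsr_guess):
    """Fraction of positions where a guessed LFSR output agrees with the
    observed generator output: ~0.75 indicates a correct guess for A5/1,
    ~0.5 indicates a wrong one."""
    hits = sum(k == g for k, g in zip(keystream, lfsr_guess))
    return hits / len(keystream)
```

An attacker would evaluate this fraction for each candidate initial state and keep the candidates whose agreement is significantly above one half.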
IV. MODIFIED A5/1 STREAM CIPHER
A new clock control mechanism is proposed to overcome the problem of the 3/4 clocking probability explained above. With the proposed concept the probability becomes 1/2, using a modified clock controlling unit. Consider three bits A, B and C of the respective registers R1, R2 and R3, called the clocking bits. The structure of the proposed A5/1 stream cipher is shown in Figure 2.
Figure 2: Modified stream cipher
A. Clock controlling unit

In the new clock control mechanism each register has one clocking tap: bit 8 for R1, bit 10 for R2 and bit 10 for R3. The clocking bit is generated using the Boolean expression given below; an AND gate is used, which also increases the linear complexity. In the following, ¬ denotes NOT and (+) denotes XOR:

y = ¬A·(B (+) C) + A·¬(B (+) C)      (1)

The expression is built from different gates applied to the clocking bits A, B and C of the respective registers. In each cycle, every register whose clocking tap agrees with y from equation (1) is clocked and shifted. For example, if A, B, C are the clocking taps of R1, R2, R3 respectively, then Table 2 shows all the possible combinations for clocking.
Table 2: Possible cases of the modified stream cipher clocking

Clocking bits (A,B,C)   Clocking bit y   Register(s) clocked
(0,0,0)                 0                R1, R2, R3
(0,0,1)                 1                R3
(0,1,0)                 1                R2
(0,1,1)                 0                R1
(1,0,0)                 1                R1
(1,0,1)                 0                R2
(1,1,0)                 0                R3
(1,1,1)                 1                R1, R2, R3
As the table shows, at each cycle at least one register is clocked, so the generator can never stall in a position where no register is clocked; the mechanism above was designed with this requirement in mind. Case 1: A=0, B=0, C=0; equation (1) gives y=0, and every register whose clocking bit agrees with this value is clocked: R1, R2 and R3 all agree, so all registers are clocked and shifted to the right (discarding the rightmost bit). Case 2: A=0, B=0, C=1; y=1, so R3 is clocked and shifted. Case 3: A=0, B=1, C=0; y=1, so R2 is clocked and shifted. Case 4: A=0, B=1, C=1; y=0, so R1 is clocked and shifted. Case 5: A=1, B=0, C=0; y=1, so R1 is clocked and shifted. Case 6: A=1, B=0, C=1; y=0, so R2 is clocked and shifted. Case 7: A=1, B=1, C=0; y=0, so R3 is clocked and shifted. Finally, case 8: A=1, B=1, C=1; y=1, so all registers are clocked and shifted. Comparing the possible clocking outcomes in Tables 1 and 2: in Table 1, at each cycle at least two registers are shifted and each register is clocked with 75% probability; in Table 2 this is reduced to 50%, with at least one register shifted per cycle. The register output is therefore unrelated to the state of the LFSRs for up to 6 clock cycles.
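Equation (1) and Table 2 can be verified with a small script (an illustrative model of the proposed clocking decision, not the full cipher; note that the expression simplifies to A ⊕ B ⊕ C):

```python
from itertools import product

def y_bit(a, b, c):
    """y = not(A)*(B xor C) + A*not(B xor C) from equation (1);
    it simplifies to A xor B xor C."""
    return ((1 - a) & (b ^ c)) | (a & (1 ^ b ^ c))

def clocked_registers(a, b, c):
    """Registers whose clocking tap agrees with y are clocked (cf. Table 2)."""
    y = y_bit(a, b, c)
    return [name for name, bit in (("R1", a), ("R2", b), ("R3", c)) if bit == y]

# Verify the simplification and the 1/2 clocking probability claimed above:
# each register is clocked in 4 of the 8 equally likely cases.
cases = list(product((0, 1), repeat=3))
r1_clocked = sum("R1" in clocked_registers(*bits) for bits in cases)
```

Enumerating all eight bit combinations reproduces Table 2 and confirms that each register is clocked with probability 1/2 instead of 3/4.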
V. CONCLUSION
The A5/1 key stream generator is easy to implement and is an efficient encryption algorithm used in the GSM communication application. However, it exhibits weaknesses: the LFSR lengths are short, and it is vulnerable to the basic correlation attack discussed in Section III, which has shown the m-rule-based encryption algorithm to be insecure. After analyzing these issues, a modified A5/1 structure has been given in Section IV which is easy to implement and fast, and which decreases the possibility of a correlation attack. The enhancement proposed in the new clocking mechanism increases the level of security: the probability of a linear feedback shift register being clocked is reduced from 3/4 to 1/2, which prevents the state from being identified from the output sequence, i.e. the output bits are unrelated to the internal state for up to 6 cycles, as shown by the modified structure of the A5/1 stream cipher.
ACKNOWLEDGMENT
This work is part of the completion of a master's dissertation. Many people contributed in assorted ways to the work and to the making of this paper, and they deserve special mention; it is a pleasure to convey my gratitude to them all in this humble acknowledgment. Thanks to my guide, Mr. Nikesh Bajaj, for his supervision, advice and guidance at every stage of this paper, as well as for giving me extraordinary experiences throughout the work. Above all, he provided unflinching encouragement and support in various ways. His intuition makes him a constant oasis of ideas and passion in electronics, which exceptionally inspired and enriched my growth as a student. Last but not least, I would like to thank my fellow students for the stimulating discussions and the successful realization of this work.
REFERENCES
[1] E. Barkan, E. Biham, and N. Keller, "Instant Ciphertext-Only Cryptanalysis of GSM Encrypted Communication," Advances in Cryptology – CRYPTO 2003.
[2] P. Ekdahl, On LFSR Based Stream Ciphers: Analysis and Design.
[3] M. Sharaf, H. A. K. Mansour, H. H. Zayed, and M. L. Shore, "A Complex Linear Feedback Shift Register Design for the A5 Keystream Generator."
[4] D. Margrave, "GSM Security and Encryption," George Mason University.
[5] O. Dunkelman, N. Keller, and A. Shamir, "A Practical-Time Attack on the A5/3 Cryptosystem Used in Third Generation GSM Telephony."
[6] G. Rose, "A Précis of the New Attacks on GSM Encryption," QUALCOMM Australia.
[7] P. Bouška and M. Drahanský, "Communication Security in GSM Networks," Faculty of Information Technology, Brno University of Technology.
[8] M. Ahmad and Izharuddin, "Enhanced A5/1 Cipher with Improved Linear Complexity."
[9] M. Peuhkuri, "Mobile Networks Security," 2008-04-22.
[10] N. Komninos, B. Honary, and M. Darnell, "Security Enhancements for A5/1 Without Losing Hardware Efficiency in Future Mobile Systems."
[11] C.-C. Lo and Y.-J. Chen, "Stream Ciphers for GSM Networks," Institute of Information Management, National Chiao-Tung University.
[12] http://csrc.nist.gov/groups/ST/toolkit/rng/documentation_software.html
Rosepreet Kaur Bhogal is pursuing the master's degree in signal processing from Lovely Professional University, Punjab, India. She is currently doing her dissertation under the supervision of Mr. Nikesh Bajaj, assistant professor in the Department of Electronics. Her research interests include different aspects of cryptography, such as cryptographic assumptions and the encryption algorithms used in GSM.
Nikesh Bajaj received his bachelor degree in Electronics & Telecommunication from the Institute of Electronics and Telecommunication Engineers, and his master degree in Communication & Information Systems from Aligarh Muslim University, India. He is now working at LPU as Assistant Professor in the Department of ECE. His research interests include cryptography, cryptanalysis, and signal & image processing.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0107-1
An Application of Kalman Filter in State Estimation of a Dynamic System
Vishal Awasthi1, Krishna Raj2
Abstract-- Most wireless communication systems for indoor positioning and tracking suffer from different error sources, including process errors and measurement errors. State estimation deals with recovering some desired state variables of a dynamic system from available noisy measurements. A correct and accurate state estimate of a linear or non-linear system can be obtained by selecting the proper estimation technique. Kalman filter algorithms are an often-used technique that provides linear, unbiased, and minimum-variance estimates of an unknown state vector. In this paper we try to bridge the gap between the Kalman filter and its variant, the Extended Kalman Filter (EKF), by comparing their algorithms and performance in the state estimation of a car moving under a constant force.
Index Terms-- Stochastic filtering, Bayesian filtering,
Adaptive filter, Unscented transform, Digital filters.
1. INTRODUCTION
In the area of telecommunications, signals are mixtures of different frequencies. The least squares method, proposed by Carl Friedrich Gauss in 1795, was the first method for forming an optimal estimate from noisy data, and it provides an important connection between the experimental and theoretical sciences.
Before Kalman, in the 1940s, Norbert Wiener proposed his famous filter, the Wiener filter, which was restricted to stationary scalar signals and noises. The solution obtained by this filter is not recursive and needs the storing of the entire past observed data. In the early 1960s, Kalman filtering theory, a novel recursive filtering algorithm, was developed by Kalman and Bucy; it did not require the stationarity assumption [1], [2]. The Kalman filter is a generalization of the Wiener filter. The significance of this filter is its ability to accommodate vector signals and noises, which may be non-stationary. The
solution is recursive in that each update estimate of the state is
computed from the previous estimate and the new input data,
so, contrary to the Wiener filter, only the previous estimate requires storage; the Kalman filter thus eliminates the need for storing the entire past observed data. Most of the existing
approaches need a priori kinematics model of the target for the
prediction. Although this predictor can successfully filter out
the noisy measurement, its parameters might be changed due
to different dynamic targets.
1Member IETE, Lecturer, Deptt. of Electronics & Comm. Engg., UIET., CSJM.University, Kanpur-24, U.P., (email: [email protected]) 2Fellow IETE, Associate Professor, Deptt. of Electronics Engineering,
H.B.T.I., Kanpur-24, U.P., (email: [email protected])
Information is usually obtained in the form of measurements
and the measurements are related to the position of the object
that can be formulated by Bayesian filtering theory. Kalman filter theory is only applicable to linear systems, while in practice almost all dynamic systems (the relation between the state and the measurements) are nonlinear. The
most celebrated and widely used nonlinear filtering algorithm
is the extended Kalman filter (EKF), which is essentially a
suboptimal nonlinear filter. The key idea of the EKF is using
the linearized dynamic model to calculate the covariance and
gain matrices of the filter. The Kalman filter (KF) and the
EKF are both widely used in many engineering areas, such as
aerospace, chemical and mechanical engineering. However, it
is well known that both the KF and EKF are not robust against
modelling uncertainties and disturbances.
Kalman filtering is an optimal algorithm, widely applied in the
forecasting of system dynamic and estimating an unknown
state. Measurement devices are constructed in such a manner
that the output data signals must be proportional to certain
variables of interest. Knowledge of the probability density
function of the state conditioned on all available measurement
data provides the most complete possible description of the
state but except in the linear Gaussian case, it is extremely
difficult to determine this density function [6]. To enhance
these concepts, several algorithms were proposed using
parametric and non-parametric techniques such as Extended
Kalman Filter (EKF), Unscented Kalman filter (UKF)
respectively.
Unscented transformation (UT) is an elegant way to compute
the mean and covariance accurately up to the second order
(third for Gaussian prior) of the Taylor series expansion. The low-order statistics of a random variable undergoing a non-linear transformation y = g(x) are captured by generating and propagating sigma points through the nonlinear transformation:

Yi = g(Xi),  i = 0, ..., 2zx   (1)

where zx is the dimension of x. Scaling parameters are used to control the distance between the sigma points and the mean.
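Equation (1) can be illustrated with a minimal sigma-point sketch. The symmetric weight scheme and the `kappa` scaling parameter below follow the basic (unscaled) unscented transform, an assumption, since the text does not fix a particular variant:

```python
import numpy as np

def unscented_transform(g, mean, cov, kappa=1.0):
    """Propagate a mean/covariance pair through a nonlinearity g via sigma points.

    Minimal sketch of the basic unscented transform; `kappa` controls the
    sigma-point spread (an assumed variant, scaled forms also exist).
    """
    zx = mean.size
    # Matrix square root of (zx + kappa) * P via Cholesky factorisation.
    S = np.linalg.cholesky((zx + kappa) * cov)
    # 2*zx + 1 sigma points: the mean plus symmetric spreads along each column of S.
    sigma = np.vstack([mean, mean + S.T, mean - S.T])
    w = np.full(2 * zx + 1, 1.0 / (2.0 * (zx + kappa)))
    w[0] = kappa / (zx + kappa)
    # Propagate each sigma point through the nonlinearity: Y_i = g(X_i).
    Y = np.array([g(x) for x in sigma])
    y_mean = w @ Y
    y_cov = (w[:, None] * (Y - y_mean)).T @ (Y - y_mean)
    return y_mean, y_cov
```

For an identity transformation the original mean and covariance are recovered exactly, which is a quick sanity check of the weights.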
In the presence of random disturbances (white noise), or when a few system parameters change, the use of an adaptive and optimal controller turns out to be necessary [3], [4]. In this paper we choose to use the Kalman filter as a controller.
This technique is based on the theory of Kalman filtering [5]; it transforms the Kalman filter into a Kalman controller.
Simulation results show that the state estimation performance
provided by the robust Kalman filter is higher than that
provided by the EKF.
Recently, results on some new types of linear uncertain
discrete-time systems have also been given. Yang, Wang and
Hung presented a design approach of a robust Kalman filter
for linear discrete time-varying systems with multiplicative
noises [7]. Since the covariance matrices of the noises cannot
be known precisely, Dong and You derived a finite-horizon
robust Kalman filter for linear time-varying systems with
norm-bounded uncertainties in the state matrix, the output
matrix and the covariance matrices of noises [8]. Based on the
techniques Zhu, Soh and Xie gave a robust Kalman filter
design approach for linear discrete-time systems with
measurement delay and norm-bounded uncertainty in the state
matrix [9]. Hounkpevi and Yaz proposed a robust Kalman
filter for linear discrete-time systems with sensor failures and
norm-bounded uncertainty in the state matrix [10].
Currently many systems successfully use Kalman filter algorithms in diverse areas such as the processing of
signals in mobile robot, GPS position based on neural network
[11], aerospace tracking [12], [13], underwater sonar and the
statistical control of quality.
In this paper the state of a car moving under a constant force has been estimated through the Kalman filter and the Extended Kalman filter. The dynamic model of the system is highly nonlinear; hence we first linearized the nonlinear system equations using the EKF algorithm, and then performed a time-domain analysis of the dynamic model using a sampling time of 10 millisec.
2. TECHNOLOGICAL DEVELOPMENT OF KALMAN
FILTER
A stochastic process is a family of random variables indexed
by the parameter and defined on a common probability
space. Bayesian models are a general probabilistic approach
for estimating an unknown probability density function
recursively over time using incoming measurements and a
mathematical process model [14].
The Kalman filter is an optimal observer in the sense that it
produces unbiased and minimum variance estimates of the
states of the system i.e. the expected value of the error
between the filter’s estimate and the true state of the system is
zero and the expected value of the squared error between the
real and estimated states is minimum.
2.1 WIENER FILTER
Wiener was a pioneer in the study of stochastic and noise processes [15] who proposed a class of optimum discrete-time filters during the 1940s, published in 1949. Its purpose is
to reduce the amount of noise present in a signal by
comparison with an estimation of the desired noiseless signal.
The Wiener process (often called Brownian motion) is one of the best-known continuous-time stochastic processes, with stationary and independent increments. The Wiener
filter uses the mean squared error as a cost function and
steepest-descent algorithm for recursively updating the
weights.
The main problem with this algorithm is that it requires the input correlation matrix and the cross-correlation vector between the input and the desired response, and unfortunately both are unknown in practice.
2.2 DISCRET KALMAN FILTER
A state estimate is represented by a probability density function (pdf), and a description of the full pdf is required for the optimal (Bayesian) solution; but the form of the pdf is not restricted, so it cannot in general be represented using a finite number of parameters [14], [16]. To solve this problem R. E. Kalman
designed an optimal state estimator for linear estimation of the
dynamic systems using state space concept [17], that has the
ability to adapt itself to non-stationary environments. It
supports estimations of past, present, and even future states,
and it can do so even when the precise nature of the modeled
system is unknown. A set of mathematical equations provides
an efficient computational (recursive) means to estimate the
state of a process, in such a way that minimizes the mean of
the squared error.
The filter is very powerful in several aspects:
The Kalman filter is an efficient recursive filter
algorithm that estimates the state of a dynamic system
from a series of noisy measurements and hence the filter
can be viewed as a sequential minimum mean square
error (MSE) estimator with additive noise.
It works like an adaptive low-pass infinite impulse
response (IIR) digital filter, with cut-off frequency
depending on the ratio between the process- and
measurement (or observation) noise, as well as the
estimate covariance.
The Kalman filter is a set of mathematical equations that
provides an efficient computational (recursive) means to
estimate the state of a process, in such a way that
minimizes the mean of the squared error.
2.2.1 DYNAMIC SYSTEM MODEL OF KALMAN
CONTROLLER
The Kalman filter is used for estimating or predicting the next state of a system based on a moving average of measurements driven by white noise, which is completely unpredictable. It
needs a model of the relationship between inputs and outputs
to provide feedback signals but it can follow changes in noise
statistics quite well. The Kalman filter is an optimum
estimator that estimates the state of a linear system developing
dynamically through time.
Kalman filter theory is based on a state-space approach in
which a state equation models the dynamics of the signal
generation process and an observation equation models the
noisy and distorted observation signal. For a signal xk and noisy observation yk, the equations describing the state process model and the observation model are defined as:

xk = M xk-1 + E uk + Jk   ... (2)
yk = h xk + nk            ... (3)

where xk is the P-dimensional signal vector, or the state parameter, at time k; M is a P × P dimensional state transition matrix that relates the states of the process at times k-1 and k; E is the control-input model which is applied to the control vector uk; and Jk (process noise) is the P-dimensional uncorrelated input excitation vector of the state equation. Jk is assumed to be a normal (Gaussian) process, p(Jk) ~ N(0, Q), Q being the P × P covariance matrix of Jk (the process noise covariance). yk is the M-dimensional noisy observation vector, and h is an M × P dimensional matrix which relates the observation to the state vector. nk, the M-dimensional noise vector, also known as measurement noise, is assumed to have a normal distribution p(nk) ~ N(0, R), where R is the M × M covariance matrix of nk (the measurement noise covariance).
2.2.2 KALMAN FILTER ALGORITHM
Initially the process state is estimated at some time, and feedback is then obtained in the form of (noisy) measurements. The
equations for the Kalman filter fall into two groups:
Time Update (Predictor) Equations: which are
responsible for projecting forward (in time) the current
state and error covariance estimates to obtain the a priori
estimates for the next time step.
Measurement Update (Corrector) Equations: which are
responsible for the feedback i.e. for incorporating a new
measurement into the a priori estimate to obtain an
improved a posteriori estimate.
Sawaragi et al. [18] examined some design methods of
Kalman filters with uncertainties and observed that under poor observability and numerical instability Kalman filters do not work properly.
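The predictor/corrector cycle of the two equation groups can be illustrated with a minimal scalar filter. The random-walk state model (M = 1, h = 1) and the noise levels here are assumptions for the sketch, not the paper's car model:

```python
import numpy as np

def kalman_1d(zs, Q=1e-4, R=0.01, x0=0.0, P0=1.0):
    """Scalar Kalman filter with state transition M = 1 and observation h = 1."""
    x, P = x0, P0
    estimates = []
    for z in zs:
        # Time update (predictor): project state and covariance ahead.
        x_prior = x                      # x_k^- = M x_{k-1}, with M = 1
        P_prior = P + Q                  # P_k^- = M P M^T + Q
        # Measurement update (corrector): fold in the new observation z.
        K = P_prior / (P_prior + R)      # Kalman gain
        x = x_prior + K * (z - x_prior)  # a posteriori state estimate
        P = (1.0 - K) * P_prior          # a posteriori error covariance
        estimates.append(x)
    return estimates

# Noisy measurements of a constant true value 0.5 (illustrative data):
rng = np.random.default_rng(0)
zs = 0.5 + rng.normal(0.0, 0.1, 200)
est = kalman_1d(zs)
```

Only the previous estimate and covariance are stored at each step, which is the recursive property the text contrasts with the Wiener filter.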
2.2.3 FLOW CHART OF TIME & MEASUREMENT
UPDATE ALGORITHM
Time Update:
Measurement Update:
Initialize the state estimate
Take the Initial measurement sample
at k instant i.e.
Update state estimate with new
measurement
Calculate the state estimate to
next sample time i.e.
Update the sample of new
sample time i.e.
Initialize error covariance
Compute The Kalman Gain
Update the Error Covariance
Update the sample of new sample
time i.e.
Figure 1. Recursive Update Procedure for the Discrete Kalman Filter
2.3 EXTENDED KALMAN FILTER (EKF)
The extended Kalman filter (EKF) is the nonlinear version of
the Kalman filter that linearizes the non-linear measurement
and state update functions at the prior mean of the current time
step and the posterior mean of the previous time step,
respectively.
2.3.1 EXTENDED KALMAN FILTER ALGORITHM
Time Update:

(1) Project the state ahead:

x̂k- = f(x̂k-1, uk-1, 0)   ... (4)

(2) Project the error covariance ahead:

Pk- = Ak Pk-1 AkT + Qk-1   ... (5)

where Ak and Hk (below) are the Jacobians of the state function f and the measurement function h, evaluated at the current estimate. The time update equations project the state and covariance estimates from the previous time step k-1 to the current time step k.

Measurement Update:

(1) Compute the Kalman gain:

Kk = Pk- HkT (Hk Pk- HkT + Rk)-1   ... (6)

(2) Update the estimate with measurement zk:

x̂k = x̂k- + Kk (zk - h(x̂k-, 0))   ... (7)

(3) Update the error covariance:

Pk = (I - Kk Hk) Pk-   ... (8)

The measurement update equations correct the state and covariance estimates with the measurement zk. An important
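The linearize-then-correct cycle described above can be sketched generically. The nonlinear models f, h and their Jacobian functions F, H are caller-supplied assumptions here, standing in for whatever dynamic system is being tracked:

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    """One predict/update cycle of the extended Kalman filter.

    f, h: nonlinear state and measurement functions.
    F, H: functions returning their Jacobians at the current estimate
          (the linearisation that defines the EKF).
    """
    # Time update: propagate through the nonlinear model; linearise for P.
    x_prior = f(x)
    Fk = F(x)
    P_prior = Fk @ P @ Fk.T + Q
    # Measurement update: gain computed from the linearised measurement model.
    Hk = H(x_prior)
    S = Hk @ P_prior @ Hk.T + R          # innovation covariance
    K = P_prior @ Hk.T @ np.linalg.inv(S)
    x_post = x_prior + K @ (z - h(x_prior))
    P_post = (np.eye(len(x)) - K @ Hk) @ P_prior
    return x_post, P_post
```

When f and h are actually linear, the Jacobians are constant and the step reduces exactly to the ordinary Kalman filter, which is a useful check of an implementation.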
2.3.2 LIMITATIONS OF EKF ALGORITHM
Although the EKF is computationally efficient recursive
update form of the Kalman filter still it suffers a number of
serious limitations [14]:
(1) Linearized transformations are only reliable if the error
propagation is well approximated by a linear function. If this
condition does not hold, then the linearized approximation
would be extremely poor and hence it causes its estimates to
diverge altogether.
(2) The EKF does not guarantee unbiased estimates, and it may calculate error covariance matrices that do not necessarily represent the true error covariance.
3. PROBLEM DESCRIPTION
We consider a dynamic system, i.e., a car driven by a constant force, moving with a constant acceleration and following a linear/non-linear motion. To estimate the state, i.e., position, the continuous-time state-space model is discretized with a 10 millisec sampling time.
3.1 MATHEMATICAL MODELING OF SYSTEM
In a dynamic system, the values of the output signals depend both on the past behavior of the system and on the instantaneous values of its input signals. The output value at a given time t can be computed using the measured values of the output at the previous two time instants and the input value at the previous time instant.
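The dependence just described, output from the two previous outputs and the previous input, is a second-order difference equation. The coefficients below are hypothetical illustrations; only the 10 ms sampling interval comes from the paper:

```python
# y[k] = a1*y[k-1] + a2*y[k-2] + b1*u[k-1] — a second-order recursion in which
# the output depends on the two previous outputs and the previous input.
dt = 0.01                        # 10 millisecond sampling time, as in the paper
a1, a2, b1 = 1.6, -0.64, 0.04    # hypothetical, stable model coefficients

def step(y_prev, y_prev2, u_prev):
    """Advance the difference equation by one sample."""
    return a1 * y_prev + a2 * y_prev2 + b1 * u_prev

# Constant input force: iterate the recursion from rest.
y = [0.0, 0.0]
for k in range(2, 100):
    y.append(step(y[-1], y[-2], u_prev=1.0))
```

With these coefficients both poles sit at 0.8 inside the unit circle, so the response to the constant input settles at b1 / (1 - a1 - a2) = 1.0.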
Figure 2. Free body diagram of car-model
Horizontal and vertical motion is governed by the following equations:
(9)
(10)
(11)
(12)
(13)
(14)
For steady-state analysis, the angle is considered to have a very small value, between -10 and 10 radians.
(15)
(16)
Figure 2 illustrates the modeled characteristics of the car. The front and rear suspensions are modeled as spring/damper systems. This model includes damper nonlinearities such as velocity-dependent damping. The vehicle body has pitch and bounce degrees of freedom. They are represented in the model by four states: vertical displacement, vertical velocity, pitch angular displacement, and pitch angular velocity. The front suspension influences the bounce (i.e., vertical) degree of freedom.
The dynamic model of the system is highly nonlinear; hence we first linearized the nonlinear system using the EKF algorithm.
4. SIMULATION RESULTS
The mean and covariance of the posterior distribution were
recorded at each time step and compared to the true estimate.
For comparison, the data was also processed with the EKF. The figures show the mean error of the different filters; it can be seen that the EKF works quite well and is optimal for linear measurements regardless of the density function of the error. The mean errors did not vary much between the different filters; however, the EKF performed quite well even with large blunder probabilities. A comparative chart is given below to demonstrate the error in estimating the state through the KF and the EKF.
TABLE I
Comparative Chart of State (Position) Values with Kalman Filter

Time (sec) | True state (mt) | Measured state (mt) | Estimated state (mt) | Error, true - measured (mt) | Error, true - estimated (mt)
1          | 0.0125          | 0.0223              | 0.0011               | -0.0098                     | 0.0114
30         | 0.0221          | 0.0213              | 0.024                | 0.0008                      | -0.0019
60         | 0.0746          | 0.0712              | 0.0743               | 0.0034                      | 0.0003
90         | 0.1567          | 0.1751              | 0.1712               | -0.0184                     | -0.0145
100        | 0.1988          | 0.1824              | 0.2113               | 0.0164                      | -0.0125
Figure 3. Comparison of True, Measured & Estimated position with KF
Figure 4. Comparison of Error between true, measured & estimated position value with KF
TABLE II
Comparative Chart of State (Position) Values with Extended Kalman Filter

Time (sec) | True state (mt) | Measured state (mt) | Estimated state (mt) | Error, true - measured (mt) | Error, true - estimated (mt)
1          | 0.0012          | 0.0181              | 0.0010               | -0.0169                     | 0.0002
30         | 0.0186          | 0.0251              | 0.022                | -0.0065                     | -0.0034
60         | 0.746           | 0.744               | 0.731                | 0.0020                      | 0.015
90         | 0.147           | 0.189               | 0.148                | -0.042                      | -0.0010
100        | 0.1791          | 0.181               | 0.183                | -0.0019                     | -0.0039
Figure 5. Comparison of True, Measured & Estimated position with EKF
Figure 6. Comparison of Error between true, measured & estimated position
value with EKF
5. CONCLUSION
In this paper, a detailed overview of Kalman filter and
Extended Kalman Filter to improve inadequate statistical
models, nonlinearities in the measurement is presented.
Simulation results show that the performance of the Extended Kalman filter is higher than that of the Kalman filter, and we conclude that the Kalman filter-based scheme is capable of effectively estimating the position errors of a moving target, making future state and measurement predictions more accurate and therefore improving the accuracy of target positioning and tracking. Further efforts on the Kalman filter will lead to improved estimation of signal arrival time and more accurate target positioning and tracking.
This work can be used as theoretical base for further studies in
a number of different directions such as tracking system, to
achieve high computational speed for multi-dimensional state
estimation.
REFERENCES
[1] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, Transactions of the ASME, Series D, Vol. 82, No. 1, pp. 35-45, 1960.
[2] R. E. Kalman and R. S. Bucy, "New results in linear filtering and prediction problems," Journal of Basic Engineering, Transactions of the ASME, Series D, Vol. 83, No. 3, pp. 95-108, 1961.
[3] R. K. Mudi and N. R. Pal, "A robust self-tuning scheme for PI- and PD-type fuzzy controllers," IEEE Transactions on Fuzzy Systems, Vol. 7, No. 1, pp. 2-16, February 1999.
[4] B. Zdzislaw, Modern Control Theory, Springer-Verlag Berlin Heidelberg, 2005.
[5] R. L. Eubank, A Kalman Filter Primer, Taylor & Francis Group, 2006.
[6] D. L. Alspach and H. W. Sorenson, "Nonlinear Bayesian estimation using Gaussian sum approximations," IEEE Trans. Automatic Control, Vol. 17, No. 4, pp. 439-448, Aug. 1972.
[7] F. Yang, Z. Wang, and Y. S. Hung, "Robust Kalman filtering for discrete time-varying uncertain systems with multiplicative noises," IEEE Transactions on Automatic Control, Vol. 47, No. 7, pp. 1179-1183, 2002.
[8] Z. Dong and Z. You, "Finite-horizon robust Kalman filtering for discrete time-varying systems with uncertain-covariance white noises," IEEE Signal Processing Letters, Vol. 13, No. 8, pp. 493-496, 2006.
[9] X. Zhu, Y. C. Soh, and L. Xie, "Design and analysis of discrete-time robust Kalman filters," Automatica, Vol. 38, pp. 1069-1077, 2002.
[10] F. O. Hounkpevi and E. E. Yaz, "Robust minimum variance linear state estimators for multiple sensors with different failure rates," Automatica, Vol. 43, pp. 1274-1280, 2007.
[11] W. Wu and W. Min, "The mobile robot GPS position based on neural network adaptive Kalman filter," International Conference on Computational Intelligence and Natural Computing, IEEE, pp. 26-29, 2009.
[12] Y. Bar-Shalom and X. R. Li, Estimation and Tracking: Principles, Techniques, and Software, Artech House, 1993.
[13] Y. Bar-Shalom, X.-R. Li, and T. Kirubarajan, Estimation With Applications to Tracking and Navigation, New York: Wiley, 2001.
[14] Y. C. Ho and R. C. K. Lee, "A Bayesian approach to problems in stochastic estimation and control," IEEE Trans. Automatic Control, Vol. AC-9, pp. 333-339, Oct. 1964.
[15] P. Maybeck, Stochastic Models, Estimation and Control, Vol. I, New York: Academic Press, 1979.
[16] S. Haykin, Adaptive Filter Theory, Prentice-Hall, Inc., 1996.
[17] H. J. Kushner, "Approximations to optimal nonlinear filters," IEEE Trans. Automatic Control, Vol. AC-12, No. 5, pp. 546-556, Oct. 1967.
[18] Y. Sawaragi and T. Katayama, "Performance loss and design method of Kalman filters for discrete-time linear systems with uncertainties," International Journal of Control, Vol. 12, No. 1, pp. 163-172, 1970.
Wideband Direction of Arrival Estimation by using Minimum Variance and Robust Maximum Likelihood Steered Beamformers: A Review

SANDEEP SANTOSH1, O. P. SAHU2, MONIKA AGGARWAL3
1Asst. Prof., Department of Electronics and Communication Engineering,
2Associate Prof., Department of Electronics and Communication Engineering,
National Institute of Technology, Kurukshetra, INDIA
3Associate Prof., Centre For Applied Research in Electronics (CARE),
Indian Institute of Technology, New Delhi, INDIA
[email protected]   http://www.nitkkr.ac.in
Abstract
Beamforming of sensor arrays is a fundamental operation in sonar, radar and telecommunications. The Minimum Variance Steered Beamformer and the Robust Maximum Likelihood Steered Beamformer are two important methods for wideband direction of arrival estimation, and this paper presents a comparative study of the two. MV beamformers can place nulls in the array response in the direction of unwanted sources, even those located within a beamwidth of the source of interest, provided that the interfering signals are uncorrelated with the desired one. A steered wideband adaptive beamformer optimized by a novel concentrated maximum likelihood (ML) criterion in the frequency domain can also be considered; this ML beamforming can reduce the typical cancellation problems encountered by adaptive MV beamforming and preserve the intelligibility of a wideband and coloured source signal under interference, reverberation and propagation mismatches. The Minimum Variance Steered Beamformer (MV-STBF) and the use of the Steered Covariance Matrix are illustrated, and the robustness of the Maximum Likelihood Steered Beamformer (ML-STBF) using a Modified Newton Algorithm is explained.
Key-Words:- Wideband Direction of Arrival (DOA) Estimation, Minimum Variance, Robust Maximum Likelihood, Steered Beamforming, Covariance Matrix.
1. Introduction
Beamforming of sensor arrays is a
fundamental operation in Sonar, Radar
and telecommunications. The development of minimum variance (MV) adaptive beamforming has taken place over the last three decades. The detrimental effect of multipath on MV beamforming is the cancellation of the desired signal even if the coherent
component is very weak at the output of
the generalized sidelobe canceller. The
classical cure for this phenomenon lies
in the definition of a set of linear or
quadratic constraints on adaptive part of
the beamformer, based on proper
modeling of the array perturbations .
Wideband arrays are less sensitive to
signal cancellation because reflections
exhibit a delay of several sampling
periods with respect to direct (useful)
path.
Prefiltering of the array outputs and proper constraints on the weight vector help in counteracting cancellation. It is not clear whether the MV criterion is optimal in ensuring the best possible reconstruction of a wideband signal of interest, e.g. intelligibility in the case of speech. Therefore, there is a need for a frequency-domain wideband beamformer, starting from the concepts of focusing matrices and steered beamforming, that aligns the component of the direct-path signal along the same steering vector as in a narrowband array.
This beamformer uses a single set of
weights for the entire bandwidth but the
adaptation is made on the basis of a concentrated maximum likelihood (ML) cost function derived by using a
stochastic Gaussian assumption on the
frequency samples of the beamformer
outputs .
It is found that the ML solution does not
depend on any prefiltering applied to
array outputs provided that none of the
subband components are nulled out.
Nonconvexity of the derived ML cost
function makes it unsuitable for classical
Newton optimization .
Hence, a second-order algorithm is developed, starting from a procedure originally introduced for fast neural network training, which recasts the ML problem as an iterative least-squares minimization. The robust ML wideband beamformer also incorporates a norm constraint to reduce the risk of signal cancellation under propagation uncertainties [1], [2], [3], [4].
2. Narrowband and Wideband MV Beamforming
An array with M sensors receives the
signal of interest s(t) radiated by a point
source ,whose position is characterized
by generic coordinate vector p. The
propagating medium and the sensors are
assumed linear even if they may
introduce temporal dispersion on s(t).
The direct, or shortest, path of the wave propagation is characterized by the (M × 1) vector of impulse responses hd(t, p), starting from t = td. Multiple delayed and filtered copies of s(t), generated by multipath, reverberation and scattering, are also received by the array and can be globally modeled by the (M × 1) vector of impulse responses hr(t, p), starting from t = tr > td. Interference and noise are statistically independent of s(t) and are conveniently collected in the (M × 1) vector v(t). Therefore, the (M × 1) array output vector, or snapshot, x(t) obeys the continuous-time equation

x(t) = ∫td∞ hd(τ, p) s(t − τ) dτ + ∫tr∞ hr(τ, p) s(t − τ) dτ + v(t).   (1)
This model represents a large number of
real world environments, encountered in
telecommunications, remote sensing ,
underwater acoustics, seismics and
closed room applications . The objective
of beamforming is to recover s(t) in the
presence of multipath, noise and
interference terms given the knowledge
of direct path response only. In fact hd(t,
p) is accurately described by analytical
and numerical methods or measured
under controlled conditions but hr(t,p)
depends on a great number of unpredictable and time-varying factors. An alternative view is to consider a
reference model containing only the
terms related to direct path and to
develop robust algorithms that are able
to bound in a statistical sense the effects
of sufficiently small perturbations on the
final estimate.
2.1 Discrete Time Signal Model
Array outputs x(t) are properly
converted to the baseband , sampled and
digitized. Under general assumption
equation (1) is written in discrete time as
the vector FIR convolution as,
x(n) = Σk=Nd1 Nd2
hd(k,p) s(n-k) +
Σk=Nr1Nr2
hr(k,p)s(n-k)+v(n) (2)
The relationships among discrete-time
transfer functions of (2) and their analog
counterparts in (1) are quite involved
depending on the receiver architecture.
In many cases, the delays of the reflections with respect to the direct path exceed the Nyquist sampling period of the baseband signal (i.e. Nr1 > Nd2), so that hd(n, p) and hr(n, p) do not overlap.
2.1.1. Narrowband MV Beamforming
The narrowband arrays obey (2) with a known hd(n, p) = hd(p) u0(n − Nd), an unknown hr(n, p) = hr(p) u0(n − Nd), and Nd1 = Nd2 = Nr1 = Nr2. Hence, we have

x(n) = [hd(p) + hr(p)] s(n − Nd) + v(n).   (3)
The sources are assumed white within
the sensor bandwidth. Reflection delays
must be less than the sampling period so
that the spectrum at the sensor outputs
remains white. The sequence s(n) is
conveniently scaled so that |hd(p) | = 1.
An (M × 1) complex-valued weight vector w is applied to the baseband snapshot x(n) to recover a spatially filtered signal y(n, w), where

y(n, w) = wH x(n)   (4)

according to some optimality criterion. For example, in the classical MV-DR beamformer, ŵ solves the LS minimization problem

ŵ = argminw [ Σn=1..N |y(n, w)|2 ]   (5)

subject to hd(p)H w = 1, using N independent snapshots. The output is finally computed as

y(n, ŵ1) = y0(n) + ŵ1H y1(n)   (6)
2.1.2 Wideband MV Beamforming
The extensions of MV beamforming to
wideband arrays have been proposed
several times in the past following
either time-domain or frequency domain
approaches . The main drawback of
wideband MV beamforming is the high
number of free parameters to adapt
which produces slower convergence
, high sensitivity to mismodeling, and strong misadjustment for short observation times. The introduction of a large number of linear or quadratic constraints may mitigate these issues at the expense of a reduced capability of suppressing interference. The very complex and largely unpredictable structure of reverberant fields can make it impossible to specify an effective set of constraints.
2.1.3 Steered Adaptive Beamformer
An interesting tradeoff between
complexity and efficacy is obtained by
the wideband Steered Adaptive
Beamformer (STBF), which was
introduced on the basis of the coherent
focusing technique, the Coherent Signal
Subspace Method (CSSM), developed
by Wang and Kaveh. In the frequency-domain
formulation of the STBF, the
sequence x(n) (n = 1, 2, …, N) is
partitioned into L nonoverlapping
blocks of length J that are separately
processed by a windowed Fast Fourier
Transform (FFT). Finally, the frequency-domain
output is computed as

y(ωj, l, w1) = y0(ωj, l) + w1^H y1(ωj, l),  (7)

where y0(ωj, l) and y1(ωj, l) can be
computed from the focused snapshots. [1],[2],[3],[10]
3. Limitations of MV-STBF
It is known that most wideband signals
of interest are strongly correlated in
time. The effects of the temporal correlation of
s(n) on the MV-STBF are twofold:
a) The impulse responses of the
direct path and of the reflections are
often well separated in time in
wideband environments.
However, the signal replicas may
still cancel the desired signal s(n−N0)
if the multipath delay does not
exceed the correlation time of s(n).
b) It is no longer obvious that the
MV criterion lends itself to a good
solution in wideband scenarios.
When the beamformer is steered
off source, it mostly captures
background noise, which is often
considered temporally white; in this
case the MV criterion realizes a
particular ML estimator. When the
beamformer points towards a
correlated source, the quality of
the output is influenced by the
spectra of the source itself and of
the interference.
These two problems are strictly
related. An optimal cost function
should impose a tradeoff on
performance at different frequencies
when a single weight vector w1 is
used for the entire bandwidth. A
wideband beamformer aiming to
preserve signal intelligibility
should minimize the noise-plus-interference
power in those subbands
where the useful source spectrum has
valleys. Interference nulling
becomes less important near the
spectral peaks of the useful signal,
whose strength may be adequate to
mask the unwanted components. [1],[2],[3],[10]
4. Maximum Likelihood STBF
In order to overcome the drawbacks
of MV beamforming, a general
stochastic model can be formulated
and exploited to derive the proper
ML estimator of w, subject to the
given constraint CHw = d. Using the
reduced space formulation, this
constrained ML problem can still be
converted into unconstrained
maximization of the likelihood
function of the beamformer output
containing the useful signal plus
noise and interference residuals.
Although the crucial assumption for the
validity of this model is that the
multipath terms are uncorrelated
with the direct path, it can be shown
that the resulting ML estimator is
also effective in decorrelating the
multipath terms having a delay
higher than one sampling period,
independently of the source
spectrum.
In particular, by the central limit
theorem, y(ωj, l, w1) can be considered
an independent, zero-mean,
circular Gaussian random variable,
regardless of the original distribution
of the signal and interference, but
characterized by a different variance
ζj² in each subband. In reverberant
fields and in the presence of coloured
sources, such as speech and sonar
targets, these conditions can be
further approached by proper
prewhitening of the highly correlated
components present in both
y0(n) and y1(n). The scaled global
negative log-likelihood of the STBF
output can be written as

L(w1) = Σ_{j=j1}^{j2} [ log(ζj²) + (1/(L ζj²)) Σ_{l=1}^{L} |y(ωj, l, w1)|² ].  (8)
After neglecting irrelevant additive
constants, the optimal weights w1 are
finally found from the concentrated cost

Lc(w1) = Σ_{j=j1}^{j2} log[ (1/L) Σ_{l=1}^{L} |y(ωj, l, w1)|² ],  (9)

ŵ1 = arg min_{w1} Lc(w1).  (10)

The wideband ML-STBF using a
single w1 must optimize
equation (9) by coherently combining
information from all frequencies, which
results in a highly nonlinear
problem. [1],[10]
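The concentrated cost (9) is straightforward to evaluate once the presteered output y0(ωj, l) and the blocking-branch outputs y1(ωj, l) are available. The sketch below (the array shapes, sizes, and names are assumptions for illustration) computes Lc(w1) as the sum over bins of the log of the per-bin sample variance:

```python
import numpy as np

def ml_stbf_cost(w1, y0, y1):
    # Concentrated ML cost Lc(w1) of eq. (9): sum over bins j of the log of
    # the sample variance of y(w_j, l, w1) = y0(w_j, l) + w1^H y1(w_j, l), eq. (7).
    y = y0 + y1 @ w1.conj()                  # (J, L) beamformer outputs
    var = np.mean(np.abs(y) ** 2, axis=1)    # per-bin variance estimate
    return float(np.sum(np.log(var)))

rng = np.random.default_rng(0)
J, L, Mb = 16, 32, 4                         # bins, blocks, blocking-branch size (assumed)
y0 = rng.standard_normal((J, L)) + 1j * rng.standard_normal((J, L))
y1 = rng.standard_normal((J, L, Mb)) + 1j * rng.standard_normal((J, L, Mb))
print(ml_stbf_cost(np.zeros(Mb, dtype=complex), y0, y1))
```

At w1 = 0 the cost reduces to the log variance of the presteered branch alone; adaptation of w1 then trades variance reductions across the subbands, as discussed above.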
5. Properties of the Cost Function
The function Lc(w1) is clearly
nonconvex, due to the presence of
logarithms, and it is not even guaranteed
to be lower bounded or to have a unique
minimizer. Nonconvexity hampers the
use of classical Newton optimization
algorithms when initialized far from the
global minimum. Moreover, if a ζ̂j²(w1)
becomes zero during adaptation,
indicating perfect signal cancellation in
the jth subband, then Lc(w1) → −∞ and the
minimization cannot proceed further. If
multiple bins have ζ̂j²(w1) = 0, then
many local minima may occur and
a descent algorithm may get stuck before
reaching a global optimum. The two
theorems associated with the cost function
show the effect of the lack of information
occurring when L ≤ Mb. Despite
these limitations, two other properties
of equation (9) appear extremely
interesting from both theoretical and
practical viewpoints: scaling
invariance in the frequency domain and
the link with cepstral analysis. A
decisive advantage of the ML-STBF
over cepstral processing lies in the
intrinsic linearity of beamforming
which is highly desirable when dealing
with music, speech or digital
transmission of data.[1]
6. Robustness of the ML-STBF
The cost function given by equation (9)
grows logarithmically, i.e., very slowly,
with respect to each subband error
variance ζ̂j²(w1). This behaviour is
typical of statistically robust estimators
that are able to cope with outliers in the
data or with significant deviations from the
assumed probabilistic model. As a result,
the performance of traditional
frequency-domain MV beamforming
may be quite suboptimal in the
presence of coloured sources and
interferences. The following quadratic
constraint is deduced:

‖w1‖² ≤ (1 − δ/εmax)² ≈ γ².  (11)

Equation (11) theoretically justifies the
common practice of limiting the norm of
w1 in MV beamforming and furnishes a
guideline for properly choosing the
parameter γ². [1],[3]
7. Iterative LS Minimization of
Lc(w1)
Signal nonstationarity and moving
sources require short observation times
and fast numerical convergence to the
optimal solution; therefore, the function
Lc(w1) should be minimized by a second-order,
Newton-like algorithm in order
to be competitive with the MV approach
in real-time applications. Therefore, we
have

w1^[q] = arg min_{w1} Σ_{j=j1}^{j2} [ (1/(L ζ̂j²(w1^[q−1]))) Σ_{l=1}^{L} |y(ωj, l, w1)|² ]  (12)

for q = 1, 2, …, subject to ‖w1^[q]‖² ≤ γ²,
until convergence is achieved.
Equation (12) is a standard quadratic
ridge-regression problem. [1]
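One pass of the iteration (12) can be sketched as a weighted ridge regression. This is a minimal illustration under assumed array shapes; the paper's regularized matrices Rj,µ are approximated here by a single ridge term µI:

```python
import numpy as np

def irls_step(w_prev, y0, y1, gamma2, mu=1e-3):
    # One iteration of eq. (12): LS with per-subband weights 1/zeta_j^2(w_prev),
    # a ridge term, and the norm bound ||w1||^2 <= gamma^2.
    J, L, Mb = y1.shape
    zeta2 = np.mean(np.abs(y0 + y1 @ w_prev.conj()) ** 2, axis=1) + 1e-12
    A = np.zeros((Mb, Mb), dtype=complex)
    b = np.zeros(Mb, dtype=complex)
    for j in range(J):
        Yj = y1[j]                                # (L, Mb) block for bin j
        A += Yj.conj().T @ Yj / zeta2[j]
        b += Yj.conj().T @ y0[j] / zeta2[j]
    u = np.linalg.solve(A + mu * np.eye(Mb), -b)  # ridge-regularized LS solve
    w = np.conj(u)                                # the output uses w1^H, hence the conjugate
    n2 = np.real(w.conj() @ w)
    if n2 > gamma2:                               # enforce the quadratic constraint
        w *= np.sqrt(gamma2 / n2)
    return w

rng = np.random.default_rng(2)
J, L, Mb = 16, 32, 4
y0 = rng.standard_normal((J, L)) + 1j * rng.standard_normal((J, L))
y1 = rng.standard_normal((J, L, Mb)) + 1j * rng.standard_normal((J, L, Mb))
w1 = np.zeros(Mb, dtype=complex)
for q in range(5):
    w1 = irls_step(w1, y0, y1, gamma2=4.0)
print(np.real(w1.conj() @ w1))   # squared norm, bounded by gamma^2 = 4
```

Freezing the subband variances at the previous iterate turns each step into an ordinary quadratic problem, which is what makes the scheme competitive with MV adaptation in speed.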
8. Modified Newton Algorithm
The ML-STBF can be interpreted as a
two-layer perceptron with constrained
weights. Therefore, the algorithms
developed for fast neural network
training should be highly effective.
Descent in the neuron space is adopted in
this work. In this case, the minimization
of Lc(w1) is still converted into an
iterative LS procedure, but using a single
system matrix for all steps; only matrix
sums and products are performed at each
iteration. The modified Newton
algorithm consists of three steps: 1)
data preconditioning, 2) system setup,
and 3) the main loop [1]. The
algorithm is summarized below:
8.1 Algorithm Summary
Step 1) Collect N = LJ snapshots x(n)
for n = 1, …, N.
Step 2) Compute frequency-domain
snapshots x(ωj, l) for l = 1, 2, …, L and
j = 1, 2, …, J using a windowed FFT of length
J applied to L sequential blocks of x(n).
Step 3) For j = j1, …, j2, synthesize
focusing matrices Tj and compute
focused snapshots xf(ωj, l) = Tj x(ωj, l).
Step 4) For j = j1, …, j2, build y0(ωj, l) = w0^H xf(ωj, l)
and y1(ωj, l) = C⊥^H xf(ωj, l).
Step 5) For each j, build the regularized
matrices Rj,µ.
Step 6) Compute the system matrix F
and the vector g.
Step 7) Initialize w1^[0] with all zeros or
small complex random values.
Step 8) For q = 1, 2, …, iterate until
convergence to ŵ1, solving the
LS system at each step.
Step 9) Compute the optimal weight
vector ŵ1 and/or compute the output
sequence y(ωj, l, ŵ1). [1]
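Steps 1-2 (collecting snapshots and forming frequency-domain snapshots via a blocked, windowed FFT) can be sketched as follows; the Hanning window and the array sizes are illustrative assumptions:

```python
import numpy as np

def freq_snapshots(x, J):
    # Split the (M, N) array data into L = N // J sequential blocks and apply a
    # windowed length-J FFT to each (Steps 1-2), yielding frequency-domain
    # snapshots x(w_j, l) arranged as (bin j, block l, sensor).
    M, N = x.shape
    L = N // J
    win = np.hanning(J)
    blocks = x[:, :L * J].reshape(M, L, J)
    X = np.fft.fft(blocks * win, axis=2)     # FFT along the time axis
    return np.transpose(X, (2, 1, 0))

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 1024))           # M = 4 sensors, N = 1024 samples
snap = freq_snapshots(x, J=64)
print(snap.shape)                            # (J, L, M) = (64, 16, 4)
```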
9. Steered Minimum Variance
Beamforming (STMV)
The Steered Minimum Variance (STMV)
beamformer is defined by finding the
weight vector w which minimizes the
beam power given by equation (13),
subject to the constraint that the
processor gain is unity for a broad-band
plane wave in direction θ. The problem
can alternatively be viewed as one of
estimating the dc component of the
STCM steered in direction θ by means
of the minimum variance (MV) approach. In
either case, this technique has the effect
of choosing w to minimize the power
contribution from sources and noise
not propagating from direction θ. The
solution has been derived by several authors,
and the resulting STCM-based spatial
spectral estimate, denoted the STMV
method, is given by

Zstmv(θ) = [1^H R(θ)^{−1} 1]^{−1},  (14)

where 1 is an M × 1 vector of ones. A
finite-time estimate Ẑstmv(θ) of Zstmv(θ)
is obtained by substituting the
estimate R̂(θ) in place of R(θ) in
equation (14). The comparison of the
STMV method and the CSDM-based
minimum variance distortionless
response (MVDR) method is made
possible by expressing R(θ) as a sum of
cross-spectral density matrices.
Substituting the value of R(θ) in
equation (14) gives

Zstmv(θ) = [1^H [Σ_{k=l}^{h} Tk(θ) K(ωk) Tk(θ)^H]^{−1} 1]^{−1}.  (15)

Observe that in the case h = l,
equation (15) can be rewritten, using the
identity Tk(θ)^{−1} = Tk(θ)^H, as

Zstmv(θ) = [1^H Tl(θ) K(ωl)^{−1} Tl(θ)^H 1]^{−1}.  (16)

Note that Tl(θ)^H 1 = Dl(θ), the direction vector
of an arrival at frequency ωl and
direction θ. Hence, equation (16)
becomes

Zstmv(θ) = [Dl(θ)^H K(ωl)^{−1} Dl(θ)]^{−1}.  (17)
Equation (17) is precisely the
MVDR, or maximum likelihood, spatial
spectral estimate. Thus, in the narrowband
case the STMV reduces to the conventional
MVDR method. For broad-band sources,
the MVDR beam power is obtained by
summing narrow-band beam powers
over the band of interest, i.e.,

Zmvdr(θ) = Σ_{k=l}^{h} [Dk(θ)^H K(ωk)^{−1} Dk(θ)]^{−1}.  (18)

With a finite-time observation, an
estimate Ẑmvdr(θ) can be computed by
substituting K̂(ωk) for its true value K(ωk). The
comparison of equations (15) and (18)
reveals the essential difference between
the STMV and MVDR methods for
broad-band signals. Specifically, in
equation (15), cross-spectral density
matrices are coherently averaged prior
to matrix inversion, while in equation
(18) the matrix inversion is applied to
individual narrow-band CSDMs prior to
averaging. While the STMV method is
strictly suboptimal asymptotically, when
only a limited number of data
snapshots is available the coherent
averaging in equation (15) provides a
more statistically stable matrix to invert,
thus facilitating more accurate spatial
spectral estimation. The steered
covariance matrix is estimated as

R(θ) = Σ_{k=l}^{h} Tk(θ) K(ωk) Tk(θ)^H,  (19)

where K(ωk) = E{Y(k) Y(k)^H} is the
conventional unsteered CSDM at
frequency ωk. The above equation
expresses the STCM in the same form
as the coherently focused covariance matrix
proposed by Wang and Kaveh for the
case where all the sources in a field are
in a single group, unresolved by the
conventional delay-and-sum (DS)
beamformer. In the coherent subspace
method, equation (19) is appropriate
only in the single-group case, since just
one focused covariance matrix is formed,
in which each source has a rank-one
characterization. In the STCM methods,
R(θ) is calculated for each steering
direction θ of interest. The need to
compute R(θ) for each θ makes STCM-based
methods more computationally
intensive than coherent subspace
methods. However, it avoids the
problem of source-location bias
resulting from errors made in forming
focusing matrices. The relationship
between K(ωk), k = l, …, h, and R(θ) as
given in equation (19) suggests a
natural way of estimating R(θ) using
finite-time CSDM estimates K̂(ωk). A
common method of forming K̂(ωk) from
discrete-time sensor outputs is to divide
the T-second observation into N
nonoverlapping segments of ΔT
seconds each and then apply the Discrete
Fourier Transform (DFT) to obtain
uncorrelated frequency-domain vectors
Yn(k) for each segment n = 1, …, N. The
cross-spectral density matrix at
frequency ωk is then estimated by
taking

K̂(ωk) = (1/N) Σ_{n=1}^{N} Yn(k) Yn(k)^H.  (20)
Substituting K̂(ωk) in place of its true
value K(ωk) in equation (19) gives an
estimate of the steered covariance matrix:

R̂(θ) = Σ_{k=l}^{h} Tk(θ) K̂(ωk) Tk(θ)^H.  (21)

Note that efficient computation of
R̂(θ) from equation (21) can be
achieved by using the Fast Fourier
Transform (FFT) to obtain the Yn(k)
from the discrete-time sensor outputs. [10]
The various steps used to perform the
STMV method are as follows:
1) Form the estimated cross-spectral
density matrices K̂(ωk)
over the frequency band of
interest, as given in (20).
2) Compute the estimated steered
covariance matrices R̂(θ) for
each steering direction θ, as given
in (21).
3) Compute R̂(θ)^{−1} and form
Ẑstmv(θ) = [1^H R̂(θ)^{−1} 1]^{−1} for
each steering direction θ to
obtain a broad-band spatial
power spectral estimate, as shown
by equation (14). Note that the
source location is estimated by
determining the peak positions of the
spatial power spectral estimate Ẑstmv(θ). [10]
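A compact numerical sketch of the STMV procedure, equations (14)-(19), is given below. The synthetic CSDMs, the uniform-linear-array geometry, and the diagonal (presteering) focusing matrices Tk(θ) are my own illustrative assumptions, not details from the paper:

```python
import numpy as np

def steering(M, d, theta, omega, c=340.0):
    # Narrowband direction vector D_k(theta) for a uniform linear array.
    tau = np.arange(M) * d * np.sin(theta) / c
    return np.exp(-1j * omega * tau)

def stmv_spectrum(K_list, omegas, thetas, M, d):
    # Z_stmv(theta) = [1^H R(theta)^{-1} 1]^{-1}, with R(theta) from eq. (19)
    # and diagonal presteering matrices T_k(theta) so that T_k(theta)^H 1 = D_k(theta).
    ones = np.ones(M, dtype=complex)
    Z = np.empty(len(thetas))
    for i, th in enumerate(thetas):
        R = np.zeros((M, M), dtype=complex)
        for K, w in zip(K_list, omegas):
            T = np.diag(np.conj(steering(M, d, th, w)))
            R += T @ K @ T.conj().T          # coherent averaging across frequency
        Z[i] = 1.0 / np.real(ones @ np.linalg.solve(R, ones))
    return Z

# Synthetic CSDMs: one broadband source at 20 degrees plus white noise.
M, d = 6, 0.5
theta0 = np.deg2rad(20.0)
omegas = 2 * np.pi * np.linspace(300.0, 600.0, 8)
K_list = [np.outer(D, D.conj()) + 0.1 * np.eye(M)
          for D in (steering(M, d, theta0, w) for w in omegas)]

thetas = np.deg2rad(np.linspace(-90.0, 90.0, 181))
Z = stmv_spectrum(K_list, omegas, thetas, M, d)
print(np.rad2deg(thetas[np.argmax(Z)]))      # peak near the true bearing
```

Only at the true bearing do all frequency bins align coherently after presteering, which is why the broadband spectrum peaks there even though each individual bin may be ambiguous.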
10. Conclusion
The ML-STBF and MV-STBF were
tested on 1) far-field point sources, 2)
Mediterranean vertical array data, and
3) a reverberant room. All these tests
demonstrated the higher performance
and robustness of the novel ML-STBF over
the MV-STBF. [1],[4],[5],[6]
The ML-STBF is based on a concentrated
ML cost function in the frequency
domain and is trained by a fast modified
Newton algorithm. The ML cost
function performs a direction-dependent
spectral whitening on the
beamformer output. The computational
costs of the ML-STBF and MV-STBF are
comparable in most cases and are dominated
by the common preprocessing of the
wideband array data. [7],[8],[9],[10]
References
[1] E. D. Di Claudio and R. Parisi, "Robust ML wideband beamforming in reverberant fields," IEEE Transactions on Signal Processing, vol. 51, no. 2, pp. 338-349, Feb. 2003.
[2] E. D. Di Claudio and R. Parisi, "WAVES: Weighted average of signal subspaces for robust wideband direction finding," IEEE Transactions on Signal Processing, vol. 49, pp. 2179-2191, Oct. 2001.
[3] D. H. Johnson and D. E. Dudgeon, Array Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1993.
[4] J. L. Krolik, "The performance of matched-field beamformers with Mediterranean vertical array data," IEEE Transactions on Signal Processing, vol. 44, pp. 2605-2611, Oct. 1996.
[5] G. Xu, H. P. Lin, S. S. Jeng, and W. J. Vogel, "Experimental studies of spatial signature variation at 900 MHz for smart antenna systems," IEEE Transactions on Antennas and Propagation, vol. 46, pp. 953-962, July 1998.
[6] M. Agrawal and S. Prasad, "Robust adaptive beamforming for wideband moving and coherent jammers via uniform linear arrays," IEEE Transactions on Signal Processing, vol. 47, pp. 1267-1275, Aug. 1999.
[7] Q. G. Liu, B. Champagne, and P. Kabal, "A microphone array processing technique for speech enhancement in a reverberant space," Speech Communication, vol. 18, pp. 317-334, 1996.
[8] B. Champagne, S. Bedard, and A. Stephenne, "Performance of time-delay estimation in the presence of room reverberation," IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 148-152, Mar. 1996.
[9] D. N. Swingler, "A low-complexity MVDR beamformer for use with short observation times," IEEE Transactions on Signal Processing, vol. 47, pp. 1154-1160, Apr. 1999.
[10] J. Krolik and D. N. Swingler, "Multiple wideband source location using steered covariance matrices," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, pp. 1481-1494, Oct. 1989.
Electricity Generation by People Walk through
Piezoelectric Shoe: An Analysis
1. Dr. Monika Jain, 2. Ms. Usha Tiwari, 3. Mohit Gupta, 4. Magandeep Singh Bedi
1. Member IEEE, IETE; Professor, Dept. of Electronics & Instrumentation Engg., Galgotias College of Engineering & Technology, Greater Noida, UP, India
2. Assistant Professor, Dept. of Electronics & Instrumentation, Galgotias College of Engineering & Technology, Greater Noida, UP, India
3 & 4. B.Tech 4th-year students, Dept. of Electronics & Instrumentation, Galgotias College of Engineering & Technology, Greater Noida, UP, India
Abstract— In today's electrical power
crisis, there has been an increasing
demand for low-power, portable energy
sources, driven by the development and
mass consumption of portable electronic
devices. Furthermore, portable energy
sources must contend with competitive
market prices, environmental issues, and
other imposed regulations. These
tremendous demands support a great deal
of research in the area of portable energy
generation methods. In this scope,
piezoelectric materials have always been
an attractive choice for energy
generation and storage. In this paper,
different techniques to generate
electricity using piezoelectric crystals
are explored and analysed. An in-depth
study and analysis describing the use of
piezoelectric polymers to best harvest
energy from people walking, together
with the fabrication of a smart shoe
capable of generating and accumulating
that energy, has been presented.
Keywords— Energy harvesting, PZT,
uninterrupted power supplies.
I. INTRODUCTION
Piezoelectric generators are based on
the piezoelectric effect, i.e., the ability of
certain materials to create an electrical
potential in response to mechanical
changes. In real-world applications, a
piezoelectric material that is compressed,
expanded, or otherwise changing shape
will output a certain voltage. This effect
also works in reverse, in the sense that
putting a charge through the material will
cause it to change shape or undergo
some mechanical stress. These materials
are useful in a variety of ways. Certain
piezoelectric materials handle high
voltage extremely well and are useful in
transformers and other electrical
components. Piezoelectric crystals are a
boon to the sensor technology field, since
they can be used to make motors, to reduce
vibrations in sensitive environments, as
energy collectors, and in many more
applications. In today's power-crisis world,
one of the most interesting areas is energy
collection and generation. In this paper, a
cheap, smart, yet reliable mechanism to
generate energy, capable of charging
phones and MP3 players, is explored and
analysed. An interesting methodology of
power generation through the walking
steps of a human being is reviewed and
presented here. The sole of a shoe can be
constructed of piezoelectric materials, so
that every step a person takes generates
electricity. The electricity generated
through this smart shoe-sole mechanism
can then be stored in a battery or used
immediately in personal electronic
devices.
II. LITERATURE REVIEW
The most common methodologies for
shoe power generators include dielectric
elastomers [1] and piezoelectric ceramics
[2],[3]. The elastomer demonstrated
significant power output, but it required a
large bias (2 kV), and its heavy
construction is likely to negatively affect
the user experience. The power-harvesting
shoe reported in [2] and [3] uses
piezoelectric ceramic bimorphs for power
harvesting. As piezoelectric materials were
employed, no bias voltage was needed.
However, a complex PZT/metal bimorph
was required, and the power output after
dc/dc conversion and regulation was low
(<1 mW) [2]. The schematic of the
microstructured piezoelectric polymer film
used for power generation is shown in
Fig. 1.

Fig. 1: Microstructured piezoelectric polymer film
To increase the transducer power output,
the film is rolled into a 1-cm-thick stack of
approximately 120 layers. The generated
charge per step is Q ≈ e33 F h / (Y t),
where e33 is the piezoelectric coefficient for
compression, F = mg is the force exerted
by the foot, determined by the mass of the
user m and the gravitational constant
g = 9.81 m/s², Y is the Young's modulus of
the film, h is the total transducer height, t is
the film thickness, and N is the number of
film layers in the transducer [6]. The
piezoelectric polymer power generator and
conversion circuit provide over 2 mW of
regulated power at 4.5 V. The transducer is
low cost, ecological, and soft, providing
suitable shock absorption inside the heel.
The design of electromagnetic generators
that can be integrated within shoe soles has
also been described. In this way, parasitic
energy expended by a person while
walking can be tapped and used to power
portable electronic equipment. Designs are
based on discrete permanent magnets and
copper wire coils, and performance is
intended to be improved by applying
micro-fabrication technologies. The
proposed approach is good in the sense that
its voltage levels are comparable with
those of the piezoelectric generator;
however, its complex circuitry is a
constraint. Vibration-based generators
using three types of electromechanical
transducers, electromagnetic [8],
electrostatic [9], and piezoelectric [10],[11],
have also been presented.
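The per-step charge formula Q ≈ e33·F·h/(Y·t) quoted above can be checked with a quick back-of-envelope computation; every numerical value below is an illustrative assumption, not a measurement from the paper:

```python
# Back-of-envelope evaluation of Q = e33 * F * h / (Y * t);
# all material and user values are assumed for illustration only.
e33 = 30e-3          # piezoelectric stress coefficient, C/m^2 (assumed)
m, g = 70.0, 9.81    # user mass in kg (assumed) and gravitational constant, m/s^2
F = m * g            # force exerted by the foot, N
Y = 3e9              # Young's modulus of the polymer film, Pa (assumed)
h = 1e-2             # total transducer height: the 1-cm stack
t = h / 120          # film thickness for ~120 layers, m

Q = e33 * F * h / (Y * t)
print(Q)             # charge per step, in coulombs
```

With these assumed values the result lands in the sub-microcoulomb range per step, which is consistent with the milliwatt-scale regulated power quoted above once the step rate and conversion losses are accounted for.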
In all of these methods, vibrations consist
of a traveling wave in or on a solid
material, and it is often not possible to find
a relative movement within the reach of a
small generator. Therefore, one has to
couple the vibration movement to the
generator by means of the inertia of a
seismic mass.
Energy Storage Density Comparison

Type              Practical Maximum   Aggressive Maximum
Piezoelectric     35.4                335
Electrostatic     4                   44
Electromagnetic   24.8                400
There are two piezoelectric effects that
can be used for technological
applications: the direct piezoelectric effect,
which describes the ability of a given
material to transform mechanical strain
into electrical signals, and the converse
effect, which is the ability to convert an
applied electrical solicitation into
mechanical energy. The direct
piezoelectric effect is more suitable for
sensor applications, whereas the converse
piezoelectric effect is most often
required for actuator applications [12].
High-performance films prepared by
researchers [14],[15] are also explored.
There, the electromechanical properties of the
film were improved by a treatment that
consists of pressing, stretching, and poling
at a high temperature [14].
III. CONCLUSION
In this paper, an analysis of electricity
generation for low-power devices is
presented. Different methodologies for the
generation of electricity are reviewed. We
found that some of the methodologies
are not feasible for real-time portable
charging because they require too much
circuitry, while others are feasible but
remain at the analysis stage. We conclude
that piezoelectric generators implanted in
shoes can provide a great achievement if a
collaborative effort is made to bring to
market a commercial battery charger for
low-power household devices, powered
simply by a person's walking steps.
REFERENCES
[1] Roy Kornbluh, "Power from plastic:
how electroactive polymer artificial
muscles will improve portable power
generation in the 21st century military,"
Presentation [Online]. Available:
http://www.dtic.mil/ndia/2003triservice/korn.ppt
[2] John Kymissis, et al., "Parasitic power
harvesting in shoes," in Proc. of the 2nd
IEEE Int. Conf. on Wearable Computing,
Pittsburgh, PA, pp. 132-139, 19-20 Oct.
1998.
[3] S. Shenck and J. Paradiso, “Energy
scavenging with shoe-mounted
piezoelectrics”, IEEE Micro, Vol. 21, pp.
30-42, May-June, 2001.
[4] P. Miao, et al., "Micro-machined
variable capacitors for power
generation," in Proc. Electrostatics,
Edinburgh, UK, 23-27 Mar. 2003.
[5] Mitcheson, P.D.; Green, T.C.;
Yeatman, E.M.; Holmes, A.S.,
"Architectures for vibration-driven
micropower generators," Journal of
Microelectromechanical Systems, vol.13,
no.3, pp. 429-440, June 2004.
[6] Ville Kaajakari, “Practical MEMS”,
Small Gear Publishing, 2009.
[7] M. Duffy and D. Carroll,
"Electromagnetic generators for power
harvesting," in Proc. 35th Annual IEEE Power
Electronics Specialists Conference,
Aachen, Germany, 2004, pp. 2075-2081.
[8]M. El-hami, P. Glynne-Jones, M.
White, M. Hill, S. Beeby, E. James, D.
Brown, and N. Ross, “Design and
fabrication of a new vibration-based
electromechanical power generator,”
Sens. Actuators A, Phys., vol. 92, no. 1–3,
pp. 335–342, Aug. 2001.
[9] M. Miyazaki, H. Tanaka, G. Ono, T.
Nagano, N. Ohkubo, T. Kawahara, and K.
Yano, “Electric-energy generation using
variablecapacitive resonator for power-
free LSI,” in Proc. ISLPED, 2003, pp.
193–198.
[10] C. Keawboonchuay and T. G. Engel,
“Maximum power generation in a
piezoelectric pulse generator,” IEEE
Trans. Plasma Sci., vol. 31, no. 1, pp. 123–
128, Feb. 2003.
[11] J. Yang, Z. Chen, and Y. Hu, “An
exact analysis of a rectangular plate
piezoelectric generator,” IEEE Trans.
Ultrason., Ferroelectr., Freq. Control, vol.
54, no. 1, pp. 190–195, Jan. 2007.
[12] T. Sterken, P. Fiorini, K. Baert, R.
Puers, and G. Borghs, "An electret-based
electrostatic micro-generator," in Proc.
Transducers, 2003, pp. 1291-1294.
[14] V. Sencadas, R. Gregorio Filho, and
S. Lanceros-Mendez, “Processing and
characterization of a novel nonporous
poly(vinilidene fluoride) films in the β
phase,” J. Non-Cryst. Solids, vol. 352, no.
21/22, pp. 2226–2229, Jul. 2006.
[15] S. Lanceros-Mendez, V. Sencadas,
and R. Gregorio Filho, “A new
electroactive beta PVDF and method for
preparing it,” Patent PT103 318, Jul. 19,
2006.
Abstract—This paper describes a non-parametric approach
for spectral analysis using three different window
functions with three power spectrum estimation
techniques. The window functions used are the Hamming,
Blackman, and Modified Bartlett-Hanning windows,
applied to power spectral estimation with the Periodogram,
Welch, and autocorrelation methods. The role of these
different window functions has been analyzed in terms
of spectral leakage and scalloping loss, and the
objective of using three different techniques for power
spectral density estimation is to find the bandwidth (BW)
of the signal. This work has been further extended to
spectral analysis of voice signals to detect the fundamental
frequency of the speaker. Frequency-domain
cepstrum analysis of voiced speech segments is also
used; this is the conventional method of fundamental peak
picking, i.e., of the fundamental frequency or pitch. Voice
segments of different speakers with a minimum 30 dB
SNR threshold have been taken, and the cepstrum has
been analyzed using the different window functions.
Index Terms— Autocorrelation, Cepstrum, MBH,
periodogram, PSD, Pitch, spectrum, welch.
I. INTRODUCTION
Power Spectral Estimation is the method of
determining the power spectral density (PSD) of
a random process, which provides information about
the structure of the spectrum. The purpose of estimating
the spectral density is to detect any periodicities in
the data by observing peaks at the frequencies
corresponding to these periodicities. The two approaches
to spectrum estimation are parametric and
non-parametric methods. In the parametric approach, the
task is to estimate the parameters of a model that
describes the stochastic process, while non-parametric
estimation is based on the assumption that the
observed samples are wide-sense stationary with zero
mean [3]. The spectral analysis of a noise-like
random signal is therefore usually carried out by
non-parametric methods such as the Periodogram and
Welch methods. To analyze these methods, we must first
consider the role of the different window functions.
In spectral analysis, the discontinuity resulting from the
periodic extension of the signal gives rise to leakage
at the end points, and high side-lobe levels result in
false frequency detection; both effects are reduced by the
use of window functions. Bin crossover results in a
signal detection loss (scalloping loss) due to the reduced
signal level at frequencies between the bin centers;
the window function modifies the frequency response
so as to reduce these bin-crossover losses.
Pitch is the fundamental parameter of speech [11].
Pitch detection is one of the important tasks of speech
signal processing [5],[7],[9]. The pitch, i.e., the
fundamental frequency of voice signals, varies from
40 Hz to 600 Hz. Accurate representation of the
voiced/unvoiced character of speech plays an important
role in Voice Activity Detection (VAD), coding, synthesis,
speech training, speech and speaker recognition systems,
and vocoders [6],[8]. To accurately detect and estimate
the fundamental frequency of a speaker, we use
cepstrum [5] analysis, also called the spectrum of the spectrum.
It is used to separate the excitation signal (pitch) and the
transfer function (voice quality). One algorithm that shows
good performance for quasi-periodic signals is the
cepstrum (CEP) algorithm. However, its ability to separate
the source signal (which conveys the pitch information)
from the vocal tract response fails wherever the speech
frame cannot be regarded as just the result of a linear
convolution between the two components, as occurs in
transitions or non-stationary speech segments, or
Spectral and Cepstral Analysis Using the Modified
Bartlett-Hanning Window
Rohit Pandey1, Rohit Kumar Agrawal1, Sneha Shree1
1Department of Electronics & Communication Engineering, Jaypee University of Engineering & Technology, Guna, MP, India
[email protected], [email protected]
when the recorded speech signal includes additive
noise [5],[7]
.
II. WINDOW FUNCTIONS
The following window functions are used for the spectrum and
cepstrum analysis.

The Modified Bartlett-Hanning (MBH) window is extended to the form [1]

w(t, α) = α − (4α − 2)|t| + (1 − α) cos(2πt);  |t| ≤ 0.5, 0.5 ≤ α < 1.88,  (1)

where α is an index parameter.

Blackman window:
w(n) = 0.42 − 0.50 cos(2πn/(M−1)) + 0.08 cos(4πn/(M−1)).  (2)

Hamming window:
w(n) = 0.54 − 0.46 cos(2πn/(M−1)),  (3)

where M is the window length and n = 0, 1, …, M−1 indexes the samples.
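The three windows of equations (1)-(3) can be generated directly; the sketch below assumes the index parameter α = 0.62, which reproduces the standard Bartlett-Hanning coefficients (0.62, 0.48, 0.38):

```python
import numpy as np

def mbh_window(M, alpha=0.62):
    # Modified Bartlett-Hanning window of eq. (1); alpha = 0.62 yields the
    # standard Bartlett-Hanning coefficients (0.62, 0.48, 0.38).
    t = np.linspace(-0.5, 0.5, M)
    return alpha - (4 * alpha - 2) * np.abs(t) + (1 - alpha) * np.cos(2 * np.pi * t)

def blackman_window(M):
    n = np.arange(M)   # eq. (2)
    return 0.42 - 0.50 * np.cos(2 * np.pi * n / (M - 1)) + 0.08 * np.cos(4 * np.pi * n / (M - 1))

def hamming_window(M):
    n = np.arange(M)   # eq. (3)
    return 0.54 - 0.46 * np.cos(2 * np.pi * n / (M - 1))

M = 65
for name, w in [("MBH", mbh_window(M)), ("Blackman", blackman_window(M)),
                ("Hamming", hamming_window(M))]:
    print(name, w[0], w.max())   # endpoint value and peak of each window
```

Note how the Hamming window's nonzero endpoints (0.08) trade higher spectral leakage at the frame edges for a narrower main lobe than the MBH and Blackman windows, whose endpoints fall to zero.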
III. SPECTRAL ESTIMATION TECHNIQUES
In the Periodogram method [3], the sequence x[n] is first made
finite by applying a window function. The windowed sequence is
then autocorrelated, and the periodogram is calculated as

I_N(e^{jω}) = (1/N) |Σ_{n=0}^{N−1} x[n] e^{−jωn}|²,  (4)

where N is the length of the finite sequence.

In the Welch method [2], the data are first sectioned according to
the sequence length. For each section x_r[n] of length m, a modified
periodogram [3] is calculated,

I_r(e^{jω}) = (1/m) |Σ_{n=0}^{m−1} x_r[n] e^{−jωn}|²,  (5)

and the periodograms of all sections are averaged.

The autocorrelation method [3] extracts the similarities between
the signals, given by

φ_xx[m] = Σ_{n=0}^{N−m−1} x[n] x[n+m].  (6)

The sequence x[n] is windowed and autocorrelated, and the PSD is
calculated as

Φ_xx(e^{jω}) = Σ_{m=0}^{N−1} φ_xx[m] e^{−jωm}.  (7)
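Equations (4)-(5) can be realized directly in code. The sketch below is a hypothetical setup using the two-tone test signal of Section VI plus white noise, with a Hamming data window; the window-power normalization is my own choice for a consistent PSD scale:

```python
import numpy as np

fs = 1000.0
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(0)
# Two-tone test signal of Section VI plus white noise.
x = 2.0 * np.sin(2 * np.pi * 150 * t) + 1.5 * np.sin(2 * np.pi * 175 * t)
x = x + 0.3 * rng.standard_normal(len(t))

def periodogram(seg, w, fs):
    # Eq. (4), normalized by the window power for a consistent PSD scale.
    X = np.fft.rfft(seg * w)
    psd = np.abs(X) ** 2 / (fs * np.sum(w ** 2))
    return np.fft.rfftfreq(len(seg), 1 / fs), psd

def welch(x, w, fs):
    # Eq. (5): average the modified periodograms of nonoverlapping sections.
    m = len(w)
    psds = [periodogram(x[s:s + m], w, fs)[1] for s in range(0, len(x) - m + 1, m)]
    return np.fft.rfftfreq(m, 1 / fs), np.mean(psds, axis=0)

f, p = periodogram(x, np.hamming(len(x)), fs)
fw, pw = welch(x, np.hamming(250), fs)
print(f[np.argmax(p)], fw[np.argmax(pw)])   # strongest component, near 150 Hz
```

The full-length periodogram resolves both tones but is noisy; the sectioned Welch average smooths the estimate at the cost of coarser frequency resolution, which is the tradeoff examined in the results below.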
IV. CEPSTRUM ANALYSIS
The cepstrum is a frequency-domain analysis of voiced
speech segments. The real cepstrum is the inverse Fourier
transform of the log magnitude of the Fourier transform of
the signal. The algorithm is: signal → FT → abs() → log → IFT.
It is similar to spectral analysis of the signal, except that
for the cepstrum we take the logarithm of the spectrum [10].
This is because speech signals are quasi-periodic in nature,
and spectrum analysis alone is not very useful for extracting
the characteristic features of voice signals. While
calculating the cepstrum we have taken speech samples of
25 ms with a sampling frequency fs of 8000 Hz.
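The FT → abs() → log → IFT chain is only a few lines of NumPy (a generic sketch; the small eps guarding log(0) at spectral nulls is an added assumption):

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """Real cepstrum: inverse FFT of the log magnitude of the FFT.
    eps guards against log(0) at exact spectral nulls."""
    spectrum = np.fft.fft(x)
    log_mag = np.log(np.abs(spectrum) + eps)
    return np.fft.ifft(log_mag).real
```

For a unit impulse the magnitude spectrum is flat (|X| = 1), so its real cepstrum is essentially zero everywhere.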
V. APPROACH FOR CEPSTRUM ANALYSIS
The processing chain for cepstrum analysis is:

Voice Signal → LPF → windowing → FFT → abs → log → IFFT → windowing* → Normalization → Smooth cepstrum

This procedure of smoothing the composite log spectrum to
obtain the log spectral envelope is referred to as cepstral
smoothing.
VI. SIMULATION AND RESULTS
A sinusoidal signal with two frequency components and two
amplitudes has been taken as the input signal:
A·sin(2πft), with A = [2, 1.5] and f = [150, 175] Hz.
To estimate the spectrum of a noisy signal, a random noise
sequence generated in MATLAB is added to the sinusoidal
signal.
Fig. 1 (Periodogram method)
TABLE-I (bandwidth of periodogram [4])
Fig. 2 (Welch method)
TABLE-II (bandwidth of Welch [4])
Fig. 3 (Autocorrelation method)
TABLE-III (bandwidth of auto-correlation [4])
For cepstrum analysis we have taken voice samples of two
speakers, of duration 25 ms each, and passed them through a
low-pass filter with cut-off frequency 0.15π (normalized
radian frequency). The low-pass filter eliminates the
high-frequency additive noise; the cepstrum of the filtered
voice samples is then analyzed for pitch detection.
Windowing is done after the IFFT to smooth the cepstrum and
detect clear cepstral peaks.
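As a hypothetical end-to-end illustration (not the authors' code), the pitch of a synthetic harmonic signal at fs = 8000 Hz can be read off the dominant cepstral peak; the 80-400 Hz search range is an assumption about typical voice pitch:

```python
import numpy as np

def cepstral_pitch(x, fs=8000, fmin=80, fmax=400):
    """Estimate pitch as the quefrency of the strongest cepstral peak
    inside the assumed [fmin, fmax] pitch range."""
    spectrum = np.abs(np.fft.fft(x * np.hamming(len(x))))
    cep = np.fft.ifft(np.log(spectrum + 1e-12)).real
    lo, hi = int(fs / fmax), int(fs / fmin)   # quefrency range in samples
    peak = lo + int(np.argmax(cep[lo:hi]))
    return fs / peak

# Synthetic "voiced" signal: 100 Hz fundamental with 10 harmonics.
fs, f0 = 8000, 100
n = np.arange(2048)
x = sum(np.sin(2 * np.pi * k * f0 * n / fs) / k for k in range(1, 11))
```

The harmonic comb spaced f0 apart in the log spectrum folds into a cepstral peak near quefrency fs/f0 samples, so `cepstral_pitch(x)` should land close to 100 Hz.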
Fig. 4 (Cepstrum Analysis)
Fig. 5 (Smooth Cepstrum)
VII. CONCLUSION
The aim is to detect and estimate the signal [3]. For the
identification of two different frequency components in the
presence of noise, different threshold levels have been
taken, starting from -3 dB [4]. In the periodogram method
(Fig. 1), thresholds of -3 dB, -6 dB and -15 dB are taken;
it is observed from the results (TABLE-I) that at -3 dB the
two sinusoidal peaks are not detected, and beyond -15 dB
noise is detected. The same holds for the autocorrelation
PSD method: at -3 dB (TABLE-III) no peaks are detected, but
the signals can be detected down to -20 dB, better than the
periodogram method. In the Welch method (TABLE-II, Fig. 2),
however, detection at -3 dB is possible, i.e. -3 dB is the
minimum threshold needed to detect the signal. As the
Fourier transform of a sinusoidal signal is an impulse, in
the Welch method (TABLE-II) the MBH window suppresses the
side-lobe levels more and makes the main lobe narrower
(tending to an impulse, Fig. 2) than the Hamming and
Blackman windows.
This comparison shows that the Welch method gives better
results than the periodogram and autocorrelation methods,
and that using the Welch method with the MBH window gives
more accurate results than with the Hamming and Blackman
windows.
Taking "HELLO" as an iterative voice sample for two
speakers, we have estimated the average pitch. It is
observed that the error in pitch detection is larger in the
cepstrum (Fig. 4) than in the smooth cepstrum (Fig. 5).
Considering 0.4 as the threshold level, the periodicity in
the smooth cepstrum is more distinguishable, and hence the
pitch can easily be detected. This can further be used in
voice recognition systems to minimize the false acceptance
rate (FAR) and false rejection rate (FRR).
ACKNOWLEDGMENT
The authors acknowledge the valuable guidance of
Prof. Rajiv Saxena, which helped to improve the quality of
the paper.
REFERENCES
[1] J. K. Gautam, A. Kumar, and R. Saxena, "On the Modified Bartlett-Hanning Window (Family)," IEEE Transactions on Signal Processing, vol. 44, no. 8, pp. 2098-2102, August 1996.
[2] P. D. Welch, "The Use of FFT for the Estimation of Power Spectra: A Method Based on Time Averaging Over Short, Modified Periodograms," IEEE Transactions on Audio and Electroacoustics, vol. AU-15, no. 2, pp. 70-73, June 1967.
[3] D. J. DeFatta, J. G. Lucas, and W. S. Hodgkiss, Digital Signal Processing.
[4] F. J. Harris, "On the use of windows for harmonic analysis with the discrete Fourier transform," Proc. IEEE, vol. 66, pp. 51-83, Jan. 1978.
[5] D. G. Childers, D. P. Skinner, and R. C. Kemerait, "The Cepstrum: A Guide to Processing," Proceedings of the IEEE, vol. 65, no. 10, pp. 1428-1443, October 1977.
[6] J. W. Picone, "Signal Modeling Techniques in Speech Recognition," Proceedings of the IEEE, vol. 81, no. 9, pp. 1215-1247, September 1993.
[7] A. M. Noll, "Cepstrum pitch determination," J. Acoust. Soc. Am., vol. 41, no. 2, pp. 293-309, Feb. 1967.
[8] "A Tutorial on Text-Independent Speaker Verification," EURASIP Journal on Applied Signal Processing, 2004:4, pp. 430-451.
[9] http://en.wikipedia.org/wiki/Cepstrum
[10] J. L. Flanagan, "Spectrum Analysis in Speech Coding," IEEE Transactions on Audio and Electroacoustics, vol. AU-15, no. 2, pp. 66-69, June 1967.
[11] http://en.wikipedia.org/wiki/Human_voice
A 3D APPROACH TO FACE-EXPRESSION RECOGNITION
Akshay Gupta, Ananya Misra, Hridesh Verma, Garima Chandel (Member, IEEE)
ABES Institute of Technology, Ghaziabad-201009, India.
[email protected] , [email protected] , [email protected] , [email protected]
ABSTRACT: Face recognition has been an active research area for the last couple of decades. With the advancement of 3D imaging technology, 3D face recognition emerges as an alternative to overcome the problems inherent to 2D face recognition, i.e. sensitivity to illumination conditions and positions of a subject. But 3D face recognition still needs to tackle the problem of deformation of facial geometry that results from the expression changes of a subject. To deal with this issue, a 3D face recognition framework is proposed in this paper. It is a combination of three subsystems: an expression recognition system, an expressional face recognition system and a neutral face recognition system. A system for the recognition of faces with one type of expression (smile) and neutral faces was implemented and tested on a database of 30 subjects. The results proved the feasibility of this framework.
Index Terms- face recognition, databases, neutral face, smiling face, image acquisition.
I. INTRODUCTION
Most face recognition attempts made so far use 2D intensity images as the data format for processing. In spite of the success achieved by 2D recognition methods, certain problems still exist. 2D face images depend not only on the face of a subject, but also on imaging factors, such as the environmental illumination and the orientation of the subject. These variable factors can cause the failure of a 2D face recognition system. With the advancement of 3D imaging technology, more attention is given to 3D face recognition, which is robust with respect to illumination variation and pose orientation. In [1], Bowyer et al. provide a survey of 3D face recognition technology. Most 3D face recognition systems treat the 3D face surface as a rigid surface. But actually, the face surface is deformed by the different expressions of the subject, which causes the failure of the systems that treat the face as a rigid surface. The involvement of facial
expression has become a big challenge in 3D face recognition systems. In this paper, we propose an approach to tackle this problem, through the integration of expression recognition and face recognition in a system.
II. EXPRESSION AND FACE RECOGNITION
From the psychological point of view, it is still not known whether facial expression recognition information aids the recognition of faces by human beings. It is found that people are slower in identifying happy and angry faces than they are in identifying faces with a neutral expression. The proposed framework involves an initial assessment of the expression of an unknown face, and uses that assessment to assist the progress of its recognition. The incoming 3D range image is processed by an expression recognition system to find the most appropriate expression label for it. The expression labels include the six prototypical expressions of the faces, which are happiness, sadness, anger, fear, surprise and disgust, plus the neutral expression. According to the different expressions, a matching face recognition system is then applied. If the expression is recognized as neutral, then the incoming 3D range image is directly passed to the neutral expression face recognition system, which uses the features of the probe image to directly match those of the gallery images, which are all neutral, to get the closest match. If the expression found is not neutral, then for each of the six expressions, a separate face recognition subsystem should be used. The system will find the right face through modelling the variations of the face features between the neutral face and the face with expression. Figure 1 shows a simplified version of this framework. This simplified diagram only deals with the smiling expression, which is the one most commonly displayed by people publicly.
III. DATA ACQUISITION AND PROCESSING
To test the approach proposed in this model, a database, which includes 30 subjects, was built. In
this database, we test the different processing of the two most common expressions, i.e., smiling versus neutral. Each subject participated in two sessions of the data acquisition process, which took place in two different days. In each session, two 3D scans were acquired with a Polhemus Fastscan scanner. One was a neutral expression; the other was a happy (smiling) expression. The resulting database contains 60 3D neutral scans and 60 3D smiling scans of 30 subjects.
Figure 1 - Simplified framework of 3D face recognition
The left image in Figure 2 shows an example of the 3D scans obtained using this scanner, the right image is the 2.5D range image used in the algorithm.
Figure 2- 3D surface (left) and a mesh plot of the converted range image (right)
IV. EXPRESSION RECOGNITION
The face expression is a basic mode of nonverbal communication among people. In [5], Ekman and Friesen proposed six primary emotions. Each possesses a distinctive content together with a unique facial expression. These six emotions are happiness, sadness, fear, disgust, surprise and anger. Together with the neutral expression, they also form the seven basic prototypical facial expressions.
In our experiment, we aim to recognize social smiles, which were posed by each subject. Smiling is
generated by contraction of the zygomatic major muscle. This muscle lifts the corner of the mouth obliquely upwards and laterally, producing a characteristic “smiling expression”. So, the most distinctive features associated with the smile are the bulging of the cheek muscle and the uplift of the corner of the mouth, as shown in Figure 3. The following steps are followed to extract six representative features for the smiling expression:-
1. An algorithm is developed to obtain the coordinates of five characteristic points in the face range image as shown in Figure 3. A and D are the extreme points of the base of the nose. B and E are the points defined by the corners of the mouth. C is in the middle of the lower lip.
Figure 3- Illustration of features of a smiling face versus a neutral face
2. The first feature is the width of the mouth, BE, normalized by the length of AD. Obviously, while smiling the mouth becomes wider. The first feature is represented by mw.
3. The second feature is the depth of the mouth (the difference between the Z coordinates of points B and C, and of points E and C), normalized by the height of the nose, to capture the fact that the smiling expression pulls back the mouth. This second feature is represented by md.
4. The third feature is the uplift of the corners of the mouth compared with the middle of the lower lip (d1 and d2, as shown in the figure), normalized by the difference of the Y coordinates of points A and B, and of points D and E, respectively, and represented by lc.
5. The fourth feature is the angle of line AB and line DE with the central vertical profile, represented by ag.
6. The last two features are extracted from the semicircular areas shown, which are defined by using line AB and line DE as diameters. The histograms of the range (Z coordinates) of all the points within these two semicircles are calculated.
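The first feature, mw, reduces to a ratio of two landmark distances. A minimal sketch with made-up coordinates (all names and values below are hypothetical, for illustration only):

```python
import numpy as np

def mouth_width_feature(A, B, D, E):
    """Feature mw: mouth width |BE| normalized by nose-base width |AD|.
    The depth and uplift features additionally use point C (mid lower lip)."""
    A, B, D, E = (np.asarray(p, dtype=float) for p in (A, B, D, E))
    return np.linalg.norm(E - B) / np.linalg.norm(D - A)

# Hypothetical landmark coordinates (x, y, z) in arbitrary units:
A, D = (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)      # extreme points of the nose base
B, E = (-1.5, -1.0, 0.0), (1.5, -1.0, 0.0)    # corners of the mouth
```

Since a smile widens the mouth, mw grows relative to the neutral face of the same subject, which is what makes the ratio discriminative.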
Figure 4 shows the histograms for the smiling and the neutral faces of the subject in Figure 3. The two figures in the first row are the histograms of the range
values for the left cheek and right cheek of the neutral face image; the two figures in the second row are the histograms of the range values for the left cheek and right cheek of the smiling face image.
Figure 4- Histogram of range of cheeks (L &R) for neutral (top row), and smiling (bottom row) face.
From the above figures, we can see that the range histograms of the neutral and smiling expressions are different. The smiling face tends to have large values at the high end of the histogram because of the bulge of the cheek muscle. On the other hand, a neutral face has large values at the low end of the histogram distribution. Therefore two features can be obtained from the histogram.
One is called the ‘histogram ratio’, represented by hr, the other is called the ‘histogram maximum’, represented by hm.
hr = (h6 + h7 + h8 + h9 + h10) / (h1 + h2 + h3 + h4 + h5)
hm = i, where i = arg max_i h(i)
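The two histogram features are then a ratio of the upper five bins to the lower five, and the index of the tallest bin — a small sketch, assuming a 10-bin range histogram as in the figures (bins indexed 1..10 in the text, 0..9 in the array):

```python
import numpy as np

def histogram_features(h):
    """h: 10-bin range histogram (h[0] = h1, ..., h[9] = h10).
    hr = (h6+...+h10)/(h1+...+h5); hm = 1-based index of the maximum bin."""
    h = np.asarray(h, dtype=float)
    hr = h[5:].sum() / h[:5].sum()
    hm = int(np.argmax(h)) + 1
    return hr, hm
```

A smiling cheek bulges, pushing mass to the high bins and raising hr; a neutral cheek concentrates mass in the low bins.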
After the six features have been extracted, this becomes a general classification problem. Two
pattern classification methods are applied to recognize the expression of the incoming faces. The first method used is a linear discriminant (LDA) classifier, which seeks the best set of features to separate the classes. The other method used is a support vector machine (SVM).
V. 3D FACE RECOGNITION
A. Neutral face recognition
In our earlier research work, we have found that the
central vertical profile and the contour are both discriminant features for every person. Therefore, for neutral face recognition, the results of central vertical profile matching and contour matching are combined. The combination of the two classifiers improves the overall performance significantly. The final similarity score for the probe image is the product of ranks for each of the two classifiers (based on the central vertical profile and contour). The image with the smallest score in the gallery will be chosen as the matching face for the probe image.
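The product-of-ranks fusion described above can be sketched generically (an illustration, assuming both matchers return distances where smaller means better):

```python
import numpy as np

def ranks(distances):
    """Rank of each gallery entry (1 = best match, i.e. smallest distance)."""
    order = np.argsort(distances)
    r = np.empty(len(distances), dtype=int)
    r[order] = np.arange(1, len(distances) + 1)
    return r

def fuse_by_rank_product(profile_dist, contour_dist):
    """Final similarity score = product of the two classifiers' ranks;
    the gallery image with the smallest score is chosen as the match."""
    score = ranks(profile_dist) * ranks(contour_dist)
    return int(np.argmin(score)), score
```

Rank-product fusion ignores the raw score scales of the two matchers, so neither classifier can dominate just because its distances are numerically larger.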
B. Smiling face recognition
For the recognition of smiling faces we have adopted the probabilistic subspace method proposed by B. Moghaddam et al. [8,9]. It is an unsupervised technique for visual learning, based on density estimation in high-dimensional spaces using eigen-decomposition. Using the probabilistic subspace method, a multi-class classification problem can be converted into a binary classification problem. In the experiment for smiling face recognition, because of the limited number of subjects (30), the central vertical profile and the contour are not used directly as vectors in a high-dimensional subspace. Instead, they are down-sampled to a dimension of 17. The dimension of the difference-in-feature space is set to 10, which contains approximately 97% of the total variance. The dimension of the difference-from-feature space is 7.
In this case also, the results of central vertical profile matching and contour matching are combined, improving the overall performance. The final similarity score for the probe image is the product of ranks for each of the two classifiers. The image with the smallest score in the gallery will be chosen as the matching face for the probe image.
VI. EXPERIMENTS AND RESULTS
One gallery and three probe databases were used for evaluation. The gallery database has 30 neutral faces, one for each subject, recorded in the first data acquisition session. Three probe sets are formed as follows:
Probe set 1: 30 neutral faces acquired in the second session.
Probe set 2: 30 smiling faces acquired in the second session.
Probe set 3: 60 faces (probe set 1 and probe set 2).
Experiment 1: Testing the expression recognition module

The leave-one-out cross validation method is used to test the expression recognition classifier. Each time, the faces collected from 29 subjects in both data acquisition sessions are used to train the classifier, and the four faces of the remaining subject, collected in both sessions, are used to test it. Two classifiers are used. One is the linear discriminant (LDA) classifier; the other is a support vector machine (SVM) classifier. LDA tries to find the subspace that best discriminates different classes by maximizing the between-class scatter matrix while minimizing the within-class scatter matrix in the projective subspace. The support vector machine is a relatively new technology for classification. It relies on pre-processing the data to represent patterns in a dimension typically much higher than the original feature space. With an appropriate nonlinear mapping to a sufficiently high dimension, data from two categories can always be separated by a hyperplane.

Table 1 - Expression recognition results
Method: LDA | SVM
Expression recognition rate (%): 90.8 | 92.5

Experiment 2: Testing the neutral and smiling recognition modules separately

In the first two sub-experiments, probe faces are directly fed to the neutral face recognition module. In the third sub-experiment, leave-one-out cross validation is used to verify the performance of the smiling face recognition module.
a. Neutral face recognition: probe set 1 (neutral face recognition module used).
b. Neutral face recognition: probe set 2 (neutral face recognition module used).
c. Smiling face recognition: probe set 2 (smiling face recognition module used).

From Figure 5, it can be seen that when the incoming faces are all neutral, the algorithm which treats all the faces as neutral achieves a very high recognition rate.

Figure 5 - Results of Experiment 2 (three sub-experiments; rank 1 and rank 3 recognition rates)

On the other hand, if the incoming faces are smiling, the neutral face recognition algorithm does not perform well; only a 57% rank-one recognition rate is obtained. (Rank one means that only the face which scores highest is selected from the gallery; the rank-one recognition rate is the ratio between the number of faces correctly recognized and the number of probe faces. Rank three means that the three highest-scoring faces are selected instead of one.) In contrast, when the smiling face recognition algorithm is used to deal with smiling faces, the recognition rate can be as high as 80%.

Experiment 3: Testing a practical scenario

These experiments emulate a realistic situation in which a mixture of neutral and smiling faces (probe set 3) must be recognized. Sub-experiment 1 investigates the performance obtained if the expression recognition front end is bypassed and the recognition of all the probe faces is attempted with the neutral face recognition module alone. The last two sub-experiments implement the full framework shown in Figure 1. In 3.2 the expression recognition is performed with the linear discriminant classifier, while in 3.3 it is implemented through the support vector machine approach.
a. Neutral face recognition module used alone: probe set 3 is used.
b. Integrated expression and face recognition: probe set 3 is used (linear discriminant classifier for expression recognition).
c. Integrated expression and face recognition: probe set 3 is used (support vector machine for expression recognition).

Figure 6 - Results of Experiment 3 (three sub-experiments; rank 1 and rank 3 recognition rates)

It can be seen in Figure 6 that if the incoming faces include both neutral faces and smiling faces, the recognition rate can be improved by about 10 percent by using the integrated framework proposed here.

CONCLUSION

The work reported in this paper represents an attempt to acknowledge and account for the presence of expression in 3D face images, towards their improved identification. The method introduced here is computationally efficient. Furthermore, it also yields, as a secondary result, the information of the expression found in the faces. Based on these findings, we believe that the acknowledgement of the impact of expression on 3D face recognition and the development of systems that account for it, such as the framework introduced here, will be key to future enhancements in the field of 3D Automatic Face Recognition.

REFERENCES

[1] K. Bowyer, K. Chang, and P. Flynn, "A Survey of Approaches to 3D and Multi-Modal 3D+2D Face Recognition," IEEE Intl. Conf. on Pattern Recognition, 2004.
[2] R. Chellappa, C. Wilson, and S. Sirohey, "Human and Machine Recognition of Faces: A Survey," Proceedings of the IEEE, 83(5): pp. 705-740, 1995.
[3] www.polhemus.com.
[4] C. Li, A. Barreto, J. Zhai and C. Chin, "Exploring Face Recognition Using 3D Profiles and Contours," IEEE SoutheastCon 2005, Fort Lauderdale.
[5] P. Ekman, W. Friesen, "Constants across cultures in the face and emotion," Journal of Personality and Social Psychology, 1971, 17(2): pp. 124-129.
[6] Y. Hu, D. Jiang, S. Yan, L. Zhang, and H. Zhang, "Automatic 3D Reconstruction for Face Recognition," International Conference on Automatic Face and Gesture Recognition, Seoul, 2004.
[7] "Notre Dame 3D Face Database," http://www.nd.edu/~cvrl/.
[8] B. Moghaddam, A. Pentland, "Probabilistic Visual Learning for Object Detection," International Conference on Computer Vision (ICCV '95), 1995.
[9] B. Moghaddam, A. Pentland, "Probabilistic Visual Learning for Object Representation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 1997, 19(7): pp. 696-710.
Performance Evaluation of Signal Selective DOA Tracking for Wideband Cyclostationary Sources

Sandeep Santosh(1), O.P. Sahu(2), Monika Aggarwal(3)
(1) Asst. Prof., Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, India
(2) Associate Prof., Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, India
(3) Associate Prof., Centre for Applied Research in Electronics (CARE), Indian Institute of Technology, New Delhi, India
[email protected] http://www.nitkkr.ac.in
Abstract
In this paper, we present a new signal-selective direction
of arrival (DOA) tracking algorithm for moving sources
emitting narrowband or wideband cyclostationary signals.
Here, the DOAs of the sources are updated recursively based
on the most current array output, in a way that no data
association is needed. The interference and noise are
suppressed by exploiting cyclostationarity; only the
sources of interest are tracked. The tracking performance
of this algorithm can be improved via the Kalman filter.
Index Terms – Array signal processing,
cyclostationarity, direction of arrival
tracking.
1. Introduction
Direction of arrival (DOA) tracking of multiple moving
sources has been a central research topic in signal
processing for decades, due to its wide applications such
as surveillance in military applications and air traffic
control in civilian applications. One obvious method of DOA
tracking is to first find the DOAs by an existing DOA
estimation algorithm for each time frame, on the assumption
that directions do not change within each time frame, and
then to associate each of the newly estimated DOAs to the
previous estimates in order to keep tracking the DOA
changes and source movement. A major problem of this method
is that data association, or correctly assigning the
estimated DOAs at each time frame to their corresponding
previous estimates to form DOA tracks, requires extensive
computations. Data association involves searching over the
possible combinations between the estimated DOAs and the
targets, where I is the number of DOAs [1]. Some
DOA tracking algorithms which do not
require data association have been proposed
such as [1]-[5].The authors of [1] obtain the
current DOA estimates of the sources by
minimizing the norm of an error matrix
function, based on a covariance matrix
related to an array output at the current time
frame. The authors of [2] track the source
movement by estimating DOA changes for
each time frame, rather than new DOAs
through solving a least squares (LS)
problem. The authors of [3] improve the
performance of [2] by employing the source
movement model and refining the updated
DOAs through a Kalman filter. The authors
of [4] update the DOA estimates of each
time frame by solving a maximum–
likelihood (ML) problem of most current
array output. This approach also employs a
source movement model and refines the
DOA estimates through a Kalman filter as in
[3].The authors of [5] introduce multiple
target states (MTS) to describe the target
motion ,and the DOA tracking is
implemented through updating the MTS by
maximizing the likelihood function of the
array output. Whether by LS or ML method,
whether introducing MTS or other models to
describe the target motion , whether using
Kalman filter or not ,all these algorithms
implement the DOA tracking in a way that
the order of the estimated DOAs for
different times or time frames is maintained
, thus data association is avoided. Therefore,
they are more computationally efficient than
the methods requiring the data association.
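The Kalman refinement used in [3] and [4], and later in this paper, is in its simplest form a constant-velocity filter applied to each DOA track. A generic single-track sketch (the noise levels q and r are illustrative assumptions, not values from the cited papers):

```python
import numpy as np

def kalman_doa_track(measurements, dt=1.0, q=1e-4, r=1.0):
    """Constant-velocity Kalman filter smoothing a sequence of noisy
    DOA estimates (degrees). State = [angle, angular rate]."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity dynamics
    H = np.array([[1.0, 0.0]])              # only the angle is observed
    Q = q * np.eye(2)                       # process noise covariance
    R = np.array([[r]])                     # measurement noise covariance
    x = np.array([measurements[0], 0.0])    # initial state
    P = np.eye(2) * 10.0                    # initial uncertainty
    out = []
    for z in measurements:
        x = F @ x                           # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R                 # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ (np.array([z]) - H @ x) # update with the new DOA estimate
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0])
    return np.array(out)
```

Because the filter carries an angular-rate state, it keeps the per-frame DOA estimates ordered along each track, which is what lets these methods avoid explicit data association.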
All the above methods are applicable to
narrowband signals and they would fail for
wideband signals. Wideband signals are
becoming more and more common
nowadays. Therefore, research work on
developing DOA tracking algorithms that
work for wideband sources has been carried
out[6]-[8].The authors of [6] use focusing
matrices to align steering vectors of different
frequency bins to carrier frequency so that
wideband signals can be treated the same
way as narrowband signals in estimating the
DOAs by multiple signal
classification (MUSIC) [9]. When new data
arrive, [6] first updates the focusing matrices
and then applies MUSIC to obtain new
estimated DOAs. In [7], the authors estimate
the DOAs of each time frame by an ML
approach. For multiple targets both [6] and
[7] require data association. In [7], the data
association is done by Bayes classifier
which is computationally expensive. The
authors of [8] develop two computationally
simple methods for DOA tracking based on
recursive expectation and maximization
(REM) algorithm. These two methods apply
for both narrowband and wideband signals .
From [8], the first method does not work
properly when two DOAs are crossing, and
the second method requires a linear DOA
motion model, restricting DOA tracks to
only straight lines.
Recently, a statistical property,
cyclostationarity, which many types of
man-made signals in communications, such
as BPSK, FSK and AM, exhibit, has been
exploited in DOA estimation [10]-[12]. By
exploiting
cyclostationarity, interference and noise that
do not share the same cycle frequency as the
desired signals or do not exhibit
cyclostationarity can be suppressed ,thus
performance of DOA estimation is improved
when the DOA of interference is close to
DOA of desired signal. The
Cyclostationarity could be exploited to
improve performance of DOA tracking. All
the DOA tracking algorithms discussed
previously [1]-[7] assume that the signals
are stationary but not cyclostationary .Here
,a new signal selective DOA tracking
algorithm for wideband multiple moving
sources by exploiting the cyclostationarity
of the signals is proposed .In this algorithm ,
the signals emitted by moving sources can
be either narrowband or wideband
cyclostationary. Our algorithm assumes that
DOAs in each time frame are fixed and
tracks the DOA changes from frame to
frame by exploiting the difference of
averaged cyclic cross correlation of the
array output. DOA tracking is initiated by
applying once a wideband DOA estimation
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0112-9
method: averaged cyclic MUSIC (ACM)
[12]. Then, the DOA changes for each time
frame are estimated by finding the minimum
of an LS cost function related to the averaged
cyclic cross-correlation of the array output.
Similar to [12], averaging the cyclic
correlation enables wideband application. In
order to avoid inconsistent solutions for the
DOA changes when the DOAs are crossing,
the proposed cost function also includes a
regularization term that reflects the
assumption that the sources are moving at
constant speeds. Similar to [1]-[5], our
signal-selective DOA tracking algorithm
does not require data association. Also, the
incorporation of a Kalman filter into our
signal-selective DOA tracking algorithm is
presented. Via the Kalman filter, the
tracking performance of our algorithm is
improved. The effectiveness of the proposed
algorithm is demonstrated by simulations.
1. Cyclostationarity and Data Model
A. Cyclostationarity
Given a signal s(t), the cyclic correlation is
defined as [15],

r^α_ss(τ) = ‹s(t+τ/2) s*(t-τ/2) e^{-j2παt}›   (1)

where (·)* denotes the complex conjugate and
‹·› denotes the time average. s(t) is said to be
cyclostationary if r^α_ss(τ) is nonzero at some
delay τ and some cycle frequency α. Many
man-made communication signals exhibit
cyclostationarity due to modulation, periodic
gating, etc. They usually have cycle
frequencies at twice the carrier frequency, at
multiples of the baud rate, or at combinations
of these. For a given signal vector z(t), we
can calculate the cyclic correlation matrix as
[10],

R^α_zz(τ) = ‹z(t+τ/2) z^H(t-τ/2) e^{-j2παt}›   (2)

where (·)^H denotes the Hermitian transpose.
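As a concrete illustration of definition (1), the sketch below (our own illustrative Python, not from the paper; the symmetric shift is approximated by a one-sided lag, and all numeric values are assumed) estimates the cyclic correlation of a pure tone, which is cyclostationary with cycle frequency equal to twice its tone frequency: the estimate is large at α = 2f0 and near zero at an unrelated α.

```python
import numpy as np

def cyclic_correlation(s, alpha, tau, fs):
    """Discrete estimate of r_ss^alpha(tau) from Eq. (1); the symmetric shift
    is replaced by a one-sided lag of tau samples, with the time variable
    taken at the lag midpoint."""
    n = np.arange(len(s) - tau)
    prod = s[n + tau] * np.conj(s[n])
    t = (n + tau / 2) / fs                      # midpoint time in seconds
    return np.mean(prod * np.exp(-2j * np.pi * alpha * t))

fs, f0 = 1000.0, 50.0                           # assumed sample rate and tone frequency (Hz)
t = np.arange(0, 10, 1 / fs)
s = np.cos(2 * np.pi * f0 * t)                  # cyclostationary with cycle frequency 2*f0
on_cycle = abs(cyclic_correlation(s, alpha=2 * f0, tau=0, fs=fs))   # large (~0.25)
off_cycle = abs(cyclic_correlation(s, alpha=37.0, tau=0, fs=fs))    # near zero
```

A real cyclostationary SOI such as BPSK would show the same contrast at its cycle frequencies, which is what the signal-selective suppression of interference relies on.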
B. Data Model
Consider the tracking problem for a uniform
linear array with N identical elements. I
moving sources are assumed to generate I
signals with cycle frequency α impinging on
the array. These signals are considered the
signals of interest (SOI). Other signals from
other moving sources that do not exhibit
cyclostationarity or have different cycle
frequencies are considered interference.
Taking the first antenna as the reference, the
signal received by the nth antenna in the
array is

z_n(t) = Σ_{i=1}^{I} s_i(t + (n-1)Δ_i(t)) e^{j2πf_o(n-1)Δ_i(t)} + η_n(t)   (3)

where s_i(t) is the complex baseband signal
of the ith signal of interest (SOI) induced at
the first antenna, f_o is the carrier frequency,
and Δ_i(t) = d sinθ_i(t)/c is the time delay
between two adjacent antennas. Here, θ_i(t) is
the impinging direction of the ith SOI at
time t, d is the intersensor spacing of the
uniform linear array, and c is the propagation
speed. Note that η_n(t) has two components:
the interference and the noise induced at the
nth antenna. The interference and noise are
assumed to be cyclically uncorrelated with
the SOI. Therefore, η_n(t) is neglected.
Now, assume that the DOAs of the sources
change little during a time frame of length
T, i.e., θ_i(t) or Δ_i(t) is constant during the
kth time frame [(k-1)T, kT], where k = 1, ..., K.
The total tracking time is assumed to be KT
seconds. We have Δ_i(k) = d sinθ_i(k)/c for the
kth time frame. Our tracking algorithm will
deal with the data samples collected during a
time frame.
The cyclic cross-correlation of the source
signals estimated over the kth time frame is

r^α_{s_i s_j}(τ, k) = ∫_k s_i(t + τ/2) s_j*(t - τ/2) e^{-j2παt} dt   (4)
Now let us define the following vectors and
matrices:

s(t) = [s_1(t), ..., s_I(t)]^T   (5)

z(t) = [z_1(t), ..., z_N(t)]^T   (6)

A(f, k) = [a_1(f, k), ..., a_I(f, k)]   (7)

a_i(f, k) = [1, e^{j2πfΔ_i(k)}, ..., e^{j2πf(N-1)Δ_i(k)}]^T   (8)

where (·)^T denotes the matrix transpose, s(t) is
the source signal vector, z(t) is the received
signal vector, A(f, k) is the steering matrix
evaluated at frequency f for the kth time
frame, and a_i(f, k) is the steering vector for the
ith SOI evaluated at f for the kth time frame.
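The steering vector of Eq. (8) is straightforward to compute; the sketch below (our own illustrative code, with assumed numeric values) builds a_i(f, k) for a uniform linear array:

```python
import numpy as np

def steering_vector(f, theta, N, d, c=3e8):
    """a_i(f,k) of Eq. (8): element n carries phase 2*pi*f*n*Delta, with
    Delta = d*sin(theta)/c the delay between adjacent antennas."""
    delta = d * np.sin(theta) / c
    return np.exp(2j * np.pi * f * delta * np.arange(N))

# example values (assumed): 7 elements, 1.36 m spacing, 100 MHz, DOA 30 degrees
a = steering_vector(f=1e8, theta=np.deg2rad(30), N=7, d=1.36)
# the first (reference) element is 1 and every entry has unit magnitude
```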
2. LS Tracking Algorithm
We first evaluate the averaged cross-cyclic
correlations between the signals received
at the first antenna and the other antennas
during the kth time frame. These
correlations are then simplified as functions
of the signal directions in this time frame.
Based on these functions, an LS method for
tracking the directions of the sources is
discussed.
A. Averaged Cross-Cyclic Correlation
and Initial DOA Estimation
For the kth time frame, calculate the
cross-cyclic correlation of z_1(t) and z_n(t),
where z_n(t) is the signal received at the nth
antenna, for n = 2, ..., N. Using (3) and (4),
we can obtain N-1 cross-cyclic correlations
estimated at the kth time frame,
r^α_{z_1 z_n}(τ, k) = ∫_k z_1(t+τ/2) z_n*(t-τ/2) e^{-j2παt} dt
= Σ_{i=1}^{I} [Σ_{p=1}^{I} r^α_{s_p s_i}(τ - (n-1)Δ_i(k), k)] e^{-j2π(f_o - α/2)(n-1)Δ_i(k)}   (9)
Since the evaluation of the cyclic correlation
retains only the SOI, the interference and
noise are ignored in (9). To eliminate the
dependence of r^α_{s_p s_i}(τ, k) on Δ_i(k) or on τ,
we further evaluate r^α_{z_1 z_n}(τ, k) at different
time delays τ and average them to obtain an
averaged cross-cyclic correlation between
z_1(t) and z_n(t) at the kth time frame as

‹r^α_{z_1 z_n}(k)›_τ = Σ_{τ=τ_1}^{τ_2} r^α_{z_1 z_n}(τ, k)
= Σ_{i=1}^{I} [Σ_{p=1}^{I} ‹r^α_{s_p s_i}(k)›_τ] e^{-j2π(f_o - α/2)(n-1)Δ_i(k)}   (10)

‹r^α_{s_p s_i}(k)›_τ = Σ_{τ=τ_1}^{τ_2} r^α_{s_p s_i}(τ - (n-1)Δ_i(k), k)   (11)
Thus, for source signals s_i(t), i = 1, ..., I,
which normally have certain time-invariant
characteristics, if the duration of a time
frame is long enough, then ‹r^α_{s_p s_i}(k)›_τ
can be assumed to be independent of k. In
our simulations, a 0.5 s time frame, or 3200
snapshots of data samples, gives good
results. We drop k and define

E_i = Σ_{p=1}^{I} ‹r^α_{s_p s_i}›_τ   (12)

In addition, define

g_n(θ) = e^{-j2π(f_o - α/2)(n-1) d sinθ/c}   (13)

Then, (10) can be written as

‹r^α_{z_1 z_n}(k)›_τ = Σ_{i=1}^{I} E_i g_n(θ_i(k))   (14)
To derive our algorithm we need to know E_i.
First, we apply the signal-selective DOA
estimation algorithm ACM [12] to estimate
the initial DOAs. The number of sources
emitting SOI is assumed to be known or
estimated by the minimum description
length (MDL) criterion. ACM works for both
narrowband and wideband signals. A
summary of this algorithm is given below:
1. Estimate the cyclic correlation matrix
R^α_zz(τ, 1) during the first time frame.
2. Average R^α_zz(τ, 1) over τ to obtain
‹R^α_zz(1)›_τ.
3. Apply the singular value decomposition
(SVD) to ‹R^α_zz(1)›_τ to estimate all DOAs
of the SOI for the first time frame, i.e.,
θ_i(1), where i = 1, ..., I.
4. Obtain Δ_i(1) = d sinθ_i(1)/c. We have

‹R^α_zz(1)›_τ = A(f_o+α/2, 1) ‹R^α_ss(1)›_τ A^H(f_o-α/2, 1)   (15)

‹R^α_ss(1)›_τ = A^†(f_o+α/2, 1) ‹R^α_zz(1)›_τ [A^H(f_o-α/2, 1)]^†   (16)
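Steps 2-3 of the summary can be sketched as follows. For simplicity we build a rank-one "averaged" correlation matrix for a single source directly from its steering vector, evaluated at a single frequency (the full ACM method uses the two steering matrices A(f_o+α/2) and A(f_o-α/2)); the SVD then yields a noise subspace whose orthogonality to the steering vector locates the DOA. All numbers are assumed for illustration.

```python
import numpy as np

N, d, c, f = 7, 1.36, 3e8, 1.1e8                  # array size, spacing (m), speed, frequency (assumed)
theta_true = np.deg2rad(20.0)

def a(theta):                                      # steering vector, cf. Eq. (8)
    return np.exp(2j * np.pi * f * d * np.sin(theta) / c * np.arange(N))

R = np.outer(a(theta_true), a(theta_true).conj())  # rank-1 stand-in for the averaged matrix
U, s, Vh = np.linalg.svd(R)
En = U[:, 1:]                                      # noise subspace for I = 1 source

grid = np.deg2rad(np.linspace(-90, 90, 3601))      # 0.05-degree search grid
spectrum = [1.0 / (np.linalg.norm(En.conj().T @ a(th)) ** 2 + 1e-18) for th in grid]
theta_hat = float(np.rad2deg(grid[int(np.argmax(spectrum))]))
```

The pseudospectrum peaks where the steering vector is orthogonal to the noise subspace, recovering theta_hat near 20 degrees.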
B. Recursive Direction Updating
The tracking algorithm can be developed as
follows. Write the DOA of the ith source at
the kth time frame as

θ_i(k) = θ_i(k-1) + θ_i~(k)   (17)

where θ_i~(k) is the DOA change. A
first-order expansion of g_n about θ_i(k-1)
gives

g_n(θ_i(k)) ≈ g_n(θ_i(k-1)) + ∂g_n(θ)/∂θ|_{θ=θ_i(k-1)} θ_i~(k)   (18)

∂g_n(θ)/∂θ|_{θ=θ_i(k-1)} = [-j2π(f_o-α/2)(n-1) d cosθ_i(k-1)/c] g_n(θ_i(k-1))   (19)

Substituting into (14),

‹r^α_{z_1 z_n}(k)›_τ ≈ ‹r^α_{z_1 z_n}(k-1)›_τ + Σ_{i=1}^{I} c_{n,i}(k-1) θ_i~(k)   (20)

c_{n,i}(k-1) = E_i ∂g_n(θ)/∂θ|_{θ=θ_i(k-1)}   (21)

Define the difference

r_n(k) = ‹r^α_{z_1 z_n}(k)›_τ - ‹r^α_{z_1 z_n}(k-1)›_τ   (22)

From (20), r_n(k) can be written as

r_n(k) = [c_{n,1}(k-1), ..., c_{n,I}(k-1)] Θ~(k)   (23)

Θ~(k) = [θ_1~(k), ..., θ_I~(k)]^T   (24)

Stacking r_n(k) for n = 2, ..., N, we obtain

r(k) = [r_2(k), ..., r_N(k)]^T = C(k-1) Θ~(k)   (25)

The DOA changes Θ~(k) can be estimated by
solving the LS problem of

r(k) = C^(k-1) Θ~(k)   (26)

and the DOA estimates are then updated by

Θ^(k) = Θ^(k-1) + Θ~(k)   (27)

Under the assumption that the sources move
at nearly constant speeds, the DOA changes
vary little between adjacent frames, i.e.,

Θ~(k) ≈ Θ~(k-1)   (28)

Now, define a revised LS cost function

f(Θ~(k)) = [C^(k-1)Θ~(k) - r(k)]^H [C^(k-1)Θ~(k) - r(k)]
+ [Θ~(k) - Θ~(k-1)]^H Λ(k) [Θ~(k) - Θ~(k-1)]   (29)

whose minimizer is

Θ~(k) = [C^H(k-1)C^(k-1) + Λ(k)]^{-1} [C^H(k-1) r(k) + Λ(k) Θ~(k-1)]   (30)

The computational complexities of the steps
of the LS tracking algorithm are O(NI),
O(N·Ns·Na), O(I²N), and O(I).
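A single regularized LS step of Eq. (30) can be sketched as below (toy numbers, not from the paper): when the difference vector r(k) is consistent with the model of Eq. (25) and the previous change equals the true one, the update recovers the DOA changes exactly.

```python
import numpy as np

def ls_doa_step(C, r, theta_prev, lam):
    """Minimizer of the regularized cost (29), i.e. Eq. (30):
    theta = (C^H C + Lam)^{-1} (C^H r + Lam theta_prev), Lam = lam * I."""
    Lam = lam * np.eye(C.shape[1])
    lhs = C.conj().T @ C + Lam
    rhs = C.conj().T @ r + Lam @ theta_prev
    return np.linalg.solve(lhs, rhs)

C = np.array([[1.0, 0.2], [0.1, 0.9], [0.3, 0.4]])   # hypothetical c_{n,i}(k-1), N-1 = 3 rows
true_change = np.array([0.01, -0.02])                 # assumed DOA changes (radians)
r = C @ true_change                                   # difference vector of Eq. (25)
step = ls_doa_step(C, r, theta_prev=true_change, lam=0.5)
```

The regularization pulls the solution toward the previous frame's change, which is what resolves the ambiguity when two DOA tracks cross.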
3. Kalman Filter
In this section, we introduce a source
movement model and apply a Kalman filter
to track the DOAs. The DOAs estimated by
the LS method are viewed as measurements
of the DOAs in the Kalman filter model. The
current DOAs of the sources are first
predicted from the previous DOAs using the
source movement model. Then, the
predicted DOAs are refined by the Kalman
filter. Our simulations show that the Kalman
filter refinement further improves the DOA
tracking accuracy and reduces the burden of
selecting an optimum Λ(k) in (30).
Define the state of the ith (i = 1, ..., I) source
at the kth time frame as

x_i(k) = [θ_i(k), θ_i˙(k), θ_i˙˙(k)]^T   (31)

The state and measurement equations are

x_i(k) = F x_i(k-1) + w_i(k)   (32)

y_i(k) = H x_i(k) + v_i(k)   (33)

F = [1  T  T²/2]
    [0  1  T   ]
    [0  0  1   ]   (34)

E[w_i(j) w_i^H(k)] = Q_i(k) if j = k, and 0 if j ≠ k, for i = 1, ..., I   (35)

H = [1 0 0]   (36)

The state-prediction and measurement
residuals are

e_i(k) = x_i^(k|k) - F x_i^(k-1|k-1)   (37)

ε_i(k) = θ_i^(k) - H x_i^(k|k-1)   (38)

Since both the process noise and the
measurement noise are assumed to be zero
mean, their variances can be estimated by

Q_i^(k) = (1/L) Σ_{j=k-L+1}^{k} e_i(j) e_i^H(j)   (39)

σ²_{yi}(k) = (1/L) Σ_{j=k-L+1}^{k} ε_i(j) ε_i*(j)   (40)
The steps to estimate the DOAs for the kth
time frame are as follows:
1. Obtain the predicted state by
x_i^(k|k-1) = F x_i^(k-1|k-1).
2. Obtain θ_i^(k) by the LS tracking method,
using θ_i^(k-1|k-1) in place of θ_i^(k-1).
3. Obtain Q_i^(k-1) and σ²_{yi}(k) from (39)
and (40). Use Q_i^(k-1) as an approximation
of Q_i^(k).
4. Calculate P_i^(k|k-1) = F P_i^(k-1|k-1) F^H + Q_i^(k).
5. Calculate the Kalman filter gain
G(k) = P_i^(k|k-1) H^H / R(k), where
R(k) = H P_i^(k|k-1) H^H + σ²_{yi}(k).
6. Update the state for the kth time frame by
x_i^(k|k) = x_i^(k|k-1) + G(k)(θ_i^(k) - H x_i^(k|k-1)).
7. Take the first element of x_i^(k|k) as the
refined DOA estimate for the kth time
frame, θ_i^(k|k).
8. Prepare the next recursion by calculating
P_i^(k|k) = P_i^(k|k-1) - G(k) H P_i^(k|k-1).
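Steps 4-8 above, with F and H from Eqs. (34) and (36), can be sketched as one scalar-measurement Kalman recursion (illustrative code; the noise levels and state values are assumed, not taken from the paper):

```python
import numpy as np

def kalman_step(x, P, theta_meas, T, Q, r_meas):
    """One recursion of steps 1 and 4-8 for a single source: constant-acceleration
    state [theta, theta_dot, theta_ddot], scalar DOA measurement."""
    F = np.array([[1.0, T, T * T / 2], [0.0, 1.0, T], [0.0, 0.0, 1.0]])  # Eq. (34)
    H = np.array([[1.0, 0.0, 0.0]])                                      # Eq. (36)
    x_pred = F @ x                               # step 1: predicted state
    P_pred = F @ P @ F.T + Q                     # step 4: predicted covariance
    R = (H @ P_pred @ H.T).item() + r_meas       # step 5: innovation variance
    G = P_pred @ H.T / R                         # step 5: Kalman gain
    x_new = x_pred + (G * (theta_meas - (H @ x_pred).item())).ravel()    # step 6
    P_new = P_pred - G @ H @ P_pred              # step 8
    return x_new, P_new

x0 = np.array([0.0, 1.0, 0.0])                   # theta = 0 rad, rate 1 rad per frame unit
x1, P1 = kalman_step(x0, np.eye(3), theta_meas=0.6, T=0.5,
                     Q=1e-4 * np.eye(3), r_meas=0.01)
# the refined DOA x1[0] lies between the prediction (0.5) and the measurement (0.6)
```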
4. Simulations
Tracking performance versus SNR
In this simulation, three sources are assumed
to emit three wideband BPSK signals with
raised-cosine pulse shaping. Two of them
are SOI with the same baud rate of 20 MHz
and the same carrier frequency of 100 MHz.
The other is an interferer with a baud rate of
6 MHz and a carrier frequency of 80 MHz.
The cycle frequency of the SOI is 20 MHz,
which is assumed to be known. The two SOI
are coherent. A ULA with 7 antennas with
an equal spacing of c/(2f_o+α) = 1.36 m is
used. The subarray size is 6 for spatial
smoothing (SS) during initialization. The
duration of each time frame is 0.5 s, during
which 3200 snapshots of data samples are
obtained. The SNR of one SOI is 1 dB lower
than that of the other. The SNR of the
interference is 5 dB lower than that of the
higher-powered SOI. To see how the
performance of the LS method and the
Kalman filter method changes with SNR, we
vary the SNR of the higher-powered SOI
from -5 dB to 15 dB.
Generally, source crossing poses difficulty
for a tracking algorithm. The tracking
algorithm fails if the estimation error is so
large that the tracks of the two crossing
sources are switched and lost, as shown in
Fig. 1. We define the failure rate as the ratio
of the number of failed trials to the total
number of trials, which is 40 in our
simulation. Fig. 2 shows the failure rates of
the LS algorithm and the Kalman filter
algorithm with respect to SNR. We can see
that with the use of a Kalman filter, the
failure rate is lower than that of the LS
method, and at and above 5 dB SNR the
Kalman filter method does not fail at all.
In this simulation, we also plot the rms error
of the estimated DOAs in Fig. 3. Consider a
specific value of SNR; we can calculate the
mean squared error (mse) of the estimated
DOAs for each trial of the LS algorithm or
the Kalman filter algorithm. Then, the root
of the mean of the mse obtained over all 40
trials is what we call the rms error of the
estimated DOAs at this SNR. We should
note that if the algorithm fails to track the
sources in a trial, the mse for that trial will be
large, and it is excluded from the calculation
of the final rms error. By ignoring the failed
trials, the final rms error tends to be smaller
than the true value and does not reflect the
tracking failures, which are instead captured
by the failure rate. From Fig. 3 we see that
the Kalman filter method performs better
than the LS method.
Comparison of the estimated tracks with
the real tracks of the sources
In this simulation, we examine how well the
LS method and the Kalman filter method
track the targets. The signals and settings
are the same as in the first simulation,
except that the SNRs of both SOI are the
same and there is one more interferer, with a
baud rate of 6 MHz and a carrier frequency
of 100 MHz, whose SNR is also 5 dB lower
than that of the SOI.
We first assume that the SNR of the SOI is 5
dB and run both the LS method and the
Kalman filter method 40 times. We then
assume that the SNR of the SOI is 15 dB and
run these two tracking methods 40 times
again. We plot the ensemble averages of the
DOAs estimated by the LS method when the
SNR is 5 dB in Fig. 4. The three other plots,
for the mean of the DOAs estimated by the
LS method when the SNR is 15 dB and by
the Kalman filter method when the SNR is 5
dB and 15 dB, are similar and hence omitted.
The comparison of the rms errors of the
DOAs estimated by our two algorithms is
illustrated in Figs. 5 and 6 for one SOI. It
can be seen from these plots that both
methods track the DOAs of the SOI well,
with the Kalman filter method outperforming
the LS method in accuracy.
5. References
[1] C. R. Sastry and E. W. Kamen, "An efficient algorithm for tracking the angles of arrival of moving targets," IEEE Trans. Signal Processing, vol. 39, no. 1, pp. 242-246, Jan. 1991.
[2] C. K. Sword, M. Simaan, and E. W. Kamen, "Multiple target angle tracking using sensor array outputs," IEEE Trans. Aerospace Electronic Systems, vol. 26, no. 2, pp. 367-373, March 1990.
[3] S. B. Park, C. S. Ryu, and K. K. Lee, "Multiple target angle tracking algorithm using predicted angles," IEEE Trans. Aerospace Electronic Systems, vol. 30, no. 2, pp. 643-648, April 1994.
[4] C. R. Rao, C. R. Sastry, and B. Zhou, "Tracking the direction of arrival of multiple moving targets," IEEE Trans. Signal Processing, vol. 42, no. 5, pp. 1133-1144, May 1994.
[5] Y. Zhou, P. C. Yip, and H. Leung, "Tracking the direction of arrival of multiple moving targets by passive arrays: algorithms," IEEE Trans. Signal Processing, vol. 47, no. 10, pp. 2655-2666, Oct. 1999.
[6] M. Cho and J. Chun, "Updating the focusing matrix for direction of arrival estimation of moving sources," in Proc. Nat. Aerospace Electronics Conf., Oct. 2000, pp. 723-727.
[7] A. Sathish and R. L. Kashyap, "Wideband multiple target tracking," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, April 2004, vol. 4, pp. 517-520.
[8] P. J. Chung, J. F. Bohme, and A. O. Hero, "Tracking of multiple moving sources using recursive EM algorithms," EURASIP J. Applied Signal Processing, vol. 2005, no. 1, pp. 50-60, 2005.
[9] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Trans. Antennas Propagation, vol. AP-34, no. 3, pp. 276-280, March 1986.
[10] W. A. Gardner, "Simplification of MUSIC and ESPRIT by exploitation of cyclostationarity," Proc. IEEE, vol. 76, no. 7, pp. 845-847, July 1988.
Bartlett Windowed Fast Computation of Discrete Trigonometric Transforms for Real-Time Data Processing
Abhijit Khare, Shubham Varshney, Vikram Karwal
{khareabhijit14, shubham7502909dece}@gmail.com, [email protected]
Department of Electronics and Communication
Jaypee Institute of Information Technology, Noida, India
Abstract- Discrete trigonometric transforms (DTT),
namely the discrete cosine transform (DCT) and the
discrete sine transform (DST), are widely used
transforms in image compression applications.
Numerous fast algorithms for rapid processing of
real-time data exist in theory. Windowing is a
technique where a portion of the signal is extracted
and its transform is computed. These algorithms form
a class of fast update transforms that use less
computation than computing the transform using the
conventional definition. Different windows such as
rectangular, split-triangular, and sinusoidal windows
have been used in theory to sample the real-time
sequence and their performance compared. In this
research, fast update algorithms are analytically
derived that are capable of windowing the real-time
data in the presence of a Bartlett window. Initially,
simultaneous update algorithms are analytically
derived, and thereafter algorithms capable of
independently updating the DCT and DST are
derived, i.e., while computing the updated DCT
coefficients no DST coefficients are required, and
vice-versa. The analytically derived algorithms are
implemented in the C language to test their
correctness.
Keywords— Discrete trigonometric transform, window, fast update
I. INTRODUCTION
In the area of signal processing, transform
coding [8] provides an efficient way of
transmitting and storing data. The input data
sequence is divided into suitably sized blocks, and
thereafter reversible linear transforms are
performed. The transformed sequence has a much
lower degree of redundancy than the original
signal. The Karhunen-Loève Transform (KLT) [3]
has emerged as a benchmark for Markov-1 type
signals. The Discrete Cosine Transform (DCT)
[4,7] and the Discrete Sine Transform (DST)
perform quite closely to the ideal KLT and have
emerged as practical alternatives to it.
The DCT and DST have wide applications in
signal and image processing for the purposes of
pattern recognition, data compression,
communication, and several other areas [5]. Due
to their powerful bandwidth reduction capability,
the DCT and DST algorithms are widely used for
data compression. The DCT transforms a signal or
image from the spatial domain to the frequency
domain, where, as with the Discrete Fourier
Transform (DFT), much of the energy lies in the
lower-frequency coefficients. The main advantage
of the DCT over the DFT is that the DCT involves
only real multiplications. The DCT also does a
better job of concentrating energy into lower-order
coefficients than the DFT for image data. The
DCT is adopted as a standard technique for image
compression in the JPEG and MPEG standards
because of its energy compaction property.
A portion of the input signal is extracted using
windowing [6] and the transform of the windowed
contents is computed. These classes of algorithms
already exist in theory and are known as fast
update algorithms [2]. Different windows such as
rectangular, split-triangular, Hamming, Hanning,
and Blackman windows have been used earlier to
sample the real-time data and their performance
compared [6]. In this paper we develop update
algorithms in the presence of a Bartlett window.
Initially, the algorithms are derived for the
simultaneous update of DCT/DST coefficients,
i.e., we require both the DCT and the DST
coefficients to find the updated DCT/DST
coefficients. Thereafter, algorithms are derived
that establish independence [1] between the DCT
and DST coefficients. These algorithms lead to an
easier implementation of the update transform, as
we do not need to compute both sets of
coefficients simultaneously.
Section I gives the introduction to discrete
trigonometric transforms, windowed update
algorithms, and their advantages. Section II lists
the Bartlett window and DTT definitions;
simultaneous Bartlett windowed update algorithms
are also derived in Section II. In Section III,
independent update algorithms are derived.
Section IV includes the complexity calculations of
the derived algorithms, and Section V concludes
the paper.
II. DCT/DST TYPE-II WINDOWED SIMULTANEOUS UPDATE ALGORITHMS USING BARTLETT WINDOW
A. Basic algorithms for DCT and DST
The DCT of a signal f(x) of length N is defined
by

C(k) = √(2/N) P_k Σ_{x=0}^{N-1} f(x) cos[(2x+1)kπ/(2N)]   (1)

for k = 0, 1, ..., N-1, where

P_k = 1/√2 if k mod N = 0, and 1 otherwise.

The DST of the same signal can be written as:

S(k) = √(2/N) P_k Σ_{x=0}^{N-1} f(x) sin[(2x+1)kπ/(2N)]   (2)

for k = 1, 2, ..., N.
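Definitions (1) and (2) can be computed directly in O(N²); the sketch below is a straightforward transcription in Python rather than the C implementation the paper uses:

```python
import math

def dct2(f):
    """Direct DCT-II of Eq. (1), k = 0..N-1, with P_k = 1/sqrt(2) when k mod N = 0."""
    N = len(f)
    return [math.sqrt(2.0 / N) * (1 / math.sqrt(2) if k % N == 0 else 1.0) *
            sum(f[x] * math.cos((2 * x + 1) * k * math.pi / (2 * N)) for x in range(N))
            for k in range(N)]

def dst2(f):
    """Direct DST-II of Eq. (2), k = 1..N."""
    N = len(f)
    return [math.sqrt(2.0 / N) * (1 / math.sqrt(2) if k % N == 0 else 1.0) *
            sum(f[x] * math.sin((2 * x + 1) * k * math.pi / (2 * N)) for x in range(N))
            for k in range(1, N + 1)]

C = dct2([1.0] * 8)   # constant signal: all energy falls into C(0) = sqrt(8)
```

This normalization is orthonormal, so a constant input concentrates all its energy in the k = 0 coefficient.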
B. Simultaneous Update Algorithm
This section lists the update equations for the
Bartlett windowed update DCT/DST. The
windowed update algorithms are derived for
DCT/DST type-II [9], as it is the most often used
transform. For the input signal f(x), x = 0, 1, ..., N-1,
and the Bartlett window w(x) of length N with
tail-length N/2 given by equation (4), the windowed
data is given by

f_w(x) = f(x) w(x)   (3)

The Bartlett (or triangular) window of length N is
defined by:

w(x) = 2x/N for x = 0, 1, ..., N/2
w(x) = w(N-x) for x = N/2 + 1, ..., N-1   (4)
When the new data point f(N) is available, f(0)
is shifted out and the f(N) data point is shifted in.
The updated sequence is represented by f(x+1),
and the shifted windowed data is given by:

f_w(new)(x) = f(x+1) w(x)   (5)

which can be rewritten as:

f_w(new)(x) = f(x+1)[w(x+1) + w(x) - w(x+1)]

f_w(new)(x) = f(x+1) w(x+1) + f(x+1)[w(x) - w(x+1)]
Fig. 1 Bartlett Window w(x)
Defining m(x) = w(x) - w(x+1), the above
equation can be written as:

f_w(new)(x) = f(x+1) w(x+1) + f(x+1) m(x)

Therefore,

f_w(new)(x) = f_w(x+1) + f_m(new)(x)   (6)

for x = 0, ..., N-1, where

f_m(new)(x) = f(x+1) m(x)

and

f_w(x+1) = f(x+1) w(x+1)

f_m(new)(x) can be rewritten as:

f_m(new)(x) = f(x+1)[m(x+1) - m(x+1) + m(x)]

i.e.,

f_m(new)(x) = f(x+1) m(x+1) + f(x+1)[m(x) - m(x+1)]   (7)

Now,

m(x) - m(x+1) = -4/N if x = N/2 - 1; 4/N if x = N-1; 0 for all other x in 0, ..., N-1

m(x) - m(x+1) = -(4/N) δ_{x,N/2-1} + (4/N) δ_{x,N-1}   (8)

Substituting the value of m(x) - m(x+1) from
equation (8) into equation (7), we get

f_m(new)(x) = f_m(x+1) + f(x+1)[-(4/N) δ_{x,N/2-1} + (4/N) δ_{x,N-1}]

f_m(new)(x) = f_m(x+1) + (4/N)[-f(N/2) δ_{x,N/2-1} + f(N) δ_{x,N-1}]   (9)
The windowed update versions of f_w(x) and
f_m(x) for the moving DCT/DST with the Bartlett
window are represented by equations (6) and (9),
respectively. In equation (6), f_w(x+1) represents
the non-windowed update of f_w(x), and the second
term f_m(new)(x) is a correction factor that converts
this non-windowed update of f_w(x) into an update
in the presence of the window. Similarly, in
equation (9), f_m(x+1) represents the non-windowed
update of f_m(x) and the second term converts this
into the update in the presence of the window.
Taking the DCT-II of equations (6) and (9)
yields:

C_w(new)(x) = C_w(x+1) + C_m(new)(x)   (10)

C_m(new) = C_m(x+1) + √(2/N) P_k Σ_{x=0}^{N-1} (4/N)[-f(N/2) δ_{x,N/2-1} + f(N) δ_{x,N-1}] cos[(2x+1)kπ/(2N)]

Solving the above equation yields:

C_m(new) = C_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) cos((2(N/2-1)+1)kπ/(2N)) + f(N) cos((2(N-1)+1)kπ/(2N))]

C_m(new) = C_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) cos((N-1)kπ/(2N)) + f(N) cos((2(N-1)+1)kπ/(2N))]

Therefore,

C_m(new) = C_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) cos((N-1)kπ/(2N)) + f(N)(-1)^k cos(kπ/(2N))]   (11)

for k = 0, ..., N-1.

Equations (10) and (11) can be used to
calculate the simultaneous update of the moving
DCT for the Bartlett window. C_w(x+1) is the
non-windowed DCT update of f_w(x), calculated
using the DCT simultaneous update equation for
the rectangular window, which is listed below [2],
and C_m(x+1) is the non-windowed DCT update of
f_m(x), calculated using the same equation. Clearly,
it can be seen that while performing the windowed
DCT update, both the DCT and DST coefficients
are required.

C+(k) = cos(rkπ/N) C(k) + sin(rkπ/N) S(k)
+ √(2/N) P_k Σ_{x=0}^{r-1} [(-1)^k f(N+r-1-x) - f(r-1-x)] cos[(2x+1)kπ/(2N)]

for k = 0, ..., N-1, where C+(k) represents the
updated DCT coefficients.
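The correction term in Eq. (11) can be verified numerically on toy data: the DCT-II of f(x+1)m(x) must equal the DCT-II of f(x+1)m(x+1) plus the two-sample correction, where the window is extended periodically so that m(N) is defined. All data values below are arbitrary.

```python
import math

N = 8
f = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0]       # f(0..N); f(N) = 5.0 is the new sample

def w(x):                                                 # Bartlett window of Eq. (4), periodic
    x %= N
    return 2.0 * x / N if x <= N // 2 else 2.0 * (N - x) / N

def m(x):                                                 # m(x) = w(x) - w(x+1)
    return w(x) - w(x + 1)

def dct2(g):                                              # direct DCT-II of Eq. (1)
    n = len(g)
    return [math.sqrt(2.0 / n) * (1 / math.sqrt(2) if k == 0 else 1.0) *
            sum(g[x] * math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n))
            for k in range(n)]

Cm_new = dct2([f[x + 1] * m(x) for x in range(N)])        # DCT of f(x+1) m(x)
Cm_shift = dct2([f[x + 1] * m(x + 1) for x in range(N)])  # non-windowed update C_m(x+1)
ok = True
for k in range(N):
    Pk = 1 / math.sqrt(2) if k == 0 else 1.0
    corr = math.sqrt(2.0 / N) * Pk * (4.0 / N) * (
        -f[N // 2] * math.cos((N - 1) * k * math.pi / (2 * N))
        + f[N] * (-1) ** k * math.cos(k * math.pi / (2 * N)))
    ok = ok and abs(Cm_new[k] - (Cm_shift[k] + corr)) < 1e-9
```

Equality holds for every k, confirming that the windowed update needs only the shifted coefficients plus a two-term correction.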
Similarly, the DST update equations may be
derived and are:

S_w(new)(x) = S_w(x+1) + S_m(new)(x)   (12)

S_m(new) = S_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) sin((N-1)kπ/(2N)) - f(N)(-1)^k sin(kπ/(2N))]   (13)

for k = 1, ..., N.

Equations (12) and (13) can be used to
calculate the simultaneous update of the moving
DST for the Bartlett window. S_w(x+1) is the
non-windowed DST update of f_w(x), calculated
using the DST update equation for the rectangular
window, which is listed below [2], and S_m(x+1) is
the non-windowed DST update of f_m(x), calculated
using the same equation. Clearly, it can be seen
that while performing the windowed DST update,
both the DST and DCT coefficients are required.

S+(k) = cos(rkπ/N) S(k) - sin(rkπ/N) C(k)
+ √(2/N) P_k Σ_{x=0}^{r-1} [(-1)^k f(N+r-1-x) - f(r-1-x)] sin[(2x+1)kπ/(2N)]

where S+(k) represents the updated DST
coefficients.
III. DCT/DST TYPE-II WINDOWED INDEPENDENT UPDATE ALGORITHMS USING BARTLETT WINDOW
A. Independent Update Algorithm
The above-mentioned equations (10) and (11)
can be used to calculate the independent update of
the moving DCT-II for the Bartlett window.
C_w(x+1) is the non-windowed DCT-II update of
f_w(x), calculated using the DCT independent
update equation for the rectangular window, which
is listed below [2], and C_m(x+1) is the
non-windowed DCT-II update of f_m(x), also
calculated using the same equation.
C_w(n+r, k) = 2cos(rkπ/N) C(n, k) - C(n-r, k)
+ √(2/N) P_k sin(rkπ/N) Σ_{x=0}^{r-1} [f(n-N-x-1) - (-1)^k f(n-x-1)] sin[(2x+1)kπ/(2N)]
+ √(2/N) P_k sin(rkπ/N) Σ_{x=0}^{r-1} [(-1)^k f(n+r-x-1) - f(n+r-N-x-1)] cos[(2x+1)kπ/(2N)]
- √(2/N) P_k cos(rkπ/N) Σ_{x=0}^{r-1} [(-1)^k f(n-x-1) - f(n-N-x-1)] cos[(2x+1)kπ/(2N)]

for k = 0, 1, ..., N-1.
When using the above equation to calculate
the non-windowed update, we need the current
value C(n,k) and the previous value C(n-1,k). The
current and previous values in the case of C_w are
C[f(x)w(x)] and C[f(x-1)w(x-1)], respectively.
Since the value of C[f(x-1)w(x-1)] is not yet
available, we need to derive it from C[f(x-1)w(x)],
which is available from the previous step.
Similarly, for C_m, we need to calculate the
correction factor to compute C[f(x-1)m(x-1)]
from C[f(x-1)m(x)].
Similarly, the analogous formulae for the
DST-II are obtained by taking the DST-II of
equations (6) and (9):

S_w(new)(x) = S_w(x+1) + S_m(new)(x)   (14)

S_m(new) = S_m(x+1) + √(2/N) P_k (4/N)[-f(N/2) sin((N-1)kπ/(2N)) - f(N)(-1)^k sin(kπ/(2N))]   (15)

for k = 1, ..., N.

Equations (14) and (15) can be used to
calculate the independent update of the moving
DST-II for the Bartlett window. S_w(x+1) is the
non-windowed DST-II update of f_w(x), calculated
using the DST independent update equation for the
rectangular window, which is listed below [2], and
S_m(x+1) is the non-windowed DST-II update of
f_m(x), also calculated using the same equation.
S_w(n+r, k) = 2cos(rkπ/N) S(n, k) - S(n-r, k)
+ √(2/N) P_k sin(rkπ/N) Σ_{x=0}^{r-1} [f(n-N-x-1) - (-1)^k f(n-x-1)] cos[(2x+1)kπ/(2N)]
+ √(2/N) P_k sin(rkπ/N) Σ_{x=0}^{r-1} [(-1)^k f(n+r-x-1) - f(n+r-N-x-1)] sin[(2x+1)kπ/(2N)]
- √(2/N) P_k cos(rkπ/N) Σ_{x=0}^{r-1} [f(n-N-x-1) - (-1)^k f(n-x-1)] sin[(2x+1)kπ/(2N)]

for k = 1, ..., N.
When using the above equation to calculate
the non-windowed update, we need the current
value S(n,k) and the previous value S(n-1,k). The
current and previous values in the case of S_w are
S[f(x)w(x)] and S[f(x-1)w(x-1)], respectively.
Since the value of S[f(x-1)w(x-1)] is not yet
available, we need to derive it from S[f(x-1)w(x)],
which is available from the previous step.
Similarly, for S_m, we need to calculate the
correction factor to compute S[f(x-1)m(x-1)]
from S[f(x-1)m(x)].
B. Computation for the oldest time-step
The correction factor to calculate the correct
value C[f(x-1)w(x-1)] from C[f(x-1)w(x)] for the
DCT update algorithm, and the correct value
S[f(x-1)w(x-1)] from S[f(x-1)w(x)] for the DST-II
update algorithm, are derived here.

f(x-1)w(x) = f(x-1)[w(x) + w(x-1) - w(x-1)]
= f(x-1)w(x-1) - f(x-1)[w(x-1) - w(x)]
= f(x-1)w(x-1) - f(x-1)m(x-1)

Therefore,

f(x-1)w(x-1) = f(x-1)w(x) + f(x-1)m(x-1)   (16)

Calculating the correction factor to convert
f(x-1)m(x) into the correct value f(x-1)m(x-1):

f(x-1)m(x) = f(x-1)[m(x) + m(x-1) - m(x-1)]
= f(x-1)m(x-1) - f(x-1)[m(x-1) - m(x)]
= f(x-1)m(x-1) - f(x-1)m_p(x-1)

Therefore,

f(x-1)m(x-1) = f(x-1)m(x) + f(x-1)m_p(x-1)   (17)

where m_p(x-1) = m(x-1) - m(x), with

m(x-1) - m(x) = -4/N if x = N/2; 4/N if x = 0; 0 for all other x in 0, ..., N-1

so that

f(x-1)m(x-1) = f(x-1)m(x) + f(x-1)[-(4/N) δ_{x,N/2} + (4/N) δ_{x,0}]

i.e.,

f_m(x-1) = f(x-1)m(x) + (4/N)[-f(N/2 - 1) δ_{x,N/2} + f(-1) δ_{x,0}]   (18)
Taking the DCT-II of equation (18):

C_m_old(k) = C[f(x-1)m(x-1)] = C[f(x-1)m(x)]
+ √(2/N) P_k Σ_{x=0}^{N-1} (4/N)[-f(N/2 - 1) δ_{x,N/2} + f(-1) δ_{x,0}] cos[(2x+1)kπ/(2N)]

for k = 0, 1, ..., N-1. Therefore,

C_m_old(k) = C[f(x-1)m(x-1)] = C[f(x-1)m(x)]
+ √(2/N) P_k (4/N)[-f(N/2 - 1) cos((N+1)kπ/(2N)) + f(-1) cos(kπ/(2N))]   (19)

Taking the DCT-II of equation (16):

C[f(x-1)w(x-1)] = C[f(x-1)w(x)] + C[f(x-1)m(x-1)]   (20)

Equations (19) and (20) together can be used to
calculate the windowed DCT-II values for the older
time sequence.
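The correction in Eq. (19) can likewise be checked numerically on toy data: the DCT-II of f(x-1)m(x-1) must equal the DCT-II of f(x-1)m(x) plus the two-sample correction at x = 0 and x = N/2, with the window extended periodically so that f(-1) and m(-1) are defined. Data values are arbitrary.

```python
import math

N = 8
f = {x: float((3 * x + 5) % 11) for x in range(-1, N)}    # f(-1..N-1); f(-1) is the oldest sample

def w(x):                                                  # Bartlett window of Eq. (4), periodic
    x %= N
    return 2.0 * x / N if x <= N // 2 else 2.0 * (N - x) / N

def m(x):                                                  # m(x) = w(x) - w(x+1)
    return w(x) - w(x + 1)

def dct2(g):                                               # direct DCT-II of Eq. (1)
    n = len(g)
    return [math.sqrt(2.0 / n) * (1 / math.sqrt(2) if k == 0 else 1.0) *
            sum(g[x] * math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n))
            for k in range(n)]

old = dct2([f[x - 1] * m(x - 1) for x in range(N)])        # C[f(x-1) m(x-1)]
cur = dct2([f[x - 1] * m(x) for x in range(N)])            # C[f(x-1) m(x)]
ok = True
for k in range(N):
    Pk = 1 / math.sqrt(2) if k == 0 else 1.0
    corr = math.sqrt(2.0 / N) * Pk * (4.0 / N) * (
        -f[N // 2 - 1] * math.cos((N + 1) * k * math.pi / (2 * N))
        + f[-1] * math.cos(k * math.pi / (2 * N)))
    ok = ok and abs(old[k] - (cur[k] + corr)) < 1e-9
```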
Taking the DST-II of equation (18):

S_m_old(k) = S[f(x-1)m(x-1)] = S[f(x-1)m(x)]
+ √(2/N) P_k Σ_{x=0}^{N-1} (4/N)[-f(N/2 - 1) δ_{x,N/2} + f(-1) δ_{x,0}] sin[(2x+1)kπ/(2N)]

Therefore,

S_m_old(k) = S[f(x-1)m(x-1)] = S[f(x-1)m(x)]
+ √(2/N) P_k (4/N)[-f(N/2 - 1) sin((N+1)kπ/(2N)) + f(-1) sin(kπ/(2N))]   (21)

Taking the DST-II of equation (16):

S[f(x-1)w(x-1)] = S[f(x-1)w(x)] + S[f(x-1)m(x-1)]   (22)

Equations (21) and (22) together can be used
to calculate the windowed DST-II values for the
older time sequence.
IV. COMPUTATIONAL COMPLEXITY
The algorithms developed are of computational
order N, whereas calculating the transform via fast
DCT/DST algorithms is of order N log₂ N.
V. CONCLUSION
New fast, efficient algorithms that are capable
of updating the Bartlett windowed DCT and DST
for a real-time input data sequence are listed. The
windowed update algorithms aim at reducing the
complexity of recalculating the DCT every time a
new value is introduced into the input. Initially,
simultaneous Bartlett windowed update algorithms
for DCT/DST-II are developed, and thereafter
independence is established between the updates
of the DCT and DST. The analytically derived
algorithms are verified using the C language.
REFERENCES
[1] V. Karwal, B. G. Sherlock, and Y. P. Kakad, "Windowed DST-independent discrete cosine transform for shifting data," Proceedings of the 20th International Conference on Systems Engineering, Coventry, U.K., Sept. 2009, pp. 252-257.
[2] Vikram Karwal, "Discrete cosine transform-only and discrete sine transform-only windowed update algorithms for shifting data with hardware implementation," Ph.D. Dissertation, University of North Carolina at Charlotte, 2009, ISBN: 9781109343267.
[3] W. D. Ray and R. M. Driver, "Further decomposition of the Karhunen-Loève series representation of a stationary random process," IEEE Trans., 1970, IT-16, pp. 12-13.
[4] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Trans. Comput., vol. C-23, pp. 90-94, Jan. 1974.
[5] W. K. Pratt, "Generalized Wiener filtering computation techniques," IEEE Trans. Comput., vol. C-21, pp. 636-641, July 1972.
[6] Fredric J. Harris, "On the use of windows for harmonic analysis with the discrete Fourier transform," Proceedings of the IEEE, vol. 66, no. 1, January 1978.
[7] P. Yip and K. R. Rao, "On the shift properties of DCT's and DST's," IEEE Trans. Signal Processing, vol. 35, pp. 404-406, March 1987.
[8] B. G. Sherlock and Y. P. Kakad, "Transform domain technique for windowing the DCT and DST," Journal of the Franklin Institute, vol. 339, issue 1, pp. 111-120, April 2002.
[9] Jiantao Xi and J. F. Chicharo, "Computing running DCT's and DST's based on their second-order shift properties," IEEE Trans. Circuits and Systems-I, vol. 47, no. 5, 2000, pp. 779-783.
[10] B. G. Sherlock and Y. P. Kakad, "Windowed discrete cosine and sine transforms for shifting data," Signal Processing, Elsevier, vol. 81, pp. 1465-1478.
[11] B. G. Sherlock, Y. P. Kakad, and A. Shukla, "Rapid update of odd DCT and DST for real-time signal processing," Proc. of SPIE, vol. 5809, pp. 464-471, Orlando, Florida, March 2005.
Lossless compression scheme based on Bayer color filter array prediction

Anita U. Patil (1), Dr. Sudhirkumar D. Sawarkar (2), Nareshkumar Harale (3)
(1) Navi Mumbai (Thane), Maharashtra  (2) Gurgaon  (3) MGM's College of Engineering and Technology, Kamothe, Navi Mumbai, [email protected]

Abstract— In most digital cameras, Bayer color filter array images are captured and demosaicing is generally carried out before compression. Recently it was shown that a compression-first scheme outperforms the conventional demosaicing-first schemes in terms of output image quality. An efficient prediction-based lossless compression scheme for Bayer color filter array images is proposed.

Index Terms— Bayer color filter array, lossless compression, green prediction, non-green prediction, adaptive color difference.

I. INTRODUCTION

BAYER COLOR FILTER ARRAY

A Bayer color filter array is usually coated over the sensors in these cameras to record only one of the three color components at each pixel location. The resultant image is referred to as a CFA image.

Fig 1: Bayer pattern, with a red sample in the center.

The figure shows the Bayer pattern, with a red sample in the center; the CFA image is compressed for storage. Demosaicing before compression is inefficient in the sense that the demosaicing process always introduces some redundancy which must eventually be removed in the following compression step. If we do the compression before demosaicing, digital cameras can have a simpler design and lower power consumption, since a computationally heavy process like demosaicing can be carried out on an offline, powerful personal computer. This motivates the demand for CFA image compression schemes.

Fig 2: Single-sensor camera imaging chain: (a) demosaicing first, (b) compression first.

II. PRESENT SCHEMES USED

There are different schemes present in the market, such as:
• Lossy compression schemes
• JPEG2000

So now we have to look at the drawbacks of the present methods:
• Lossy schemes compress a CFA image by discarding its visually redundant information.
• Such a scheme yields a higher compression ratio as compared with the lossless schemes.
• JPEG-2000 can be used to encode a CFA image, but only a fair performance can be attained.
• JPEG-2000 is a very expensive method to compress the images.

III. PROPOSED SCHEME

A prediction-based lossless CFA compression scheme is proposed. It divides a CFA image into two sub-images:
(a) a green sub-image, which contains all green samples of the CFA image;
(b) a non-green sub-image, which contains the red and blue samples of the CFA image.

This system mainly consists of two parts:
• Encoder
• Decoder
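The sub-image split just described can be sketched in a few lines. This is only an illustration: it assumes an RGGB Bayer layout with green samples at the positions where row + column is odd, a convention not fixed by the text above.

```python
import numpy as np

def split_cfa(cfa):
    """Split a Bayer CFA image into green and non-green sub-images,
    assuming an RGGB layout (green where row + column is odd)."""
    h, w = cfa.shape
    rows, cols = np.indices((h, w))
    green_mask = (rows + cols) % 2 == 1
    # every row of the mosaic holds w/2 green and w/2 non-green samples
    green = cfa[green_mask].reshape(h, w // 2)
    non_green = cfa[~green_mask].reshape(h, w // 2)
    return green, non_green

cfa = np.arange(16, dtype=np.uint8).reshape(4, 4)   # toy 4x4 mosaic
g_sub, ng_sub = split_cfa(cfa)
```

Each sub-image has half the pixels of the mosaic, so the encoder can process the two halves with different predictors, as the scheme requires.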
Fig 3: Structure of the proposed scheme (encoder).

The green sub-image is coded first, and the non-green sub-image follows with the green sub-image as a reference. To reduce the spectral redundancy, the non-green sub-image is processed in the color difference domain, whereas the green sub-image is processed in the intensity domain as a reference for the color difference content of the non-green sub-image. Both sub-images are processed in raster scan sequence with a context-matching-based prediction technique to remove the spatial dependency. The prediction residues of the two sub-image planes are then entropy encoded sequentially with our proposed realization scheme of the adaptive Rice code.

IV. WORKING OF THE SCHEME

The proposed scheme mainly works through prediction on the green plane and prediction on the non-green plane.

Prediction on the green plane

The green plane is raster scanned during the prediction and all prediction errors are recorded. When processing a particular green sample, the four nearest processed neighboring samples of g(i,j) form a candidate set; to find the directions associated with the green pixels we need some processing.

Fig 4: Four possible directions associated with a green pixel.

Let g(mk, nk) ∈ Φg(i,j) for k = 1, 2, 3, 4 be the four ranked candidates of sample g(i,j), such that D(Sg(i,j), Sg(mu,nu)) <= D(Sg(i,j), Sg(mv,nv)) for 1 <= u <= v <= 4.

If the direction of g(i,j) is identical to the directions of all green samples in Sg(i,j), pixel (i,j) is considered to lie in a homogeneous region and the prediction of g(i,j) uses {w1, w2, w3, w4} = {1, 0, 0, 0}; else g(i,j) lies in a heterogeneous region and the predicted value of g(i,j) uses {w1, w2, w3, w4} = {5/8, 2/8, 1/8, 0}.

FLOW CHART FOR PREDICTION ON THE GREEN PLANE

[Flow chart figure not recoverable from this copy.]

Adaptive color difference estimation for the non-green plane

When compressing the non-green color plane, color difference information is exploited to remove the color spectral dependency. Let c(m,n) be the intensity value at a non-green sampling position (m,n). The Green-Red (Green-Blue) color difference of pixel (m,n) is

d(m,n) = g'(m,n) - c(m,n),

where g'(m,n) is the estimated green component intensity value, obtained from

GH = (g(m,n-1) + g(m,n+1))/2 and GV = (g(m-1,n) + g(m+1,n))/2.

Prediction on the non-green plane

The color difference prediction of a non-green sample c(i,j) with color difference value d(i,j) uses {w1, w2, w3, w4} = {4/8, 2/8, 1/8, 1/8}, where d(mk,nk) is the k-th ranked candidate in Φc(i,j) and wk is the corresponding predictor coefficient.
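The weighted candidate averaging described above, for both planes, can be sketched as follows. The context-matching step that ranks the four candidates and decides whether the region is homogeneous is abstracted into the function's inputs, so only the weighting itself is shown; only the weight sets come from the text.

```python
# Weight sets from the scheme: green plane (homogeneous / heterogeneous)
# and non-green plane.
GREEN_HOMOGENEOUS = (1, 0, 0, 0)
GREEN_HETEROGENEOUS = (5/8, 2/8, 1/8, 0)
NON_GREEN = (4/8, 2/8, 1/8, 1/8)

def weighted_prediction(ranked_candidates, weights):
    """Predict a sample as a weighted sum of its four ranked
    candidates (best context match first)."""
    return sum(w * c for w, c in zip(weights, ranked_candidates))

weighted_prediction([100, 96, 90, 80], GREEN_HOMOGENEOUS)    # -> 100
weighted_prediction([100, 96, 90, 80], GREEN_HETEROGENEOUS)  # -> 97.75
```

In a homogeneous region the prediction collapses to the best-matching neighbor; otherwise the lower-ranked neighbors contribute with decreasing weight.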
Compression scheme

The prediction error e(i,j) of pixel (i,j) in the CFA image is the difference between the actual value and its prediction, where g(i,j) and d(i,j) are respectively the real green sample value and the color difference value of pixel (i,j). The error residue e(i,j) is then mapped to a nonnegative integer E(i,j) so as to reshape its value distribution from a Laplacian to an exponential one.

The E(i,j)'s from the green sub-image are raster scanned and coded with the Rice code first. The Rice code is employed to code E(i,j) because of its simplicity and high efficiency in handling exponentially distributed sources. When the Rice code is used, each mapped residue E(i,j) is split into a quotient Q and a remainder, where parameter k is a nonnegative integer; the quotient and remainder are then saved for storage and transmission. The length of the code word used to represent E(i,j) is k-dependent, so parameter k is critical to the compression performance, as it determines the code length of E(i,j). The optimal parameter k is derived from the source statistics (the derivation involves the golden ratio φ): for a geometric source with distribution parameter ρ, as long as the mean μ is known, the optimal coding parameter k for the whole source can be determined easily. μ is estimated adaptively in the course of encoding, with separate estimates maintained while coding E(i,j) of the green plane and of the non-green plane. (The original formulas for the quotient, remainder, code length and optimal k are not recoverable from this copy.)

Decoding process

The decoding process is just the reverse of encoding. The green sub-image is decoded first, and then the non-green sub-image is decoded with the decoded green sub-image as a reference. The original CFA image is then reconstructed by combining the two sub-images.

Fig 5: Structure of the decoder.

BITRATE ANALYSIS

From the figure above, α = 1 can provide a good compression performance. We assume the prediction residue is a local variable and estimate the mean of its value distribution adaptively; the divisor used to generate the Rice code is then adjusted accordingly so as to improve the efficiency of the Rice code.

V. COMPRESSION PERFORMANCE

Simulations were carried out to evaluate the performance of the proposed compression scheme. 24-bit color images of size 512*768 were sub-sampled according to the Bayer pattern to form 8-bit testing CFA images. These images were directly coded by the proposed compression scheme for evaluation. Some representative lossless compression schemes, such as JPEG-LS, JPEG 2000 (lossless mode) and LCMI, were used for comparison of results.

VI. EXPERIMENTAL RESULTS

Table I: Comparison of compression results

S No.   | JPEG-LS | JPEG 2000 | Proposed
Image 1 |  5.467  |   5.039   |  4.803
Image 2 |  6.188  |   5.218   |  4.847
Image 3 |  6.828  |   4.525   |  3.847

If we alter the values of the weighting factors, we get improved results in terms of compression ratio and also reduce the bit rates of the CFA.

Table II

        | Overall CFA bit rate (bpp) | Compression ratio
α = 0   |          4.9496            |      1.6163
α = 0.6 |          4.8486            |      1.6496
α = 0.8 |          4.8437            |      1.6516
α = 1   |          4.8366            |      1.6537

ADVANTAGES OF THE PROPOSED METHOD

We can reduce the spectral redundancy and at the same time obtain a high quality image. The number of sensors in a digital camera is reduced from 3 to 1. Design complexity is low. Compared with JPEG2000, it gives better performance.
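The residue mapping and Rice coding step described in Section IV can be sketched as below. The folding used (0, -1, 1, -2, ... → 0, 1, 2, 3, ...) is the standard Laplacian-to-exponential mapping and is an assumption here, since the paper's exact formula is lost in this copy; the adaptive estimation of k is omitted.

```python
def fold(e):
    """Map a signed prediction residue to a nonnegative integer."""
    return 2 * e if e >= 0 else -2 * e - 1

def rice_encode(E, k):
    """Rice code of E with parameter k: the quotient E >> k in unary
    (terminated by '0'), followed by the k-bit remainder."""
    q, r = E >> k, E & ((1 << k) - 1)
    bits = "1" * q + "0"                      # unary-coded quotient
    if k:
        bits += format(r, "0{}b".format(k))   # k remainder bits
    return bits

rice_encode(fold(-3), 2)   # fold(-3) = 5 -> '1001'
```

Small residues, which dominate after good prediction, get short code words; the parameter k trades unary length against remainder length.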
VII. CONCLUSION
The CFA image is encoded by coding its sub-images separately with predictive coding. Lossless prediction is carried out in the intensity domain for the green sub-image, while it is carried out in the color difference domain for the non-green sub-image.
VIII.ACKNOWLEDGMENT
The first author expresses his gratitude to the remaining two authors for their contribution towards the completion of this project.
IX REFERENCES
[1] S. Banks, Signal Processing, Image Processing and Pattern Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1990.
[2] S. P. Lloyd, “Least squares quantization in PCM,” IEEE Trans. Inf. Theory, vol. IT-28, no. 2, pp. 129–136, Mar. 1982.
[3] P. Berkhin, “Survey of clustering data mining techniques,” Accrue Software, San Jose, CA, 2002.
[4] J. Besag, “On the statistical analysis of dirty pictures,” J. Roy. Statist. Soc. B, vol. 48, pp. 259–302, 1986.
[5] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, May 2002.
[6] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, Aug. 2000.
[7] P. Felzenszwalb and D. Huttenlocher, “Efficient graph-based image segmentation,” Int. J. Comput. Vis., vol. 59, pp. 167–181, 2004.
[8] S. Zhu and A. Yuille, “Region competition: Unifying snakes, region growing, and Bayes/MDL for multiband image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 18, no. 9, pp. 884–900, Sep.1996.
[9] M. Mignotte, C. Collet, P. Pérez, and P. Bouthemy, “Sonar image segmentation using a hierarchical MRF model,” IEEE Trans. Image Process., vol. 9, no. 7, pp. 1216–1231, Jul. 2000.
[10] M. Mignotte, C. Collet, P. Pérez, and P. Bouthemy, “Three-class Markovian segmentation of high resolution sonar images,” Comput. Vis. Image Understand., vol. 76, no. 3, pp. 191–204, 1999.
[11] F. Destrempes, J.-F. Angers, and M. Mignotte, “Fusion of hidden Markov random field models and its Bayesian estimation,” IEEE Trans. Image Process., vol. 15, no. 10, pp. 2920–2935, Oct. 2006.
[12] Z. Kato, T. C. Pong, and G. Q. Song, “Unsupervised segmentation of color textured images using a multi-layer MRF model,” in Proc. Int. Conf. Image Processing, Barcelona, Spain, Sep. 2003, pp. 961–964.
[13] P. Pérez, C. Hue, J. Vermaak, and M. Gangnet, “Color-based probabilistic tracking,” in Proc. Eur. Conf. Computer Vision, Copenhagen, Denmark, Jun. 2002, pp. 661–675.
[14] J. B. Martinkauppi, M. N. Soriano, and M. H. Laaksonen, “Behavior of skin color under varying illumination seen by different cameras at different color spaces,” in Proc. SPIE, Machine Vision Applications in
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM
(SPRTOS)” MARCH 26-27 2011
SIP0201-1
OPTIMAL RECEIVER FILTER DESIGN
Vivek Kumar, Deptt. of Electronics Engg., IITM Kanpur, 5/414, Avas Vikas, Farrukhabad; [email protected]
Dr. K. Raj, Deptt. of Electronics Engg., Harcourt Butler Technological Institute, Kanpur – 208002, India; [email protected]
Abstract
In wireless communication systems, pulse shaping filters are often used to represent message symbols for transmission through the channel, with a matched filter at the receiver end. This paper deals with the design and comparison of optimal receiver filters that maximize the signal to interference plus noise ratio of the received signal. The first approach is based on optimizing an optimal matched filter criterion, and the second approach is based on optimizing an MMSE criterion, which provides a closed form analytic solution for the filter coefficients. In 3G and beyond-3G systems, a higher SIR of the received signal is required so that higher order modulation schemes can be applied to achieve high data transmission throughput, together with short tap length receiver filters in order to reduce the power consumption at the mobile units. Simulations demonstrated that a receiver filter designed using the MMSE criterion can significantly improve the system performance by reducing inter symbol interference in comparison to the optimal matched filter.
Key words: MMSE, 3G, AWGN,
QAM, PSK, SIR, SINR
1. Introduction
The fundamental operation of wireless communication systems is to encode, modulate, upsample and then transmit digital information symbols in the form of an analog waveform through the wireless channel. These analog waveforms are the output of the transmit filters, which include the pulse shaping filter, phase equalizers and R.F. filters.
On the receiver side, the received waveforms are filtered by a receiver filter which is normally matched to
the transmitter filter. The output of the receiver filter is downsampled, then sent to the demodulator and decoded to recover the transmitted information. Figure 1 shows a simple communication link system.
[Figure: s(n) → 4× upsampler → transmit filter g(i) / Tx RF → channel → Rx RF → receiver filter f(i) → 4× downsampler → ŝ(n)]
Figure 1. A communication
link system
Nyquist filters are commonly used in data transmission systems for pulse shaping. They have the property that their impulse responses have zero-crossings uniformly spaced in time. If the channel is an AWGN channel, the Nyquist filter is an ideal pulse shaping filter, although it has an infinitely long impulse response. The most well known approach is to use the root Nyquist filter and its matched filter, which introduces no inter symbol interference. A practical approximation of the Nyquist filter is the raised cosine filter. Several
methods [5,11,7,17,4] were proposed to design general FIR transmitter pulse shaping filters and their matched filters that have the orthogonality property of the root Nyquist filter. These methods cannot be applied to design optimal receiver filters, because they are application specific and require changing the transmitter pulse shaping filter to eliminate the interference, which is not easy.
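Since the raised cosine filter mentioned above is the standard practical approximation, a small sketch of its impulse response may help; the closed form below is the usual textbook one, and the roll-off value 0.35 is an arbitrary choice, not a value from this paper. The zero crossings at nonzero multiples of the symbol period T are what eliminate ISI at the sampling instants.

```python
import numpy as np

def raised_cosine(t, T, alpha):
    """Raised-cosine impulse response, symbol period T, roll-off alpha.
    Zero crossings fall at every nonzero integer multiple of T."""
    x = t / T
    denom = 1.0 - (2.0 * alpha * x) ** 2
    singular = np.isclose(denom, 0.0)        # removable singularity at |t| = T/(2*alpha)
    safe = np.where(singular, 1.0, denom)    # avoid dividing by zero there
    h = np.sinc(x) * np.cos(np.pi * alpha * x) / safe
    return np.where(singular, (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha)), h)

T = 1.0
t = np.arange(-4, 5) * T                     # sampling instants -4T ... 4T
h = raised_cosine(t, T, alpha=0.35)          # h = 1 at t = 0, h = 0 at other multiples of T
```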
In 3G and beyond-3G systems, higher order modulation schemes such as 8-PSK and 16-QAM are used to increase the data transmission throughput. These schemes require a higher SIR of the received signal so that the transmission is reliable. The receiver filter must provide a higher SIR of the received signal and must have a short tap length, so that we have a large noise margin and low power consumption at the mobile units. So the main targets of the receiver filter design are the following:
(i) maximize the SINR of the received signal, to allow higher order modulation schemes;
(ii) keep the receiver filter tap length short, for low power consumption at the mobile units.
To design the optimal receiver filter, we use two approaches. The first approach is based on the optimal matched filter criterion; the filter is designed using the optimal filter design method and has the same properties as the transmitter filter, such as band-edge frequencies, pass band and stop band ripples, etc. The second approach is based on the MMSE criterion, which minimizes the error between the transmitted signal and the received signal. This approach leads to a closed form analytic solution for the receiver filter coefficients and can be extended to design adaptive receiver filters. Here optimality is in the sense of maximizing the signal to interference plus noise ratio of the received signal. Simulations demonstrated that the receiver filter designed using the MMSE approach can significantly improve the signal to interference plus noise ratio (SINR) of the received signal as compared to the receiver filter designed using the optimal matched filter.
2. Requirements for receiver
filter design
We consider the following requirements for the design of the optimal receiver filter.

[R1] We want to design a receiver filter with impulse response f(i) and filter length L (i.e., i = 0, ..., L − 1) such that the received signal ŝ(i) has the highest SINR, given an FIR transmitter filter with impulse response ĝ(i), where i = 0, ..., N − 1.

[R2] Since the transmitted signal is bandwidth limited, the side lobe of the receiver baseband filter in the stop band, i.e. for frequencies greater than 740 kHz (f ≥ 740 kHz), should be below −40 dB with a sharp cut-off.

[R3] The wireless channel is frequency non-selective and has only one path.

We add the requirement [R2] as a constraint on the adjacent channel interference power.
3. Optimal receiver filter design
In this section, we present two approaches to design optimal receiver filters that maximize the signal to interference plus noise ratio of the received signal. With the receiver filter being the optimal matched filter, the maximal signal to noise ratio of the received signal can be achieved, but the signal to interference ratio is not very good.
3.1 Optimal matched filter
approach
In the optimal design method, the weighted approximation error between the actual frequency response and the desired filter response is spread across the pass band and stop band, and the maximum error is minimized. This design method results in ripples in the pass band and stop band, so the frequency response of the filter must satisfy, in the pass band and the stop band respectively,

1 - δp ≤ |H(e^jω)| ≤ 1 + δp,   |ω| ≤ ωp,
|H(e^jω)| ≤ δs,   |ω| ≥ ωs,
where δp is the pass band ripple and δs the maximum allowed stop band magnitude. The weighted approximation error is defined as

E(ω) = W(ω)[Hd(ω) - H(e^jω)],

where Hd(ω) is the desired frequency response and H(e^jω) the actual frequency response.
The filter parameters are determined such that the maximum absolute value of E(ω) is minimized. Using the Remez exchange algorithm, we can design a filter with an optimal set of filter coefficients such that the receiver filter is matched [f(i) = g(N - i)] to the transmitter filter.
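The matched relationship f(i) = g(N - i) can be illustrated numerically; the pulse below is an arbitrary stand-in, not the paper's actual transmit filter.

```python
import numpy as np

# Hypothetical symmetric transmit pulse (illustration only)
g = np.array([0.1, 0.4, 0.8, 1.0, 0.8, 0.4, 0.1])
f = g[::-1]                   # matched filter: time-reversed transmit pulse

# The cascade g * f is the pulse autocorrelation; its peak equals the
# pulse energy, which is what maximizes SNR at the sampling instant.
h = np.convolve(g, f)
peak = int(h.argmax())        # peak sits at lag 0, i.e. index len(g) - 1
```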
3.2 MMSE approach
In this approach, we derive the signal model of the received signal and assume that the information symbol sequence is a white noise random process. In the communication link system shown in Fig. 1, we assume that the channel impulse response h(t) has been estimated using a pilot signal or blind channel identification algorithms.
Let the combined impulse response of the transmitter filter and the channel be g, represented by the convolution of ĝ(t) and h(t), i.e. g = ĝ ∗ h, where g has finite support on [0, N − 1]. The combined impulse response of the transmitter, channel and receiver baseband filtering is then denoted by the convolution g ∗ f. The impulse response of g ∗ f is
ĥ(k) = Σ_{i=0}^{N-1} g(i) f(k - i),   k = 0, 1, ..., N + L - 1,   (1)
We require that the sum of the transmitter filter length and the receiver filter length, N + L, be an even number and that the filter have the linear phase property. So we can let

h(k) = ĥ(k + (N + L - 2)/2).

Then h(k) is represented in matrix form as

[ĥ(0), ĥ(1), ..., ĥ(N+L-2)]^T = G F,   i.e.   h(k) = GF,   (2)

where G is the (N+L-1) × L Toeplitz (convolution) matrix whose first column is [g(0), g(1), ..., g(N-1), 0, ..., 0]^T and F = [f(0), f(1), ..., f(L-1)]^T is the vector of receiver filter coefficients.
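The Toeplitz structure of G in equation (2) can be sketched as follows, with short toy filters (the lengths and values are illustrative only); the product GF reproduces the linear convolution of g and f.

```python
import numpy as np

# Hypothetical short filters, purely to show the construction
g = np.array([1.0, 2.0, 3.0])            # transmitter*channel response, N = 3
F = np.array([0.5, 0.25, 0.125])         # receiver filter coefficients, L = 3
N, L = len(g), len(F)

# (N+L-1) x L Toeplitz matrix: column j is g shifted down by j samples
G = np.zeros((N + L - 1, L))
for j in range(L):
    G[j:j + N, j] = g

h = G @ F                                # same result as np.convolve(g, F)
```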
After 4× down sampling at the receiver, the received signal can be represented as

ŝ(k) = h(0)s(k) + Σ_{i=-[(N+L-2)/8], i≠0}^{[(N+L-2)/8]} h(4i)s(k - i) + n0(k) + nACI(k),   (3)

where the first term on the right hand side of equation (3) is the desired signal, the second term represents the ISI, the third term represents the noise present in the received signal and the fourth term represents the adjacent channel interference. The transmitted signal is s(k) and the received signal is ŝ(k). The mean square error between the transmitted and the received signal is given by

Minimize: MSE = E[(s(k) - ŝ(k))^2]   (4)
By equation (2), we have h(i) = Gi F, where Gi is the i-th row of the matrix G. We define a matrix Ĝ made of the rows G_{4i}, where -[(N + L - 2)/8] ≤ i ≤ [(N + L - 2)/8] and i ≠ 0.
Let the frequency response of the receiver filter at frequency fi be represented as ₣iF, where ₣i is a row of the complex Fourier transform matrix ₣ corresponding to the frequency fi, i.e.,

₣i = [1  exp(j2πfi/Fs)  ...  exp(j2πfi(L-1)/Fs)],

where Fs is the sampling frequency. The MSE in (4) is approximately

MSE ≈ ||ĜF - δ||^2 + N0||F||^2 + λ||₣^F||^2   (5)

where N0 is the power spectral density of the channel noise, λ is the flat power spectral density in the adjacent frequency band,

δ = [0 ... 1 ... 0]^T,
₣^ = [₣j^T  ₣j+1^T  ...  ₣M^T]^T,

fj ranges over 740 Hz to 3.4 kHz (voice frequency) and M is the number of frequency sampling points.
Thus, the minimum mean square error problem (4) becomes

Minimize: ||ĜF - δ||^2 + N0||F||^2 + λ||₣^F||^2   (6)
The receiver filter which minimizes the mean square estimation error in (6) is

F = (Ĝ^T Ĝ + N0 I + λ (₣^)^H ₣^)^(-1) Ĝ^T δ   (7)
where I is an identity matrix. This analytic solution can be applied to design an adaptive receiver filter with the channel being estimated. In designing the receiver filter, the ACI power and the channel noise are not known in advance. The parameters λ and N0 can be adjusted to meet the side lobe requirement and to optimize the transition band, but the adjustment of these parameters is neither easy nor straightforward.
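The closed form in (7) can be prototyped as below. This is a sketch under assumptions, not the paper's simulation setup: the combined response g is a random stand-in, the ACI term is dropped (λ = 0), the normal-equations form ĜᵀĜ is used, and the center row (h(0), target 1) is kept in the symbol-spaced matrix, a slight simplification of the Ĝ defined in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 16, 16                       # toy filter lengths, N + L even
g = rng.standard_normal(N)          # stand-in for transmitter*channel response
n0 = 1e-4                           # channel noise PSD; ACI term dropped

# Full (N+L-1) x L convolution (Toeplitz) matrix of g
G = np.zeros((N + L - 1, L))
for j in range(L):
    G[j:j + N, j] = g

# Symbol-spaced rows G_{4i} around the center tap
m = (N + L - 2) // 8
center = (N + L - 2) // 2
Gs = G[[center + 4 * i for i in range(-m, m + 1)]]
delta = np.zeros(2 * m + 1)
delta[m] = 1.0                      # desired: h(0) = 1, zero ISI taps

# Closed-form MMSE solution (normal-equations form of (7), lambda = 0)
F = np.linalg.solve(Gs.T @ Gs + n0 * np.eye(L), Gs.T @ delta)
```

For small N0 the combined response at the symbol-spaced taps approaches the ideal delta, i.e. the ISI terms in (3) are driven toward zero.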
4. Comparison between the designed filters
The frequency responses of the optimal receiver filters designed using the MMSE approach and the optimal matched approach are shown in Figure 2. We observe that the MMSE receiver filter has a flatter frequency response than the matched filter in the passband. This frequency response is close to that of the root Nyquist filter, which has a flat frequency response in the passband. We also observe that the receiver filter designed using the MMSE approach has a steeper skirt than the filter designed using the matched filter approach.
Now we compare the receiver filters designed by both approaches using a simple wireless communication system. We assume that the wireless channel is frequency non-selective and has only one path. In 3G systems, we need data rates as high as possible; to increase the data transmission throughput, we have to use spectrally efficient modulation schemes such as 16-QAM. With the increased data throughput, there is a severe ISI problem.
[Figure: normalized magnitude response (dB) versus frequency (MHz) of the matched filter and the MMSE filter]
Figure 2. Frequency response of
receiver pulse shaping filters
We know that the eye diagram provides a great deal of useful information about the performance of a data transmission system. The eye diagram obtained using the 48-tap optimal matched receiver filter shows that the eyes are very small due to high ISI.
[Figure: eye diagrams, amplitude versus time, for the in-phase and quadrature signals]
Figure 3. Eye diagram of
received signal using optimal
matched filter
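An eye diagram such as those in Figures 3 and 4 is produced by overlaying waveform segments one to two symbol periods long; a minimal sketch of the segmentation is below (the sinusoid is only a stand-in for a filtered signal, and the oversampling factor is an arbitrary choice).

```python
import numpy as np

def eye_traces(x, samples_per_symbol, span=2):
    """Cut a waveform into overlapping traces of `span` symbol
    periods each; plotting them on top of one another gives
    the eye diagram."""
    seg = span * samples_per_symbol
    n = (len(x) - seg) // samples_per_symbol
    return np.array([x[i * samples_per_symbol: i * samples_per_symbol + seg]
                     for i in range(n)])

x = np.sin(2 * np.pi * np.arange(400) / 40)   # stand-in for a filtered signal
traces = eye_traces(x, samples_per_symbol=4, span=2)
```

Wide-open traces at the symbol-spaced sampling instants indicate low ISI and a large noise margin; collapsed traces indicate the opposite.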
Figure 4 shows the eye diagram for the receiver pulse shaping filter designed using the MMSE approach. From this eye diagram, we can see that a 16-QAM modulated signal can be received reliably using this 48-tap receiver filter. We can also see that the receiver filter designed using the MMSE approach is more tolerant of sampling time errors in the received signal and provides a larger noise margin to the system than the receiver filter designed using the optimal matched approach.
[Figure: eye diagrams, amplitude versus time, for the in-phase and quadrature signals]
Figure 4. Eye diagram of
received signal using receiver
MMSE filter
We compare the optimal matched receiver filter and the optimal MMSE receiver filter for tap lengths L = 36, 48 and 64 respectively. From Table 1, we observe that the MMSE receiver filter can provide higher SINR, but the cost we bear for the significant SINR gain is a slight degradation of SNR. We could achieve higher SINR using the matched filter with a larger number of taps, but this increases the power consumption at the mobile units. We can increase the SNR of the received signal by increasing the power spectral density of the channel noise in the MMSE filter; when the channel noise = ∞, the MMSE filter is the same as the optimal matched filter.
Table 1. Comparison between optimal matched filter and MMSE filter with different numbers of taps

Number of taps         |   36    |   48    |   64
SINR/SINRmatched (dB)  |  5.364  |  9.865  | 20.4056
SNR/SNRmatched (dB)    | -0.5171 | -0.2250 | -0.3534

5. Conclusion

In 3G and beyond-3G systems, a higher SIR of the received signal is required so that high order modulation schemes such as 8-PSK and 16-QAM can be applied, from which we can achieve high data
transmission throughput. By designing the receiver filter using the approaches presented in Section 3, we can achieve high data transmission throughput. The first design method is based on the optimal matched criterion, and the second approach is based on optimizing an MMSE criterion, which provides an analytic solution for the filter coefficients. By simulations, we demonstrated that there is a significant improvement in performance when using the optimized MMSE receiver pulse shaping filter over the optimal matched filter.
References
[1] N.C. Beaulieu, C.C. Tan, and M.O. Damen. A “better than” Nyquist pulse. IEEE Communications Letters, 5(9):367
[2] T. Berger and D.W. Tufts.
Optimum pulse amplitude
modulation part I: Transmitter-
receiver design and bounds from
information theory. IEEE Trans.
Information Theory, 13(2):196 –208, April 1967.
[3] J.O. Coleman and D.W. Lytle.
Linear programming techniques for
the control of intersymbol
interference with hybrid FIR/analog
pulse shaping. In IEEE Int. Conf.
Commun., Chicago, IL,
June 1992.
[4] T.N. Davidson, Z.Q. Luo, and
K.M. Wong. Design of orthogonal
pulse shapes for
communications via semidefinite
programming. IEEE Trans. Signal
Processing, 48(5):1433–
1445, May 2000.
[5] H. Samueli. On the design of
FIR digital data transmission filters
with arbitrary magnitude
specifications. IEEE Trans.
Circuits Syst., 38(12):1563 –1567,
December 1991.
[6] L. Tong, G. Xu, and T.
Kailath. Blind identification and
equalization based on second-order
statistics: A time domain approach.
IEEE Trans. Information Theory,
40(2):340–349,
March 1994.
[7] J. Tuqan. On the design of
optimal orthogonal finite order
transmitter and receiver filters for
data transmission over noisy
channels. In Proc. of the 34th
Asilomar Conf. on Signals,
Systems and Computers, volume 2,
pages 1303 – 1307, October 2000.
[10] Haykin Simon. “Adaptive
filter theory” fourth edition ,
Pearson Education , Delhi, pp.
436-460.
Signal Acquisition and Analysis System Using LabVIEW

Subhransu Padhee, Yaduvir Singh
Department of Electrical and Instrumentation Engineering
Thapar University, Patiala, 147004, Punjab [email protected], [email protected]
Abstract- In the present era, virtual instrumentation is considered a separate discipline of engineering education. It has replaced conventional techniques of measurement and data acquisition and taken instrumentation experiments to a new level. With easy-to-use, graphical-programming-enabled software, supported by dedicated, easy-to-use hardware, virtual instrumentation has transformed the notion of engineering education and simulation based experiments.
This paper gives a brief idea of the need for, and advantages of, virtual instrumentation in engineering education and discusses the need for distant laboratories in engineering education. It also develops a simple application for signal acquisition, analysis and storage.
Keywords- LabVIEW, virtual instrumentation
I. INTRODUCTION
Acquiring multiple data (analog or discrete in nature) from the field or process at high speed using a multi-channel data acquisition system, processing the data with the help of a data processing algorithm and a computing device, and displaying the data for the user is the elementary need of any industrial automation system [1,2,3,4].
Modern day process plants, construction sites,
agricultural industry [11], petroleum, wireless
sensor network [16], power distribution network
[17], refinery industry, renewable energy system
[10,28] and every other industry where data is of
prime importance, wireless data acquisition, data processing and data logging equipment is used. Acquiring data from the field with the help of different sensors is always challenging. Different kinds of noise are superimposed on the data which comes from the field through transducers and the data acquisition system. After acquiring data from the field, the signal processing operation is performed. In the signal processing operation, the different noises superimposed on the original process signal are removed and the signal is amplified, so that the signal keeps its original traits and the data carried by the signal remains intact. After the signal processing part, the data is given to a data processing algorithm which processes the data and stores it in a memory unit.
With the advancement of technology, personal computers with PCI, PXI/CompactPCI, PCMCIA, USB, IEEE 1394, ISA, VXI, serial and parallel ports are used for data acquisition, test and measurement, and automation. Personal computers are linked to the real world process with the help of OPC and DDE protocols, and application software is used to form a closed loop interaction between the real world process, the application software and the personal computing unit. Many of the networking
technologies that have already been available for a
long time in industrial automation (e.g., standard
and/or proprietary field and control level buses),
besides having undertaken great improvements in
the last few years, have also been progressively
integrated by newly introduced connectivity
solutions (Industrial Ethernet, Wireless LAN, etc.).
They have greatly contributed to the technological
renewal of a large number of automation solutions
in already existing plants. Obviously, even the
software technologies involved in the
corresponding data exchange processes have been
greatly improved; as an example, today it is
possible to use a common personal computer in
order to implement even complex remote
supervisory tasks of simple as well as highly
sophisticated industrial plants.
This paper gives an overview of the modern industrial automation system, comprising data acquisition systems and data loggers. It develops a secured data acquisition and analysis module using the virtual instrumentation concept. With the help of this system, the operator can securely log in and perform the desired signal acquisition and analysis operations. The system also stores the relevant data for future reference and record keeping purposes.
II. INDUSTRIAL AUTOMATION SYSTEM
Most measurements begin with a transducer, a
device that converts a measurable physical quantity,
such as temperature, strain, or acceleration, to an
equivalent electrical signal. Transducers are
available for a wide range of measurements, and
come in a variety of shapes, sizes, and
specifications. Signal conditioning can include
amplification, filtering, differential applications,
isolation, simultaneous sample and hold (SS&H),
current-to-voltage conversion, voltage-to-frequency
conversion, linearization and more.
Figure 1: Block diagram of data acquisition and
logging
Figure 1 shows the schematic diagram of
data acquisition system. Sensor is used to sense the
physical parameters from the real world. The output
of the sensor is provided to the signal conditioning
element. The main purpose of signal conditioning
element is to remove the noise of the signal and
amplify the signal. The output of the signal
conditioning system is provided to ADC. The ADC
converts the analog signal to the equivalent digital
data. The equivalent digital data is then fed to the
computer, which acts both as a controller and
display element.
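As an illustration of the chain just described, the following Python sketch models a hypothetical sensor signal passed through signal conditioning (amplification and a simple moving-average filter) and an ideal 12-bit ADC. All numerical values (gain, voltage range, resolution, sampling rate) are illustrative assumptions, not specifications from the paper.

```python
import numpy as np

# Hypothetical model of the acquisition chain described above: a sensor
# voltage is amplified and low-pass filtered (signal conditioning), then
# quantized by an ideal 12-bit ADC before reaching the computer.
def condition(signal, gain=10.0):
    """Amplify, then smooth with a 5-point moving average (noise removal)."""
    kernel = np.ones(5) / 5.0
    return np.convolve(gain * signal, kernel, mode="same")

def adc(signal, v_range=(-10.0, 10.0), bits=12):
    """Map a voltage to the equivalent digital code of an ideal ADC."""
    lo, hi = v_range
    clipped = np.clip(signal, lo, hi)
    levels = 2 ** bits - 1
    return np.round((clipped - lo) / (hi - lo) * levels).astype(int)

# Simulated sensor output: a 50 Hz component plus noise, sampled at 1 kHz.
rng = np.random.default_rng(0)
t = np.arange(0, 0.1, 1e-3)
sensor = 0.5 * np.sin(2 * np.pi * 50 * t) + 0.05 * rng.standard_normal(t.size)
codes = adc(condition(sensor))
```

The digital codes would then be handed to the computer, which acts as controller and display element.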
Once data has been acquired, there is a need
to store it for current and future reference. Today,
alternative methods of data storage embrace both
digital computer memory and the old traditional
standby, paper. There are two principal areas where
recorders or data loggers are used. Recorders and
data loggers are used in measurements of process
variables such as temperature, pressure, flow, pH,
humidity; and also used for scientific and
engineering applications such as high-speed testing
(e.g., stress/strain), statistical analyses, and other
laboratory or off-line uses where a graphic or
digital record of selected variables is desired.
Digital computer systems have the ability to
provide useful trend curves on CRT displays that
could be analyzed.
III. VIRTUAL INSTRUMENTATION IN DISTANT LAB
Virtual instrumentation is used to improve the
learning methodology in different disciplines of
engineering. The technique is easy to use, easy to
understand and cost effective. Its main feature is
that various simulations can be performed with the
help of programming, simulations which would be very
difficult to perform in hardware. State-of-the-art
virtual instrumentation systems which enhance the
learning experience of students have been reported
in the literature for several disciplines, including
mechanical engineering [6], power plant training [8],
electronics [9], control systems [12], chemical
engineering [14], ultrasonic range measurement
[20], biomedical engineering [21, 22], power systems [23, 24],
electrical machines [25], and intelligent control [31].
Laboratories in engineering and applied science
have important effects on student learning. Most
educational institutions construct their own
laboratories individually. Alternatively, some
institutions establish laboratories, which can be
conducted remotely via the internet. Different
researchers have proposed the concept of the distant
laboratory [7, 18, 19] using the internet [27] and the
intranet [26], and have proposed different
hardware and software architectures for remote
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0202-3
laboratories. The general structure of a remote
laboratory is almost the same in every academic
study: remote clients, a server computer
equipped with an I/O module, and a remote
experimental setup connected to the server.
Figure 2: Architecture of remote laboratory
As Figure 2 illustrates, the architecture of the
remote laboratory consists of a server computer
with an industrial network card. Since the network
card is plugged into a PCI slot, it is called a PCI card.
It provides the required protocol operations for the
controller area network.
IV. CASE STUDY
This section develops a signal processing
application using LabVIEW. The application can
be used to teach students the basics of virtual
instrumentation and signal processing: with it,
a student can gain basic knowledge of signal
processing and perform different application-oriented
experiments using LabVIEW without needing a
CRO or DSO. Figure 3 shows
the front panel of the application.
Figure 3: Front panel of the system
This operator console consists of four buttons, and
for security reasons the operator has to log in using an
authenticated username and password to access all
other functionality of the system. Figure 4 shows
the login screen. This screen appears when the login
button is pressed.
Figure 4: Front panel of login screen for operator
Figure 5 shows the data acquisition module of the
system where there is control to set the desired
amplitude and frequency of the signal. Noise of
certain amplitude can be added with the signal. This
module shows both the time domain and frequency
domain representation of the noisy signal.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0202-4
Figure 5: Front panel for time domain analysis of
the acquired noisy signal
Figure 5 shows the time domain
representation of the noisy signal, whereas Figure 6
shows the frequency domain representation of the
signal. The frequency domain representation involves
Fourier analysis of the signal.
Figure 6: Front panel for frequency domain analysis
of the acquired signal
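The behaviour of the acquisition module can be sketched in a few lines of Python: a sine wave of operator-chosen amplitude and frequency is corrupted with noise, and its frequency-domain representation is obtained through an FFT, as in the module's Fourier analysis. The sampling rate, tone and noise level are illustrative assumptions.

```python
import numpy as np

fs = 1000.0                       # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1.0 / fs)   # one-second record
amp, freq = 2.0, 50.0             # operator-selected amplitude and frequency
rng = np.random.default_rng(0)
noisy = amp * np.sin(2 * np.pi * freq * t) + 0.3 * rng.standard_normal(t.size)

# Frequency-domain representation via Fourier analysis of the noisy signal.
spectrum = np.abs(np.fft.rfft(noisy)) / t.size
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
peak = freqs[np.argmax(spectrum[1:]) + 1]   # dominant non-DC component
```

Despite the added noise, the spectral peak sits at the operator-selected frequency, which is what the frequency-domain view of the module makes visible.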
The third module of the system is the analysis
module, in which the operator can
select a certain portion of the signal using the
available pointer. The selected portion is
displayed in the subplot, and its DC value, RMS value,
average value and mean value are
displayed. Figure 7 shows the front panel
for waveform analysis.
Figure 7: Front panel for analysis of the subset of
signal
These results can be analyzed and logged to a file
for record keeping and further analysis.
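The analysis module's computations can be sketched as follows. The cursor selection is modelled as a simple index range, and the mapping of "DC value" to the mean is an assumption made for illustration; the log-file path and format are likewise hypothetical.

```python
import json
import os
import tempfile

import numpy as np

# The cursor selection is modelled as an index range; DC (taken here as the
# mean), RMS and average rectified value of the portion are computed and
# appended to a log file for record keeping.
def analyze(portion):
    portion = np.asarray(portion, dtype=float)
    return {
        "dc": float(np.mean(portion)),                # DC value (mean)
        "rms": float(np.sqrt(np.mean(portion ** 2))), # root mean square
        "avg": float(np.mean(np.abs(portion))),       # average rectified value
    }

t = np.arange(0, 1.0, 1e-3)
signal = np.sin(2 * np.pi * 5 * t)
selection = signal[200:400]                  # hypothetical cursor selection
results = analyze(selection)

log_path = os.path.join(tempfile.gettempdir(), "analysis_log.jsonl")
with open(log_path, "a") as fh:              # append, keeping older records
    fh.write(json.dumps(results) + "\n")
```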
V. CONCLUSIONS
This paper emphasizes the data acquisition,
supervisory control and data logging aspects of an
industrial process. These areas are of prime
importance for computer control of an industrial
process. The signal is acquired from the field, and
different signal processing and analysis functions are
performed on a selected portion of the acquired
signal. The selected portions of the
signal, along with their computed values, are stored
in a log file for record keeping and future reference
and analysis.
As future scope of this paper, a wireless
web-based data acquisition, data logging and
supervisory control system can be implemented.
The main advantage of a wireless web-based data
acquisition system is that any authorized person
anywhere in the world can access the real-time
process data with the help of the internet. The main
areas of concern for a web-based data logging and
supervisory control system are the security of the data
and the authentication of the user. To address these
security needs, a firewall can be implemented.
References
[1] Joseph Luongo, “A Multichannel Digital
Data Logging System,” IRE Transactions on
Instrumentation, Jun 1958, pp. 103-106.
[2] Rik Pintelon, Yves Rolain, M. Vanden
Bossche and J. Schoukens, “Towards an
Ideal Data Acquisition Channel,” IEEE
Transactions on Instrumentation and
Measurement, vol. 39, no. 1, Feb 1990, pp.
116-120.
[3] Deichert, R.L., Burris, D.P., Luckemeyer, J.,
“Development of a High Speed Data
Acquisition System Based on LabVIEW
and VXI,” in Proceedings of IEEE
Autotestcon, Sep 1997, pp. 302-307.
[4] F. Figueroa, S. Griffin, L. Roemer and J.
Schmalzel, “A Look into the Future of Data
Acquisition”, IEEE Instrumentation and
Measurement Magazine, vol. 2, issue 4,
Dec1999, pp. 23–34.
[5] A. Ferrero, L. Cristaldi and V. Piuri,
“Programmable Instruments, Virtual
Instruments, and Distributed Measurement
Systems: what is Really Useful, Innovative,
and Technically Sound”, IEEE
Instrumentation and Measurement
Magazine, vol. 2, issue 3, Sep 1999, pp. 20–
27.
[6] P. Strachan, A. Oldroyd, M. Stickland,
“Introducing Instrumentation and Data
Acquisition to Mechanical Engineers Using
LabVIEW,” International Journal of
Engineering Education, vol. 16, no. 4, Jan
2000, pp. 315-326
[7] K K Tan, T H Lee, F M Leu, “Development
of a Distant Laboratory Using LabVIEW,”
International Journal of Engineering
Education, vol. 16, no. 3, 2000, pp. 273-282
[8] Amit Chaudhuri, Amitava Akuli and Abhijit
Auddy, “Virtual Instrumentation Systems-
Some Developments in Power Plant
Training and Education,” IEEE ACE, Dec
2002
[9] Melanie L Higa, Dalia M Tawy and Susan
M Lord, “An Introduction to LabVIEW
Exercise for an Electronic Class,” 32nd
ASEE/IEEE Frontiers in Education
Conference, Nov 2002, T1D-13-T1D-16
[10] Recayi Pecen, M.D Salim, Ayhan Zora, “A
LabVIEW Based Instrumentation System
for a Wind-Solar Hybrid Power Station,”
Journal of Industrial Technology, vol. 20,
no. 3, Jun-Aug 2004.
[11] Sarang Bhutada, Siddarth Shetty, Rohan
Malye, Vivek Sharma, Shilpa Menon,
Radhika Ramamoorthy, “Implementation of
a Fully Automated Greenhouse using
SCADA Tool like LabVIEW,” in
Proceedings of the 2005 IEEE/ASME
International Conference on Advanced
Intelligent Mechatronics, Jul 2005, pp. 741-
746.
[12] Samuel Daniels, Dave Harding, Mike
Collura, “Introducing Feedback Control to
First Year Engineering Students Using
LabVIEW,” in Proceedings of 2005
American Society for Engineering
Education Annual Conference &
Exposition, 2005, pp. 1-12
[13] Mihaela Lascu and Dan Lascu, “Feature
Extraction in Digital Mammography Using
LabVIEW,” 2005 WSEAS International
Conference on Dynamical Systems and
Control, Nov 2005, pp. 427-432
[14] V M Cristea, A Imre-Lucaci, Z K Nagy and
S P Agachi, “E-Tools for Education and
Research in Chemical Engineering,”
Chemical Bulletin, vol. 50, issue 64, 2005,
pp. 14-17
[15] Ziad Salem, Ismail Al Kamal, Alaa Al
Bashar, “A Novel Design of an Industrial
Data Acquisition System,” in Proceedings
of International Conference on Information
and Communication Techniques, Apr 2006,
pp. 2589-2594.
[16] Aditya N. Das, Frank L. Lewis, Dan O.
Popa, “Data-logging and Supervisory
Control in Wireless Sensor Networks,” in
Proceedings of the 7th ACIS International
Conference on Software Engineering,
Artificial Intelligence, Networking and
Parallel/Distributed Computing (SNPD’06),
2006, pp. 1-12
[17] K. S Swarup and P. Uma Mahesh,
“Computerized Data Acquisition for Power
System Automation,” in Proceedings of
Power India Conference, Jun 2006, pp. 1-7.
[18] Francesco Adamo, Filippo Attivissimo,
Giuseppe Cavone, Nicola Giaquinto,
“SCADA/HMI Systems in Advanced
Educational Courses,” IEEE Transactions
on Instrumentation and Measurement, vol.
56, no. 1, Feb 2007, pp. 4-10.
[19] Vu Van Tan, Dae-Seung Yoo, Myeong-Jae
Yi, “A Novel Framework for Building
Distributed Data Acquisition and
Monitoring System,” Journal of Software,
vol.2, no.4, Oct 2007, pp. 70-79
[20] A Hammad, A Hafez, M T Elewa, “A
LabVIEW Based Experimental Platform for
Ultrasonic Range Measurements,” DSP
Journal, vol. 6, issue 2, Feb 2007, pp. 1-8
[21] Shekhar Sharad, “A Biomedical
Engineering Start Up Kit for LabVIEW,”
American Society for Engineering Education,
2008
[22] Steve Warren and James DeVault, “A Bio
Signal Acquisition and Conditioning Board
as a Cross-Course Senior Design Project,”
in Proceedings of the 38th ASEE/IEEE Frontiers
in Education Conference, 2008, pp. S3C1-
S3C6
[23] S K Bath, Sanjay Kumra, “Simulation and
Measurement of Power Waveform
Distortion Using LabVIEW,” IEEE
International Power Modulators and High
Voltage Conference, May 2008, pp. 427-
434
[24] Nikunja K Swain, James A Anderson and
Raghu B. Korrapati, “Study of Electrical
Power Systems using LabVIEW VI
Modules,” in Proceedings of 2008 IAJC-
IJME International Conference, 2008
[25] M. Usama Sadar, “Synchronous Generator
Simulation Using LabVIEW,” World
Academy of Science, Engineering and
Technology, 39, 2008, pp. 392-400
[26] Muhammad Noman Ashraf, Syed Annus
Bin Khalid, Muhammad Shahrukh Ahmed,
Ahmed Munir, “Implementation of Intranet-
SCADA using LabVIEW based Data
Acquisition and Management,” in
Proceedings of International Conference on
Computing, Engineering and Information,
2009, pp. 244-249.
[27] Zafer Aydogmus, Omur Aydogmus, “A
Web-Based Remote Access Laboratory
Using SCADA,” IEEE Transactions on
Education, vol. 52, no. 1, Feb 2009.
[28] Li Nailu, Lv Yuegang, Xi Peiyu, “A Real
Time Simulation System of Wind Power
Based on LabVIEW DSC Module and
Matlab/Simulink,” in Proceedings of The
Ninth International Conference on
Electronic Measurement & Instruments,
Aug 2009, pp. 1-547-1-552.
[29] Hiram E Ponce, Dejanira Araiza and Pedro
Ponce, “A Neuro-Fuzzy Controller for
Collaborative Applications in Robotics
Using LabVIEW,” Applied Computational
Intelligence and Soft Computing, Hindawi
Publishing Corporation, vol. 2009, 2009, pp.
1-9
[30] Akif Kutlu, Kubilay Tasdelen, “Remote
Electronic Experiments Using LabVIEW
Over Controller Area Network,” Scientific
Research and Essays, vol. 5(13), Jul 2010,
pp. 1754-1758
[31] Pedro Ponce Cruz, Arturo Molina Gutierrez,
“LabVIEW for Intelligent Control Research
and Education,” 4th IEEE International
Conference on E-Learning in Industrial
Electronics, Nov 2010, pp. 47-54
[32] David McDonald, “Work In Progress
Introductory LabVIEW Real Time Data
Acquisition Laboratory Activities,” ASEE
North Central Sectional Conference, Mar
2010, pp. 1B-1-1B-6
Abstract- Orthogonal Frequency Division
Multiplexing (OFDM) is a scheme which divides the
available spectrum into subchannels. The
subchannels are narrowband, which makes
equalization very simple. Intercarrier
interference (ICI) between the subcarriers occurs due to
frequency offset: OFDM is sensitive to any
frequency offset between the transmitted and
received carrier frequencies. This results, first,
in a reduction of the signal amplitude at the output
of the filters matched to each of the carriers
and, second, in the introduction of ICI from the
other carriers. Two methods are
investigated for combating the effects of ICI:
ICI Self-Cancellation (SC) and the Extended
Kalman Filter (EKF) method. The methods are
compared in terms of bandwidth efficiency and
bit error rate. The EKF method performs better
than the SC method, as shown by
simulations (up to 256-QAM).
Keywords- Orthogonal Frequency Division
Multiplexing (OFDM); Inter-Carrier
Interference (ICI); Carrier-to-Interference Power
Ratio (CIR); Self-Cancellation (SC); Carrier
Frequency Offset (CFO); Extended Kalman
Filtering (EKF).
I. INTRODUCTION
The basic principle of OFDM is to split a high-rate
data stream into a number of lower-rate streams
that are transmitted simultaneously over a number
of subcarriers [1].
One limitation of OFDM in many applications is
that it is very sensitive to frequency errors caused
by frequency differences between the local
oscillators in the transmitter and the receiver [3]–
[5]. Frequency offset causes rotation and
attenuation of each of the subcarriers, and
intercarrier interference (ICI) between subcarriers
[4]. Many methods have been developed to reduce
this sensitivity to frequency offset, including
windowing of the transmitted signal [6], [7] and the
use of self-ICI-cancellation schemes [8]. Here in
this paper, the effects of ICI have been analysed
this paper, the effects of ICI have been analysed
and two solutions to combat ICI have been
presented. The first method is a self-cancellation
scheme[1], in which redundant data is transmitted
onto adjacent sub-carriers such that the ICI
between adjacent sub-carriers cancels out at the
receiver. The second method, the extended
Kalman filter (EKF), statistically estimates the
frequency offset and corrects it [7], using
the estimated value at the receiver. The work
presented in this paper concentrates on a
quantitative ICI power analysis of the ICI
cancellation scheme, which has not been studied
previously. The average carrier-to-interference
power ratio (CIR) is used as the ICI level
indicator, and a theoretical CIR expression is
derived for the proposed scheme.
METHODS OF INTERCARRIER INTERFERENCE
CANCELLATION FOR ORTHOGONAL FREQUENCY
DIVISION MULTIPLEXING
Dr. R. L. Yadav, Prof., ECE Dept., Galgotia College of Engg. & Tech., Greater Noida, email: [email protected]
Mrs. Dipti Sharma, Sr. Lecturer, Apex Institute of Tech., Rampur, email: [email protected]

II. OFDM SYSTEM DESCRIPTION
The OFDM system multiplexes the input bit stream
into N symbol streams, each with symbol period Ts,
and each symbol stream is used to modulate one of N
parallel, synchronous sub-carriers [10]. The
sub-carriers are spaced by 1/Ts in
frequency; thus they are orthogonal over the
interval (0,Ts). Then, the N symbols are mapped
to bins of an inverse fast Fourier transform
(IFFT). The IFFT bins correspond to the
orthogonal sub-carriers in the OFDM symbol.
Thus, the OFDM symbol is expressed as
x(n) = (1/N) Σ_{m=0}^{N-1} X_m e^{j2πmn/N},  0 ≤ n ≤ N-1   (1)
where the X_m's are the baseband symbols on each
sub-carrier. The analog time-domain signal is
obtained using digital to analog(D/A) converter.
This discrete signal is demodulated using an N-
point Fast Fourier Transform (FFT) operation at
the receiver. The demodulated symbol is
Y(m) = Σ_{l=0}^{N-1} X(l) S(l-m) + W(m)   (2)
where S(l-m) are the ICI coefficients arising from
the carrier frequency offset and W(m) corresponds to
the FFT of the samples of w(n), which is the Additive
White Gaussian Noise (AWGN) in the channel. Then, the
signal is down-converted and transformed to a
digital sequence after passing through an Analog-to-
Digital Converter (ADC). The following step is
to pass the resulting time-domain samples through a
serial-to-parallel converter and to compute the N-
point FFT. The resulting Yi complex points are
the complex baseband representation of the N
modulated sub-carriers, as the broadband channel
has been decomposed into N parallel sub-channels.
Each sub-channel then needs an equalizer; these blocks
are called Frequency Domain Equalizers
(FEQ). The bits from the transmitter are thus received at
high data rates at the receiver.
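The modulation/demodulation path described above can be sketched as an IFFT/FFT round trip. The QPSK mapping and the ideal (noiseless, offset-free) channel below are illustrative simplifications, not part of the paper's setup.

```python
import numpy as np

# Minimal sketch of the OFDM path described above: N baseband symbols X_m
# are mapped to IFFT bins (orthogonal sub-carriers) and recovered at the
# receiver with an N-point FFT. QPSK symbols, ideal channel assumed.
N = 64
rng = np.random.default_rng(1)
bits = rng.integers(0, 2, size=2 * N)
qpsk = (1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])  # map bit pairs

tx_time = np.fft.ifft(qpsk)        # OFDM symbol x(n)
rx_symbols = np.fft.fft(tx_time)   # demodulation at the receiver

recovered_ok = np.allclose(rx_symbols, qpsk)
```

With no channel distortion the FFT exactly inverts the IFFT, which is what the orthogonality of the sub-carriers guarantees; the ICI discussed in the next section appears only once a frequency offset is introduced.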
III. ICI SELF CANCELLATION SCHEME
A. Self-Cancellation
ICI self-cancellation is a scheme that was
introduced by Zhao and Sven-Gustav Häggman [1]
in order to combat and suppress ICI in OFDM.
The input data is modulated onto groups of
subcarriers with coefficients such that the ICI
signals generated within each group cancel each
other; hence the name self-cancellation method.
1) Cancellation Method
The data pair (X, -X) is modulated onto two
adjacent subcarriers (l, l+1). The ICI signals
generated by subcarrier l are then cancelled out
significantly by the ICI generated by
subcarrier l+1. For a further reduction
of ICI, the ICI cancellation demodulation scheme
is used. In this scheme, the signal at the (k+1)th
subcarrier is multiplied by -1 and then added to
the one at the kth subcarrier. The resulting data
sequence is then used for making the symbol decision.
2). ICI Cancelling Modulation
The ICI self-cancellation scheme requires that the
transmitted signals be constrained such that
X(1) = -X(0), X(3) = -X(2), ..., X(N-1) = -X(N-2).
Using this assignment of transmitted symbols
allows the received signal on subcarriers k and
k+1 to be written as
Y'(k) = Σ_{l even} X(l) [S(l-k) - S(l+1-k)] + W(k)   (4)
and the ICI coefficient S'(l-k) is referred to as
S'(l-k) = S(l-k) - S(l+1-k)   (5)
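The cancellation effect of (5) can be checked numerically. The closed form used for S(l-k) below is the standard expression from the self-cancellation literature and is an assumption here, since the paper gives its equations only by number.

```python
import numpy as np

# Standard ICI coefficient for a normalized frequency offset eps (assumed
# closed form, from the self-cancellation literature):
#   S(d) = sin(pi(d+eps)) / (N sin(pi(d+eps)/N)) * exp(j pi (1-1/N)(d+eps))
def S(d, eps, N):
    x = np.asarray(d, dtype=float) + eps
    return (np.sin(np.pi * x) / (N * np.sin(np.pi * x / N))
            * np.exp(1j * np.pi * (1.0 - 1.0 / N) * x))

N, eps = 128, 0.4
d = np.arange(2, 20)                               # interfering distances l-k
S_plain = np.abs(S(d, eps, N))                     # standard OFDM
S_prime = np.abs(S(d, eps, N) - S(d + 1, eps, N))  # eq. (5), self-cancelled

# The self-cancelled coefficients are markedly smaller at every distance.
reduction_db = 20 * np.log10(S_plain / S_prime)
```

The adjacent coefficients are nearly aligned in phase, so their difference in (5) is much smaller than either term, which is the mechanism behind the CIR improvement reported below.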
Fig.1 Comparison of |S(l-k)|, |S'(l-k)|, and |S''(l-k)| for N = 128 and ε = 0.4
3) ICI Canceling Demodulation
ICI-cancelling modulation introduces redundancy in the
received signal, since each pair of subcarriers
transmits only one data symbol. This
redundancy can be exploited to improve the
system's power performance, although it surely
decreases the bandwidth efficiency. To take
advantage of this redundancy, the received
signal at the (k+1)th subcarrier, where k is
even, is subtracted from the kth subcarrier.
Fig.2 An example of S(l-k) for N = 16, l = 0. (a) Amplitude of S(l-k). (b) Real part of S(l-k). (c) Imaginary part of S(l-k).
This is expressed mathematically as
Y''(k) = Y'(k) - Y'(k+1)   (6)
Subsequently, the ICI coefficient for this
received signal becomes
S''(l-k) = -S(l-k-1) + 2S(l-k) - S(l-k+1)   (7)
When compared to the ICI coefficients S(l-k)
for the standard OFDM system
and S'(l-k) for the ICI cancelling modulation,
S''(l-k) has the smallest ICI coefficients for the
majority of l-k values, followed by S'(l-k) and
S(l-k). The combined modulation and demodulation
method is called the ICI self-cancellation scheme.
The theoretical CIR can be derived as
CIR = |-S(-1) + 2S(0) - S(1)|^2 / Σ_{l=2,4,6,...}^{N-2} |-S(l-1) + 2S(l) - S(l+1)|^2   (8)
As mentioned above, the redundancy in this
scheme reduces the bandwidth efficiency by half.
This could be compensated for by transmitting signals
of a larger alphabet size. Fig. 3 shows the
model of the proposed method.
Fig.3 OFDM Simulation Model
The ICI self-cancellation scheme can be combined
with error-correction coding. The proposed
scheme provides significant CIR improvement,
which has been studied theoretically and by
simulation. Fig. 4 compares the
theoretical CIR curve of the ICI self-cancellation
scheme with the calculated CIR of a standard
OFDM system. As expected, the CIR
is greatly improved using the ICI self-cancellation
scheme; the improvement can be greater than
15 dB for 0 < ε < 0.5.
Fig.4 CIR versus ε for a standard OFDM system
IV. EXTENDED KALMAN FILTERING
A. Problem Formulation
A state-space model of the discrete Kalman filter
is defined as
z (n) = a (n)d(n) + v(n) (9)
For this model, z(n) has a linear relationship with
the desired value d(n). By using the discrete
Kalman filter, d(n) can be recursively estimated
based on the observation of z(n), and the updated
estimate in each recursion is optimum in the
minimum mean square sense.
The received symbols are
y(n) = x(n) e^{j2πnε/N} + w(n),  n = 0, 1, ..., N-1   (10)
where ε is the normalized frequency offset and w(n)
is the AWGN. In order to estimate ε efficiently in
computation, we build an approximate linear
relationship at the receiver using the first-order
Taylor expansion about the previous estimate ε^(n-1):
y(n) ≈ x(n) e^{j2πnε^(n-1)/N} [1 + j(2πn/N)(ε - ε^(n-1))] + w(n)
which has the same form as (9), i.e., z(n) is linearly
related to d(n). As a linear approximation is
involved in the derivation, the filter is called the
extended Kalman filter (EKF).
B. ICI Cancellation
There are two stages in the EKF method to reduce
intercarrier interference.
1). Offset Estimation Scheme
For estimating the quantity ε(n) using an EKF in
each OFDM frame, the state equation is built as
ε(n)=ε(n-1) (18)
i.e., in this case we are estimating an unknown
constant ε. This constant is distorted by a non-
stationary process x(n), an observation of which is
the preamble symbols preceding the data symbols
in the frame. The observation equation is
y(n) = x(n) e^{j2πnε(n)/N} + w(n)   (19)
where y(n) denotes the received preamble
symbols distorted in the channel, w(n) the
AWGN, and x(n) the IFFT of the preambles X(k)
that are transmitted, which are known at the
receiver. Assume the Np preambles
preceding the data symbols in each frame are used
as a training sequence, and that the variance σ² of the
AWGN w(n) is stationary. The computation
procedure is described as follows.
1. Initialize the estimate ε^(0) and the corresponding state
error P(0).
2. Compute H(n), the derivative of y(n) with
respect to ε(n) at ε^(n-1), the estimate obtained in the
previous iteration.
3. Compute the time-varying Kalman gain K(n)
using the error variance P(n-1), H(n), and σ².
4. Compute the prediction y^(n) using x(n) and ε^(n-1),
i.e., based on the observations up to time n-1,
and compute the error between the true observation
y(n) and y^(n).
5. Update the estimate ε^(n) by adding the K(n)-
weighted error between the observation y(n) and
y^(n) to the previous estimate ε^(n-1).
6. Compute the state error P(n) with the Kalman
gain K(n), H(n), and the previous error P(n-1).
7. If n is less than Np, increment n by 1 and go to
step 2; otherwise stop.
It is observed that the actual error of the
estimate ε^(n) from the ideal value ε(n) is
computed in each step and is used to adjust
the estimate in the next step.
The pseudo-code of the computation can be summarized
as: initialize P(0) and ε^(0); for n = 1, 2, ..., Np, compute steps 2-6 above.
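The seven steps above can be sketched as a scalar EKF in Python. All numerical values (N, Np, noise variance, true offset) are hypothetical, and the complex-gain bookkeeping is one possible realization of steps 3-6, not the paper's exact formulation.

```python
import numpy as np

# State: the constant offset eps. Observation: preamble samples
# y(n) = x(n) exp(j 2 pi n eps / N) + w(n). Values are hypothetical.
rng = np.random.default_rng(2)
N, Np = 128, 64
eps_true, sigma2 = 0.15, 1e-4

x = np.exp(1j * 2 * np.pi * rng.random(Np))   # known unit-modulus preamble
n = np.arange(Np)
w = np.sqrt(sigma2 / 2) * (rng.standard_normal(Np) + 1j * rng.standard_normal(Np))
y = x * np.exp(1j * 2 * np.pi * n * eps_true / N) + w

eps_hat, P = 0.0, 1.0                                 # step 1: initialization
for k in range(Np):
    rot = np.exp(1j * 2 * np.pi * k * eps_hat / N)
    H = 1j * (2 * np.pi * k / N) * x[k] * rot         # step 2: dy/d(eps)
    K = P * np.conj(H) / (P * abs(H) ** 2 + sigma2)   # step 3: Kalman gain
    y_hat = x[k] * rot                                # step 4: prediction
    eps_hat += np.real(K * (y[k] - y_hat))            # step 5: state update
    P *= 1.0 - np.real(K * H)                         # step 6: error update
```

Early preamble samples have small phase error, so the linearization is accurate where it matters most; as the estimate converges, later samples with larger n refine it further while the state error P shrinks.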
2). Offset Correction Scheme
The ICI distortion in the data symbols x(n) that
follow the training sequence can then be mitigated
by multiplying the received data symbols y(n)
by the complex conjugate of the estimated
frequency-offset term and applying the FFT, i.e.
x^(n) = y(n) e^{-j2πnε^/N}
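The correction step can be sketched as follows. A perfect offset estimate is assumed in order to isolate the de-rotation itself, and BPSK data symbols are an illustrative choice.

```python
import numpy as np

N = 64
rng = np.random.default_rng(3)
X = 1.0 - 2.0 * rng.integers(0, 2, size=N)   # BPSK data symbols (assumed)
x = np.fft.ifft(X)

eps = 0.3                                    # carrier frequency offset
n = np.arange(N)
y = x * np.exp(1j * 2 * np.pi * n * eps / N) # received, offset-distorted

eps_hat = eps                                # assume the estimate is exact
x_corr = y * np.exp(-1j * 2 * np.pi * n * eps_hat / N)
X_rec = np.fft.fft(x_corr)                   # offset removed before the FFT
X_raw = np.fft.fft(y)                        # without correction: heavy ICI
```

De-rotating before the FFT restores the transmitted symbols exactly, whereas taking the FFT of the uncorrected samples spreads each symbol across neighbouring bins.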
V. SIMULATED RESULT ANALYSIS
A. Performance
For the simulations in this paper, MATLAB was
employed with its Communications Toolbox and
Communications Blockset for all data
runs. To compare the two schemes, the BER
performance curves are used. The OFDM transceiver
system was implemented as specified by Fig. 3.
Quadrature amplitude modulation (64-, 128- and
256-QAM) is used.
PARAMETERS            VALUES
Number of carriers    768
Modulation            QAM
Frequency offset      [0, 0.15, 0.30]
No. of OFDM symbols   100
Bits per OFDM symbol  N*log2(M)
Eb/No                 1:15
IFFT size             1024
Fig.5 BER Performance with ICI Cancellation, ε=0.05 for 64-QAM
Fig.6 BER Performance with ICI Cancellation, ε=0.15 and ε=0.3, for 128-QAM
Fig.7 BER Performance with ICI Cancellation, ε=0.15 and ε=0.30, for 256-QAM
S.No.  Method  ε = 0.05  ε = 0.15  ε = 0.30
1      SC      13 dB     12 dB     11 dB
2      EKF     12 dB     13 dB     14 dB
Required SNR for a BER of 10^-6 for QAM
VI. CONCLUSION
In this paper, the performance of OFDM systems
in the presence of frequency offset between the
transmitter and the receiver has been studied in
terms of the Carrier-to-Interference ratio (CIR)
and the bit error rate (BER) performance. Inter-
carrier interference (ICI), which results from the
frequency offset, degrades the performance of the
OFDM system. Two methods were explored in
this paper for the mitigation of ICI: the ICI self-
cancellation (SC) scheme and the extended
Kalman filter (EKF) method for estimation and
cancellation of the frequency offset, and a
comparison has been made between the two
techniques. The choice of
which method to employ depends on the specific
application. For example, self-cancellation does
not require very complex hardware or software for
implementation; however, it is not bandwidth
efficient, as there is a redundancy of 2 for each
carrier. On the other hand, the EKF
method does not reduce bandwidth efficiency, as
the frequency offset can be estimated from the
preamble of the data sequence in each OFDM
frame.
However, its implementation is the more complex
of the two methods. In addition, this method
requires a training sequence to be sent before the
data symbols for estimation of the frequency
offset. It can be adopted for the receiver design for
IEEE 802.11a because this standard specifies
preambles for every OFDM frame. This model
can be easily adapted to a flat-fading channel with
perfect channel estimation. Further work can be
done by performing simulations to investigate the
performance of these ICI cancellation schemes in
multipath fading channels without perfect channel
information at the receiver. In this case, the
multipath fading may hamper the performance of
these ICI cancellation schemes.
REFERENCES
[1] P. Tan, N. C. Beaulieu, “Reduced ICI in
OFDM systems using the better than raised cosine
pulse,” IEEE Commun. Lett., vol. 8, no. 3, pp.
135-137, Mar. 2004.
[2] H. M. Mourad, “Reducing ICI in OFDM
systems using a proposed pulse shape,” Wireless
Person. Commun., vol. 40, pp. 41-48, 2006.
[3] V. Kumbasar and O. Kucur, “ICI reduction
in OFDM systems by using improved sinc power
pulse,” Digital Signal Processing, vol. 17, issue
6, pp. 997-1006, Nov. 2007.
[4] Tiejun (Ronald) Wang, John G. Proakis, and
James R. Zeidler, “Techniques for suppression of
intercarrier interference in OFDM systems,”
Wireless Communications and Networking
Conference, IEEE, vol. 1, pp. 39-44, 2005.
[5]P. H. Moose, “A Technique for Orthogonal
Frequency Division Multiplexing Frequency
Offset Correction,” IEEE Transactions on
Communications, vol. 42, no. 10, 1994
[6]Y. Zhao and S. Häggman, “Inter carrier
interference self-cancellation scheme for OFDM
mobile communication systems,”IEEE
Transactions on Communications, vol. 49, no. 7,
2001
[7] R. E. Ziemer, R. L. Peterson, “Introduction to
Digital Communications”, 2nd edition, Prentice
Hall, 2002.
[8] J. Armstrong, “Analysis of new and existing
methods of reducing inter carrier interference due
to carrier frequency offset in OFDM,” IEEE
Transactions on Communications, vol. 47, no. 3,
pp. 365 – 369., 1999
[9] N. Al-Dhahir and J. M. Cioffi, “Optimum
finite-length equalization for multicarrier
transceivers,” IEEE Transactions on
Communications, vol. 44, no. 1, pp. 56-64,
1996.
[10] “…Systems”, (IJCSIS) International Journal of
Computer Science and Information Security,
vol. 6, no. 3, 2009.
OBJECT DETECTION BASED ON CROSS-CORRELATION
USING PARTICLE SWARM OPTIMIZATION
Sudhakar Singh Yaduvir Singh
Department of Electrical and Instrumentation Engineering
Thapar University, Patiala, Punjab
Abstract- In this paper a novel method for
object detection in images is proposed. The
method is based on image template
matching. The conventional template matching
algorithm based on cross-correlation
requires complex calculation and a large amount of
time for object detection, which makes it difficult to
use in real-time applications. In the
proposed work, an algorithm based on particle swarm
optimization and its variants is used for
the detection of objects in images. This algorithm
reduces the time required
for object detection compared with the conventional
template matching algorithm: it can
detect an object in a smaller number of iterations,
and hence with less time and energy than
conventional template matching. This feature
makes the method suitable for real-time
implementation.
Keywords: object detection, object tracking
and image matching.
I. INTRODUCTION
It is easy for a human, even for a child, to detect
the position of letters, objects and numbers in an
image, but for a computer, solving these types of
problems quickly is a very challenging
task. In the last decades the computer's ability to
perform huge amounts of calculation, and to handle
information flows we never thought possible ten
years ago, has emerged. Despite this, a computer
can extract only a little information from an
image in comparison to a human being. Object
detection is a fundamental component of
artificial intelligence and computer vision.
Interest in pattern recognition is growing fast in
order to deal with the prohibitive amount of
information we encounter in our daily life.
Automation is desperately needed to handle this
information explosion. The way the human brain
filters out useful information is not fully known,
and this skill has not been merged into computer
vision science. This paper proposes to
implement a system that is able to detect objects
in an image faster. Artificial
intelligence is an important topic of current
computer science research. In order to act
intelligently, a machine should be aware of its
environment. Visual information is essential
for humans; therefore, among many different
possible sensors, cameras seem very
important. Automatically analyzing images and
image sequences is the area of research usually
called "computer vision". Image matching is the key
to object detection. Image matching has
a large number of applications, including
navigation, guidance, automatic surveillance,
robot vision, and the mapping sciences. Cross-
correlation and related techniques are
dominantly used in image matching
applications. It is difficult to use the
conventional template matching algorithm
based on cross-correlation in real-time
applications due to the complex
calculation and large time required for object
detection. The shortcomings of this class of
image matching methods have caused a slow-
down in the development of operational
automated correlation systems. In this paper, we
propose a method for object detection. It
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0204-2
consists of three stages: (i) image matching using templates; (ii) object detection; (iii) application of the PSO technique. The proposed PSO-based algorithm gives better results than the conventional algorithm.
II. LITERATURE REVIEW
F. Ackermann [1] proposed an image matching algorithm based on least-squares window matching. Several common object detection and tracking methods, such as point detectors and background subtraction [7], are surveyed in [2]. Color is one of the most widely used features to represent object appearance for detection and tracking [5], and most object detection and tracking methods use pre-specified models for object representation. W. Forstner [3] proposed a feature-based correspondence algorithm for image matching. A. W. Gruen [4] showed that adaptive least-squares correlation is a very potent and flexible technique for all kinds of data matching problems. J. Bala et al. [5] addressed the problem of crafting visual routines for detection tasks. C. F. Olson [6] applied maximum-likelihood template matching in applications such as tracking and stereo matching. Kwan-Ho Lin and Kin-Man Lam [8] presented a new method for locating objects based on valley-field detection and the measurement of fractal dimensions. Yacov Hel-Or [10] proposed a novel approach to pattern matching in which time complexity is reduced by two orders of magnitude compared to traditional approaches. Kun Peng and Liming Chen [9] presented a robust eye detection algorithm for gray intensity images.
III. OBJECT DETECTION
Object detection attempts to determine the existence of a specific object in an image and, if the object is present, determines its location, size, and shape. In computer vision, object detection and tracking is an active research area that has attracted extensive attention from multiple disciplines, with wide applications in fields such as service robots, surveillance systems, public security systems, and virtual reality interfaces. Detection and tracking of moving
objects such as cars and people is of particular concern, especially flexible and robust tracking algorithms for dynamic environments, where lighting conditions may change and occlusions may occur. The general process of object detection consists of two steps. The first step is building a model: according to prior knowledge of the objects of interest, a feature model is built to describe the target object and to separate it from other objects and from the background; since most images are noisy, statistical information is usually adopted to quantify the features. The second step is to find a particular region in the image, called the area of interest (AOI), which either best fits the object model or has the highest similarity with
the model. Many algorithms developed recently
in this area relate to human face detection and
recognition due to its potential applications in
security and surveillance. Yet, generic, reliable,
and fast human face detection was, until very
recently, impossible to achieve in real-time. The
concepts of object detection, object recognition, and object tracking often overlap. Object tracking, in particular, dynamically locates objects by determining their position in each frame. Object detection and recognition have made significant progress in the last few years.
IV. TEMPLATE MATCHING BASED ON CROSS-CORRELATION
Template matching is a popular method for pattern recognition. It is defined as follows.
Definition: Let I be an image of dimension m×n and T be another image of dimension p×q, with p < m and q < n. Template matching is a search method which finds the portion of I of size p×q with which T has the maximum cross-correlation coefficient. The normalized cross-correlation coefficient is defined as:
Y(x, y) = [ Σ_{s,t} P_I(x+s, y+t) · P_T(s, t) ] / √[ Σ_{s,t} P_I(x+s, y+t)² · Σ_{s,t} P_T(s, t)² ]    (1)

where

P_I(x+s, y+t) = I(x+s, y+t) − Ī(x, y)
P_T(s, t) = T(s, t) − T̄

with s ∈ {1, 2, 3, ..., p}, t ∈ {1, 2, 3, ..., q}, x ∈ {1, 2, 3, ..., m−p+1}, and y ∈ {1, 2, 3, ..., n−q+1}. Also

Ī(x, y) = (1/pq) Σ_{s,t} I(x+s, y+t)    (2)

T̄ = (1/pq) Σ_{s,t} T(s, t)    (3)
The value of the cross-correlation coefficient Y ranges over [-1, +1]. A value of +1 indicates that T is completely matched with I at (x, y), and -1 indicates complete disagreement. For template matching, the template T slides over I, and Y is calculated at each coordinate (x, y). After this calculation, the point exhibiting the maximum Y is taken as the match point.
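The search just described can be sketched in code. The following is a minimal brute-force NumPy implementation of Eqs. (1)-(3), given for illustration only (it is not the authors' implementation):

```python
import numpy as np

def ncc_match(image, template):
    """Exhaustive normalized cross-correlation template search.

    Returns the (x, y) offset maximizing Eq. (1) and the peak coefficient.
    """
    m, n = image.shape
    p, q = template.shape
    t_zero = template - template.mean()        # P_T = T - T_bar
    t_norm = np.sqrt((t_zero ** 2).sum())
    best, best_xy = -2.0, (0, 0)
    for x in range(m - p + 1):                 # slide T over every position of I
        for y in range(n - q + 1):
            win = image[x:x + p, y:y + q]
            w_zero = win - win.mean()          # P_I = I - I_bar(x, y)
            denom = np.sqrt((w_zero ** 2).sum()) * t_norm
            if denom == 0:
                continue                       # flat window: Y undefined
            y_coef = (w_zero * t_zero).sum() / denom   # Eq. (1)
            if y_coef > best:
                best, best_xy = y_coef, (x, y)
    return best_xy, best
```

The nested loops make the cost O((m−p)(n−q)·pq) per template, which is exactly the "large time taken" that motivates the PSO-based search later in the paper.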
V. PARTICLE SWARM OPTIMIZATION
The Particle Swarm Optimization (PSO) algorithm is an evolutionary computation technique developed by Kennedy and Eberhart in 1995 [5]. Like other evolutionary techniques, PSO uses a population of potential solutions to explore the search space. In PSO, the population dynamics resemble the movement of a flock of birds searching for food: social sharing of information takes place, and individuals gain from the discoveries and previous experience of all other companions. Each companion (called a particle) in the population (called a swarm) is assumed to "fly" over the search space in order to find promising regions of the landscape. Let particle i of the swarm be represented by the d-dimensional vector xi = (xi1, xi2, ..., xid), and let the best particle of the swarm be denoted by the index g. The best previous position of particle i is recorded and represented as pi = (pi1, pi2, ..., pid), and the position change (velocity) of particle i is Vi = (Vi1, Vi2, ..., Vid). Particles update their velocities and positions by tracking two kinds of "best" value. One is the personal best (pbest), the location of the particle's own highest fitness; the other is the global best (gbest), the location of the best value obtained so far by any particle in the population. Particles update their positions and velocities according to the following equations:
vj(i) = w·vj(i−1) + r1·c1·(pbest(j) − xj(i)) + r2·c2·(gbest − xj(i))    (4)

pj(i) = pj(i−1) + vj(i)    (5)
where vj(i) is the velocity of the jth particle at the ith iteration, pj(i) is the corresponding position, pbest and gbest are the personal best and global best respectively, w is the inertia weight, c1 and c2 are the acceleration parameters, and r1 and r2 are random numbers in [0, 1]. A number of scientists have
created computer simulations of various
interpretations of the movement of organisms in
a bird flock or fish school. Notably, Reynolds, and Heppner and Grenander, presented simulations of bird flocking. It became obvious during the development of the particle swarm concept that the behavior of the population of agents is more like a swarm than a flock. The term swarm has a basis in the literature; in particular, the authors use the term in accordance with a paper by Millonas, who developed his models for applications in artificial life and articulated five basic
principles of swarm intelligence. First is the
proximity principle: the population should be
able to carry out simple space and time
computations. Second is the quality principle:
the population should be able to respond to
quality factors in the environment. Third is the
principle of diverse response: the population
should not commit its activities along excessively narrow channels. Fourth is the principle of stability: the population should not change its mode of behaviour every time the environment changes. Fifth is the principle of adaptability: the population must be able to change its behaviour mode when it is worth the computational price. Note that principles four and five are opposite sides of the same coin. The particle swarm optimization concept and paradigm presented here adhere to all five principles. Basic to the paradigm are n-dimensional space calculations carried out over a series of time steps, with the population responding to the quality factors pbest and gbest. Further, Reeves discusses particle systems consisting of clouds of primitive particles as models of diffuse objects such as clouds, fire, and smoke. Thus the label the authors have chosen for the optimization concept is particle swarm.
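The update rules of Eqs. (4) and (5) can be sketched as a generic minimizer. The parameter values below (w = 0.7, c1 = c2 = 1.5, 30 particles) are common defaults, not values reported in the paper:

```python
import numpy as np

def pso(fitness, dim, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimal PSO following the velocity/position updates of Eqs. (4)-(5);
    `fitness` is minimized."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))              # particle velocities
    pbest = x.copy()                              # personal best positions
    pbest_f = np.array([fitness(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()            # global best position
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # Eq. (4)
        x = x + v                                               # Eq. (5)
        f = np.array([fitness(p) for p in x])
        improved = f < pbest_f                    # update personal bests
        pbest[improved] = x[improved]
        pbest_f[improved] = f[improved]
        g = pbest[pbest_f.argmin()].copy()        # update global best
    return g, float(pbest_f.min())

# Toy usage: minimize the 2-D sphere function; the swarm converges to the origin.
best, val = pso(lambda p: float((p ** 2).sum()), dim=2)
```

For the object detection problem of this paper, the fitness of a particle would instead be the negative of the cross-correlation coefficient Y of Eq. (1) at the particle's (x, y) position, so that maximizing the match minimizes the fitness.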
Figure 1: Flow chart of PSO
VI. EXPERIMENTAL RESULTS AND DISCUSSION
The particle swarm optimization algorithm is applied to the test images and different templates to solve the object detection problem. Each image is tested more than 15 times.
Figure 2: Test image-1
Figure (2.1) (2.2) (2.3)
(2.1) Left eye taken as a template (T1.1)
(2.2) Right eye taken as template (T1.2)
(2.3) Nose taken as template (T1.3)
Figure 3: Test image-2
Figure (3.1) (3.2) (3.3)
(3.1) Right eye taken as a template (T2.1)
(3.2) Left eye taken as template (T2.2)
(3.3) Nose taken as template (T2.3)
Figure 4: Test image-3
Figure (4.1) (4.2) (4.3)
(4.1) Right eye taken as a template (T3.1)
(4.2) Left eye taken as template (T3.2)
(4.3) Nose taken as template (T3.3)
Table 1 below shows a comparison of the PSO-based and conventional algorithms.

Image      Template        Iterations   Conventional time (s)   PSO time (s)   Time reduced by PSO (%)
Image (1)  Template 1.1    100          57.38                   3.90           93.21
           Template 1.2    100          110.14                  4.14           96.24
           Template 1.3    100          130.56                  4.36           96.66
Image (2)  Template 2.1    100          111.70                  4.46           96.01
           Template 2.2    100          120.60                  4.53           96.26
           Template 2.3    100          143.34                  4.65           96.75
Image (3)  Template 3.1    100          59.47                   3.72           93.75
           Template 3.2    100          52.43                   3.67           93.01
           Template 3.3    100          74.37                   3.89           94.77

Table 1: Comparison of conventional and PSO-based algorithms
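The last column of Table 1 follows directly from the two timing columns. A quick sanity check for Template 1.1 of Image (1) (the small difference from the published 93.21% presumably comes from rounding of the reported times):

```python
# Percentage time reduction, as reported in Table 1:
#   reduction = (t_conventional - t_pso) / t_conventional * 100
t_conv, t_pso = 57.38, 3.90           # Image (1), Template 1.1
reduction = (t_conv - t_pso) / t_conv * 100
print(round(reduction, 2))            # ~93.2, matching Table 1's 93.21%
```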
With the conventional algorithm, detecting the object in test image 1 by matching template 1.1 takes 57.38 s, while the proposed algorithm takes 3.90 s, a time reduction of 93.21%. For template 1.2 the conventional algorithm takes 110.14 s against 4.14 s for the proposed algorithm, a reduction of 96.24%; for template 1.3, 130.56 s against 4.36 s, a reduction of 96.66%. In test image 2, template 2.1 takes 111.70 s conventionally against 4.46 s (a 96.01% reduction), template 2.2 takes 120.60 s against 4.53 s (96.26%), and template 2.3 takes 143.34 s against 4.65 s (96.75%). In test image 3, template 3.1 takes 59.47 s against 3.72 s (93.75%), template 3.2 takes 52.43 s against 3.67 s (93.01%), and template 3.3 takes 74.37 s against 3.89 s (94.77%). Thus PSO is successfully employed to solve the object detection problem. The results show that the proposed method obtains a high-quality solution efficiently; here the time taken is the efficiency measure. It is clear from the results that the proposed PSO-based method avoids the main shortcoming (the large time taken) of the old template matching algorithm and provides a higher quality solution with better computational efficiency.
VII. CONCLUSIONS
When the sample test images are processed by the PSO-based algorithm to detect the position of an object, it is found that the algorithm detects the object's position in far less time than the conventional template matching algorithm. The PSO-based algorithm has superior features, including high-quality solutions, stable convergence characteristics, and good computational efficiency. Future work on the proposed approach is to detect the exact location of the object by segmentation, finding the area and perimeter of the object.
REFERENCES
1. Ackermann, F. 1984. “Digital image
correlation: Performance and potential
application in photogrammetry”.
Photogrammetric Record 11
2. T.Peli, “An algorithm for recognition and
localization of rotated and scaled objects”,
Proceedings of the IEEE 69, 1981 483–485.
3. Foerstner,W.,“Quality assessment of object
location and point transfer using digital
image correlation techniques. International
Archives of Photogrammetry and Remote
Sensing” vol. XXV, A3a, Commission III,
Rio de Janeiro, 1984.
4. A. W. Gruen, "Adaptive least squares correlation: A powerful image matching technique," South African Journal of Photogrammetry, Remote Sensing & Cartography, 14(3):175-187, 1985.
5. J. Bala, K. DeJong, J. Huang, H. Vafaie, H.
Wechsler, “Visual routine for eye detection
using hybrid genetic architectures,”
International Conference on Pattern
Recognition, vol. 3, pp. 606-610, 1996.
6. C.F. Olson, “Maximum-likelihood template
matching,” IEEE Conference on Computer
Vision and Pattern Recognition, vol. 2, pp.
52-57, 2000.
7. Feng Zhao, Qingming Huang, Wen Gao,
“Methods of image matching by normalized
cross-correlation”.
8. Kwan-Ho Lin, Kin-Man Lam and Wan-Chi
Siu, "Locating the Eye in Human Face
Images Using Fractal Dimensions," IEE
Proceedings - Vision, Image and Signal
Processing, vol. 148, no. 6, pp. 413-421,
2001.
9. Kun Peng, Liming Chen, Su Ruan, Georgy
Kukharev, “A Robust and Efficient
Algorithm for Eye Detection on Gray
Intensity Face,” Lecture Notes in Computer
Science – Pattern Recognition and Image
Analysis, pp. 302-308, 2005.
10. Yacov Hel-Or, Hagit Hel-Or, "Real-Time Pattern Matching Using Projection Kernels," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 9, September 2005.
11. Zeng Yan et al., "A New Background Subtraction Method for On-road Traffic," Journal of Image and Graphics, vol. 13, no. 3, pp. 593-599, March 2008.
12. Wei-feng Liu et al., "A Target Detection Algorithm Based on Histogram Feature and Particle Swarm Optimization," 2009 Fifth ICONC.
"Multisegmentation through wavelets: Comparing the efficacy of Daubechies vs. Coiflets"
Madhur Srivastava,Member, IEEE, Yashwant Yashu, Member, IETE, Satish K. Singh,Member, IEEE,
Prasanta K. Panigrahi
Abstract--- In this paper, we carry out a comparative study of the efficacy of wavelets belonging to the Daubechies and Coiflet families in achieving image segmentation through a fast statistical algorithm. The fact that wavelets of the Daubechies family optimally capture polynomial trends, while those of the Coiflet family satisfy the mini-max condition, makes this comparison interesting. In the context of the present algorithm, it is found that the performance of Coiflet wavelets is better compared to Daubechies wavelets.
Keywords: Peak Signal to Noise Ratio,
Segmentation, Standard deviation, Thresholding,
Weighted mean.
Madhur Srivastava is a final year B.Tech student in the Department of Electronics and Communication Engineering at Jaypee University of Engineering and Technology, Guna, India; e-mail: [email protected]
Yashwant Yashu is a final year B.Tech student in the Department of Electronics and Communication Engineering at Jaypee University of Engineering and Technology, Guna, India; e-mail: [email protected]
Satish K. Singh is Assistant Professor in the Department of Electronics and Communication Engineering at Jaypee University of Engineering and Technology, Guna, India; e-mail: [email protected]
Prasanta K. Panigrahi is Professor in the Department of Physics at Indian Institute of Science Education and Research, Kolkata, India (Phone No. +91-9748918201); e-mail:
I. INTRODUCTION
Thresholding of an image is done to reduce storage space, increase processing speed, and simplify manipulation, since fewer levels are involved compared to the 256 levels of a normal image. Thresholding is primarily of two types - bi-level and multi-level [1].
Bi-level thresholding produces two output values - one below the threshold and another above it - while in multi-level thresholding, different values are assigned to different ranges between threshold levels. Thresholding techniques have been categorized on the basis of histogram shape, clustering, entropy, and object attributes [2].
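A small illustration of the two schemes (the threshold values and output levels below are arbitrary, not taken from the paper):

```python
import numpy as np

# Hypothetical 8-bit pixel values.
pixels = np.array([12, 80, 130, 200, 250])

# Bi-level: one threshold, two output values.
bi = np.where(pixels < 128, 0, 255)

# Multi-level: each range between thresholds maps to one representative level.
thresholds = [64, 128, 192]              # 3 thresholds -> 4 output ranges
levels = np.array([32, 96, 160, 224])    # value assigned to each range
multi = levels[np.digitize(pixels, thresholds)]
```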
The wavelet transform is a very significant tool in the field of image processing. The wavelet transform of an image comprises four components: approximation, horizontal, vertical, and diagonal. The transform is applied recursively to the approximation component for further decomposition of the image, until only one coefficient is left in the approximation part [3-5].
As is well known, wavelets of the Daubechies family are useful in extracting polynomial trends, their coefficients satisfying the vanishing-moment condition

∫ x^n ψ_{j,k}(x) dx = 0    (1)

for n ≤ N, where the wavelets are given by

ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k)    (2)

and N depends on the particular Daubechies basis. This property makes them well suited for isolating smooth polynomial features in a given image. The Coiflet coefficients, on the other hand, satisfy the mini-max condition, i.e., the maximum error in extracting local features is minimized in this basis set. Hence, it is worth comparing the behavior of the low-pass coefficients of the corresponding wavelets from the perspective of the proposed algorithm.

Fig. 1. Block diagram of the approach used.
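The vanishing-moment condition has a discrete counterpart: the high-pass filter of a Daubechies wavelet with N vanishing moments annihilates sampled polynomials of degree below N. A quick numerical check using the closed-form db2 (4-tap) filter, for which N = 2:

```python
import numpy as np

# Closed-form Daubechies db2 low-pass filter h and the derived
# quadrature-mirror high-pass filter g.
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
g = np.array([h[3], -h[2], h[1], -h[0]])       # g[k] = (-1)^k h[3-k]

# Discrete moments sum_k k^n g[k] vanish for n = 0 and n = 1 (N = 2).
k = np.arange(4)
moments = [float(abs((k ** n * g).sum())) for n in range(2)]
print(moments)   # both ~0 (machine precision)
```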
II. APPROACH
The thresholding applied in the wavelet domain takes into account that the majority of coefficients lie near zero, while the few coefficients representing large differences sit at the extreme ends of the histogram of each horizontal, vertical, and diagonal component. The coefficients with large differences carry the most significant information of the image. Hence, the procedure provides variable-size segmentation, with a bigger block size around the mean and smaller blocks at the ends of the histogram [6]. The methodology, shown in Fig. 1, is as follows:
1. Segregate the color image into its Red, Green
and Blue components.
2. Take the 2-D wavelet transform of each component at any level. For each horizontal, vertical, and diagonal part of every red, green, and blue component, threshold the coefficients using the weighted mean and variance of each sub-band of the coefficient histogram, with a broader block size around the mean and a finer block size at the ends of the histogram.
3. Take inverse wavelet transform for each
component.
4. Reconstruct the image by concatenating Red,
Green and Blue components.
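As a concrete illustration of steps 2-3, the sketch below uses a one-level 2-D Haar transform as a simple stand-in for the paper's wavelet, and quantization bins at mean ± 0.5σ and ± 1.5σ as a simplified stand-in for the authors' weighted-mean/variance thresholding (both choices are illustrative assumptions):

```python
import numpy as np

def haar2(x):
    """One-level 2-D Haar transform: approximation plus H, V, D details."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return ((a + b + c + d) / 2, (a - b + c - d) / 2,   # approx, horizontal
            (a + b - c - d) / 2, (a - b - c + d) / 2)   # vertical, diagonal

def ihaar2(ll, lh, hl, hh):
    """Inverse of haar2 (perfect reconstruction)."""
    a = (ll + lh + hl + hh) / 2
    b = (ll - lh + hl - hh) / 2
    c = (ll + lh - hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    out[0::2, 0::2], out[0::2, 1::2] = a, b
    out[1::2, 0::2], out[1::2, 1::2] = c, d
    return out

def quantize(band):
    """Coarse bins near the mean, finer bins toward the histogram tails."""
    mu, sd = band.mean(), band.std()
    edges = np.array([mu - 1.5 * sd, mu - 0.5 * sd, mu + 0.5 * sd, mu + 1.5 * sd])
    centers = np.array([mu - 2 * sd, mu - sd, mu, mu + sd, mu + 2 * sd])
    return centers[np.digitize(band, edges)]

def segment_channel(ch):
    """Steps 2-3 for one color channel: transform, threshold details, invert."""
    ll, lh, hl, hh = haar2(ch)
    return ihaar2(ll, quantize(lh), quantize(hl), quantize(hh))
```

Running `segment_channel` on each of the R, G, and B components and stacking the results corresponds to steps 1 and 4.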
III. RESULTS AND OBSERVATIONS
The proposed algorithm is tested on a variety of standard images using Daubechies and Coiflet wavelets. The PSNR and size of the reconstructed image at different threshold levels are shown in Table 1; the numbers of threshold levels taken are 3, 5, and 7. Figure 2 shows the plot of PSNR versus threshold level for the Lenna image.
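The PSNR figures in Table 1 follow the usual definition for 8-bit images (a sketch, not the authors' code):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB: 10*log10(peak^2 / MSE) between original and reconstruction."""
    diff = np.asarray(original, float) - np.asarray(reconstructed, float)
    mse = np.mean(diff ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(peak ** 2 / mse)
```

Higher PSNR at a comparable reconstructed-image size is the criterion by which the Coiflet results below are judged better than the Daubechies ones.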
Table 1. PSNR and size of reconstructed images using different Daubechies and Coiflet wavelets.
Image
Name
Threshold
Level
Wavelet Name
dB2 dB4 dB6 dB8 coif1 coif2 coif3 coif4 coif5
Lenna 3 PSNR(dB) 34.45 35.19 35.52 35.71 34.50 35.23 35.48 35.61 35.69
Size(kB) 36.2 36.5 36.3 36.2 36.4 36.2 36.4 36.3 36.3
5 PSNR(dB) 36.41 37.13 37.41 37.53 36.5 37.19 37.42 37.54 37.62
Size(kB) 36.2 36.5 36.3 36.3 36.3 36.3 36.4 36.4 36.4
7 PSNR(dB) 36.79 37.5 37.74 37.88 36.84 37.53 37.76 37.89 37.97
Size(kB) 36.2 36.5 36.3 36.3 36.3 36.3 36.4 36.4 36.4
Baboon 3 PSNR(dB) 25.92 26.31 26.29 26.19 25.94 26.20 26.29 26.33 26.36
Size(kB) 74.4 74.2 74.2 74.3 74.4 74.4 74.3 74.3 74.2
5 PSNR(dB) 27.06 27.56 27.44 27.40 27.13 27.41 27.50 27.55 27.58
Size(kB) 74.4 74.1 74.2 74.2 74.3 74.2 74.2 74.2 74.1
3
7 PSNR(dB) 27.18 27.70 27.57 27.53 27.27 27.53 27.62 27.67 27.71
Size(kB) 74.3 74.1 74.1 74.1 74.2 74.2 74.1 74.2 74.1
Pepper 3 PSNR(dB) 30.63 33.87 31.61 31.25 31.48 31.63 31.70 31.75 31.77
Size(kB) 39.9 39.8 40.3 40.4 40.1 40.3 40.3 40.2 40.2
5 PSNR(dB) 34.12 35.83 34.61 34.30 33.98 34.41 34.55 34.60 34.62
Size(kB) 39.5 39.7 39.6 39.6 39.6 39.6 39.7 39.7 39.7
7 PSNR(dB) 34.56 36.26 34.92 34.58 34.25 34.73 34.88 34.93 34.95
Size(kB) 39.5 39.8 39.5 39.6 39.6 39.6 39.6 39.6 39.7
Fig. 2 Plot of PSNR vs. threshold levels for the Lenna image, thresholded using different wavelets
IV. CONCLUSION
Thresholding performed by the proposed algorithm gives better PSNR using Coiflet wavelets than Daubechies wavelets, while keeping the size of the reconstructed image almost the same. This is attributed to the property of Coiflet wavelets satisfying the mini-max condition. Hence, it can be concluded that Coiflet wavelets provide the most desirable results for multilevel thresholding of an image in the wavelet domain. In future work, the proposed algorithm using Coiflet wavelets can be applied to image segmentation, object separation, image compression, and image retrieval, because only a few coefficients of the horizontal, vertical, and diagonal components represent the entire variation of the image without deteriorating its quality.
REFERENCES
1. R.C. Gonzalez, R.E. Woods, "Digital Image Processing," 2nd ed., Prentice Hall, 2001.
2. M. Sezgin, B. Sankur, Survey over image
thresholding techniques and quantitative
performance evaluation, Journal of Electronic
Imaging, 13(1) (2004) 146-165.
3. S.G. Mallat, A Wavelet Tour of Signal
Processing. New York: Academic (1999).
4. I. Daubechies, Ten Lectures on Wavelets, Vol. 61 of CBMS-NSF Regional Conference Series in Applied Mathematics, Philadelphia, PA: SIAM, 1992.
5. J.S. Walker,” A Primer on Wavelets and Their
Scientific Applications,” 2nd ed. Chapman &
Hall/CRC Press, Boca Raton, FL, 2008.
6. M. Srivastava, P. Katiyar, Y. Yashu, S.K. Singh,
P.K. Panigrahi,” A Fast Statistical Method for
Multilevel Thresholding in Wavelet Domain,”
unpublished.
Analysis of Signals in Fractional Fourier Domain
Ajmer Singh, Student of Lovely Professional University(LPU)-India, Nikesh Bajaj, Asst. Prof., ECE Dept.(LPU) [email protected], [email protected]
Abstract- The Fractional Fourier Transform (FRFT) is a generalization of the classic Fourier Transform (FT). When dealing with time-varying signals, the FRFT is an important tool for analyzing them. This paper contains results for the variation of basic signals - the rectangular pulse, sine wave, and Gaussian signal - in the Fractional Fourier Domain (FRFD). The correlation of the FRFD signal with the time domain (TD) signal, and the correlation of the FRFT at the α-domain with the FRFT at the (α−1)-domain, are also shown and discussed. A graphical demonstration of the scaling property of the FRFT is also given.

Index Terms— FRFT, FRFD, signal processing, α-domain, FRFT analysis, FRFT scaling property, α-domain correlation
I. INTRODUCTION
The FT is one of the most frequently used tools in signal analysis. However, the FT is not very suitable for signals whose frequency changes with time, because of its assumption that the signal is stationary. A generalization of the FT was proposed by V. Namias [1] and is known as the FRFT. The FRFT can be described as performing a spectral rotation of the signal in the time-frequency plane as the parameter α varies. In recent years, the FRFT has been applied in many areas, such as the solution of differential equations [2], quantum mechanics [1], optical signal processing [6], time-variant filtering and multiplexing [3]-[5], and swept-frequency filters [6]. Several properties of the FRFT in signal analysis are summarized in [6].

This paper is organized as follows. Section II covers the basic concept of the FRFT and some of its properties. Section III analyzes different signals - the rectangular pulse, sine wave, and Gaussian signal - and examines correlation results for these signals in the FRFD. Section IV concludes the paper.
II. BASIC CONCEPT OF FRFT
The FRFT with angle parameter α of a signal f(t) is defined as

F_α(u) = √((1 − j·cot α)/2π) ∫_{−∞}^{∞} f(t) exp( j·((t² + u²)/2)·cot α − j·u·t·csc α ) dt,   α ≠ kπ
F_α(u) = f(u),   α = 2kπ
F_α(u) = f(−u),   α = (2k + 1)π

F_α(u) is called the α-order FRFT of the signal f(t), where α = Aπ/2 and A is a real number called the order of the FRFT. A lies in the interval [-2, 2] and can be extended to any real number according to A + 4k = A, where k is any integer [..., -3, -2, -1, 0, 1, 2, 3, ...], and A can take any fractional value in [-2, 2].
Some basic properties of the FRFT are:
• Linearity.
• Zero rotation / time domain. When A = 0 or 4 (α = 0 or 2π), the FRFT operator corresponds to the identity operator: F_0(u) = f(t), where f(t) is the time domain signal.
• The FT is a special case of the FRFT. When A = 1 (α = π/2), the FRFT operator corresponds to the FT: F_{π/2}(u) = F(u), where F denotes the Fourier Transform of the time domain signal f(t).
• Flipped operation / time inversion. When A = 2 (α = π), the FRFT operator corresponds to the flipping operator: F_π(u) = f(−t), the flipped version of the input signal f(t).
• Inverse Fourier domain. When A = 3 (α = 3π/2), the FRFT operator corresponds to the inverse Fourier domain: F_{3π/2}(u) = F(−u), the flipped version of the Fourier Transform of f(t).

These properties of the FRFT are easily understood from Figure 1.
Figure 1: Time-frequency plane for the FRFT.

In this paper, we use the digital computation method of the FRFT given in [7].
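The zero-rotation and flipping properties can be checked numerically for the ordinary DFT (the α = π/2 special case): applying the transform twice returns the circularly flipped signal, mirroring the α = π flipped-operator property above. A small NumPy check:

```python
import numpy as np

# For the DFT, F{F{f}}[n] = N * f[(-n) mod N]: two quarter rotations of the
# time-frequency plane add up to the alpha = pi "flipped operator".
rng = np.random.default_rng(1)
x = rng.random(16)
N = len(x)
x2 = np.fft.fft(np.fft.fft(x)).real / N      # two successive DFTs, rescaled
flipped = x[(-np.arange(N)) % N]             # f[(-n) mod N]
print(np.allclose(x2, flipped))              # True
```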
III. ANALYSIS OF DIFFERENT SIGNALS
We always store our information or data in some type of memory; that set of information or data is known as a signal. There are some basic signals, like the rectangular pulse, sine wave, and Gaussian signal, which are commonly used in signal processing. Different transform techniques are used to analyze the frequency spectra of signals, because the frequency spectrum tells more about a signal's behavior than the time domain representation.

The FRFT, however, describes the signal between the time and frequency domains through the operator F_α(u): α = 0 gives the time domain representation and α = π/2 gives the frequency domain representation, while 0 < α < π/2 gives the intermediate domains known as α-domains. These domains do not give exact information about the time or frequency components, but give mixed information about both.

So, in this section, we discuss the variation of some signals with the variation of the α-domain.
A. Analysis of the Rectangular pulse in the FRFD

The rectangular pulse (also known as the rectangle function, rectangular function, gate function, or unit pulse) is defined as:

rect(t) = 1,   |t| ≤ 1/2
rect(t) = 0,   |t| > 1/2

and the FT of the rectangular function is:

FT{rect(t)} = sin(u/2) / (u/2)
Now, let us discuss an example of the rectangular pulse in the FRFD. Figure 2 shows the results for six different FRFD values: figure 2(a), for α = 0, shows the rectangular pulse in the time domain; figure 2(f), for α = π/2, shows the spectrum of the rectangular pulse, which is a sinc function; and the remaining four panels show the FRFT of the rectangular pulse at α = π/10, π/5, 3π/10, and 2π/5.

(a) A=0 / α=0    (b) A=0.2 / α=π/10
(c) A=0.4 / α=π/5    (d) A=0.6 / α=3π/10
(e) A=0.8 / α=2π/5    (f) A=1 / α=π/2

Figure 2: FRFT of the rectangular pulse for different values of the angle α (order A). Solid line: real part; dashed line: imaginary part.

The two FRFDs of the rectangular pulse at α = 0 and α = π/2 are the ordinary time and frequency domains, respectively. Looking at figures 2(a) to 2(f), anyone can easily see how a rectangular pulse becomes a sinc function in the frequency domain, without any mathematical expression, and how much these domains are correlated with each other; but the figures do not give the actual value of the correlation coefficient. To analyze this, figure 3 shows two graphs: the first gives the normalized correlation of the α-domain signal with the time domain signal, and the second gives the normalized correlation of the α-domain with the (α−1)-domain. For better resolution we take 90 domains, at 90 different values of A between 0 < A < 1.

In figures 3(a) and 3(b), at α = 0 the correlation coefficient takes its maximum value of 1, confirming that the FRFT at α = 0 gives the actual time domain signal (no rotation). With a small change of 1° (one degree) in α, the correlation coefficient drops, yet the signal remains correlated with the time domain signal up to about 95%, and so on. In figure 3(b) we can see that for 1° < α < 45° the α-domain signal is highly correlated with the previous α-domain, with a similar result for 45° < α < 90°.
(a)
(b)
Figure 3: Correlation results for rectangular pulse.
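The quantity plotted on the y-axis of Figure 3 can be sketched as the peak normalized cross-correlation between two domain signals; this is an assumption, since the paper does not spell out its exact normalization:

```python
import numpy as np

def max_norm_corr(a, b):
    """Peak of the normalized cross-correlation between two equal-length
    signals; by Cauchy-Schwarz the result lies in [0, 1]."""
    a = (a - a.mean()) / (np.linalg.norm(a - a.mean()) + 1e-12)
    b = (b - b.mean()) / (np.linalg.norm(b - b.mean()) + 1e-12)
    return float(np.abs(np.correlate(a, b, mode='full')).max())
```

With the magnitude profiles of two FRFDs as inputs, this yields one point of the curves in Figure 3; identical domains (e.g. the α = 0 domain against the time domain signal) give the value 1.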
B. Analysis of the Sine wave in the FRFD

The sine wave, or sinusoid, is a mathematical function that describes a smooth repetitive oscillation. It occurs often in pure mathematics, as well as in physics, signal processing,
electrical engineering, and many other fields. Its most basic form as a function of time t is defined as:

x(t) = M sin(2πft + θ)

where M is the amplitude of the sine wave, f is the frequency, t is time, and θ is the phase, which specifies where in its cycle the oscillation begins at t = 0.
Now let us discuss the results for the sine wave in the FRFD. In figure 4 we show the results for six values of α: figure 4(a) for α = 0, which shows the sine wave in the time domain, and figure 4(f) for α = π/2, which shows the spectrum of the sine wave, an impulse function. The remaining four domains show the FRFT of the sine wave at α = π/10, π/5, 3π/10, and 2π/5.
As with the results discussed in section 3(A), figure 4 shows six α-domains for the sine wave, of which two are identical to the ordinary time domain and frequency domain, shown in figures 4(a) and 4(f) respectively. The remaining figures 4(b), 4(c), 4(d) and 4(e) show the FRFT of the sine wave at different values of α. The correlation results for the sine wave in the α-domain with the TD signal, and with the (α-1)-domain signal, are shown in figures 5(a) and 5(b) respectively.
(a) A=0/α=0 (b) A=0.2/α=π/10
(c) A=0.4/α=π/5 (d) A=0.6/α=3π/10
(e) A=0.8/α=2π/5 (f) A=1/α=π/2
Figure 4: FRFT of sine wave for different values of angle α/A. Solid line: real part. Dashed line: imaginary part.
In figure 5(a) it is clear that for 1° < α < 10° the α-domain signal for the sine wave is somewhat correlated with the TD signal. But for 10° < α < 90° the α-domain signal is uncorrelated with the TD signal; in these domains the correlation coefficient tends to zero.
Figure 5(b) shows the correlation results for the α-domain against the (α-1)-domain. For 1° < α < 90° these domains are equally correlated with each other, yet only weakly correlated with the TD signal.
(a)
(b)
Figure 5: Correlation results for Sine wave in α-domain.
C. Analysis of Gaussian signal in FRFD
A Gaussian signal has a bell-shaped curve. Gaussian tuning curves are extensively used because their analytical expression can be easily manipulated in mathematical derivations. Mathematically, a Gaussian signal is defined as:

x(t) = e^(-t²)
Having discussed two types of signal, the rectangular pulse and the sine wave, in sections 3(A) and 3(B), the third point of interest is the Gaussian signal, because Gaussian functions are widely used in statistics, where they describe normal distributions, in signal processing, where they define Gaussian filters, and in many other applications.
Finally, we compute the FRFT of a Gaussian signal to analyze it in the FRFDs. In figure 6 we show six different FRFDs for the Gaussian signal, of which two domains are again identical to the TD and FD, and the remaining four are intermediate domains between the TD and FD.
For a Gaussian signal the Fourier transform is again a Gaussian signal. Looking from 6(a) to 6(f), the variation from TD to FD is easy to follow. Our point of interest is how the FRFD signals are correlated with each other. For this, figure 7 has two plots, which show the correlation of the α-domain signal with the TD signal and with the (α-1)-domain signal in figures 7(a) and 7(b) respectively.
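The self-similarity of the Gaussian under the Fourier transform is easy to verify numerically. The sketch below uses the normalized Gaussian e^(-πt²), which is exactly its own Fourier transform (the e^(-t²) form above transforms to a Gaussian of a different width); the sampling step is an arbitrary choice.

```python
import numpy as np

dt = 0.05
t = np.arange(-512, 512) * dt                 # 1024 samples, t = 0 at index 512
x = np.exp(-np.pi * t ** 2)                   # Gaussian, its own Fourier transform

# Approximate the continuous Fourier transform with an FFT:
# ifftshift centers t = 0 at index 0, fftshift reorders the output,
# and the factor dt scales the Riemann sum.
X = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(x))) * dt
f = np.fft.fftshift(np.fft.fftfreq(len(t), dt))

# The magnitude spectrum matches exp(-pi f^2) to numerical precision.
err = np.max(np.abs(np.abs(X) - np.exp(-np.pi * f ** 2)))
```
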
(a) A=0/α=0 (b) A=0.2/α=π/10
(c) A=0.4/α=π/5 (d) A=0.6/α=3π/10
(e) A=0.8/α=2π/5 (f) A=1/α=π/2
Figure 6: FRFT of Gaussian signal for different values of angle α/A. Solid line: real part. Dashed line: imaginary part.
(a)
(b)
Figure 7: Correlation results for Gaussian signal in α-domain.
Analyzing these three signals in the FRFDs, it is clear that the α-domain signal is highly correlated with the (α-1)-domain, as can be seen from figures 3(b), 5(b) and 7(b). These figures show that over the interval 1° < α < 90° adjacent domains are similar to each other. And looking at figures 2(b) to 2(e), 4(b) to 4(e) and 6(b) to 6(e), we can see that each FRFD signal is just a scaled version of the previous FRFD.
IV. CONCLUSION
We have discussed the FRFT concept and some of its properties, examined the behavior of three different signals in the FRFD, and presented these signals in the FRFD. The correlation of the α-domain signal with the TD signal and with the (α-1)-domain signal has been discussed, showing that each α-domain signal is just a scaled version of the previous α-domain signal. This graphically demonstrates the scaling property of the FRFT discussed in [6].
The work presented in this paper is useful for further research, and the graphical demonstration of the scaling property of the FRFT helps in understanding how the FRFT transforms a time-domain signal into a frequency-domain signal.
REFERENCES
[1] V. Namias, “The fractional order Fourier transform and its application to quantum mechanics,” J. Inst. Math. Applicat., vol. 25, pp. 241–265, 1980.
[2] A. C. McBride and F. H. Kerr, “On Namias’ fractional Fourier transforms,” IMA J. Appl. Math., vol. 39, pp. 159–175, 1987.
[3] H. M. Ozaktas, B. Barshan, D. Mendlovic, and L. Onural, “Convolution, filtering, and multiplexing in fractional Fourier domains and their relationship to chirp and wavelet transforms,” J. Opt. Soc. Amer. A, vol. 11, pp. 547–559, Feb. 1994.
[4] R. G. Dorsch, A. W. Lohmann, Y. Bitran, and D. Mendlovic, “Chirp filtering in the fractional Fourier domain,” Appl. Opt., vol. 33, pp. 7599–7602, 1994.
[5] A. W. Lohmann and B. H. Soffer, “Relationships between the Radon–Wigner and fractional Fourier transforms,” J. Opt. Soc. Amer. A, vol. 11, pp. 1798–1801, June 1994.
[6] L. B. Almeida, “The fractional Fourier transform and time-frequency representation,” IEEE Trans. Signal Processing, vol. 42, pp. 3084–3091, Nov. 1994.
[7] Haldun M. Ozaktas, Orhan Arikan, M. A. Kutay and G. Bozdag, “Digital Computation of the Fractional Fourier Transform”, IEEE Trans. Signal Processing vol. 44, pp. 2141-2150, Sept. 1996.
Ajmer Singh (M'22) was born in Punjab, India. He is pursuing a master's degree in signal processing at Lovely Professional University, Punjab, India (2011), and is currently doing his dissertation under the supervision of Mr. Nikesh Bajaj, Assistant Professor in the Department of Electronics. His research interests include various aspects of FRFD filter design.
Nikesh Bajaj received his bachelor's degree in Electronics & Telecommunication from the Institute of Electronics and Telecommunication Engineers, and his master's degree in Communication & Information Systems from Aligarh Muslim University, India. He is now working at LPU as an Assistant Professor in the Department of ECE. His research interests include cryptography, cryptanalysis, and signal & image processing.
Parzen-Cos6 (πt) combinational window family based QMF bank
Narendra Singh (*)
and Rajiv Saxena,
Jaypee University of Engineering and Technology, Raghogarh, Guna (MP)
(*) Corresponding Author: [email protected] ; [email protected]
ABSTRACT
A new approach for the design of prototype
FIR filter of two-channel Quadrature Mirror Filter
(QMF) bank is introduced. Three variable windows,
viz., Blackman window, Kaiser window, and
Parzen-cos6 (πt) (PC6) window are used to design
prototype filters. The design equations of these
window functions based filter banks are also given
in this article. Reconstruction error, which is used
as an objective function, is minimized by optimizing
the cutoff frequency of designed prototype filters.
The Gradient based iterative optimization
algorithm is used. The performances of filter banks
designed with these window functions are compared
on the basis of reconstruction error. The
combinational window, PC6 window provides the
QMF bank with better reconstruction error.
Keywords: QMF, Filter Bank, Combinational
Window.
1. INTRODUCTION
Window functions are widely used in digital signal processing for applications in signal analysis and estimation, digital filter design, and speech processing. Digital filter banks are used in a number
of communication applications. The theory and
design of QMF bank was first introduced by
Johnston [1]. These filter banks find wide
applications in many signal processing fields such
as trans-multiplexers [2]-[3], equalization of
wireless communication channel [4], sub-band
coding of speech and image signals [5]-[8], sub-
band acoustic echo cancellation [9]-[12].
In a QMF bank the input signal x(n) is split into two sub-band signals of equal bandwidth by the low-pass and high-pass analysis filters H0(z) and H1(z) respectively. These sub-band signals are down-sampled by a factor of two to reduce processing complexity. At the output, the corresponding synthesis bank applies two-fold interpolation to both sub-band signals, followed by the synthesis filters G0(z) and G1(z). The outputs of the synthesis filters are combined to obtain the reconstructed signal y(n). This reconstructed output is not a perfect replica of the input signal x(n), due to three types of error: aliasing error, amplitude error, and phase error [12]-[13]. Since the inception of QMF banks, most researchers have concentrated on eliminating or minimizing these errors to obtain near-perfect reconstruction (NPR). In several design methods [14]-[18] aliasing and phase distortion have been eliminated completely by deriving all the analysis and synthesis filters from a single low-pass prototype: an even-order, symmetric, linear-phase FIR filter. Amplitude distortion cannot be eliminated completely, but it can be minimized using optimization techniques [12]-[13]. Figure 1 shows the two-channel quadrature mirror filter bank designed by Johnston [1], in which a Hanning window was used to design the low-pass prototype FIR filter and a nonlinear optimization technique was employed to minimize the reconstruction error.
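The analysis/synthesis chain described above can be sketched in a few lines of Python. The example below is a generic two-channel QMF bank, not Johnston's design: it takes H1(z) = H0(-z) and chooses the synthesis filters as G0 = H0 and G1 = -H1, which cancels the aliasing term structurally; with the Haar prototype this particular bank even reconstructs the input exactly, delayed by one sample.

```python
import numpy as np

def qmf_bank(x, h0):
    # Two-channel QMF bank: analysis filtering, 2-fold decimation,
    # 2-fold interpolation, synthesis filtering.  H1(z) = H0(-z);
    # choosing G0 = H0 and G1 = -H1 cancels the aliasing term.
    h1 = h0 * (-1.0) ** np.arange(len(h0))
    v0 = np.convolve(x, h0)[::2]                 # low band, decimated
    v1 = np.convolve(x, h1)[::2]                 # high band, decimated
    u0 = np.zeros(2 * len(v0)); u0[::2] = v0     # two-fold interpolation
    u1 = np.zeros(2 * len(v1)); u1[::2] = v1
    return np.convolve(u0, h0) - np.convolve(u1, h1)

# Haar prototype: this bank gives y[n] = x[n-1] exactly.
a = 1.0 / np.sqrt(2.0)
x = np.array([1.0, -2.0, 3.0, 0.5])
y = qmf_bank(x, np.array([a, a]))
```

For longer prototypes (such as Johnston's) the same structure leaves only amplitude distortion, which is what the optimization in Section 4 minimizes.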
This paper uses the algorithm as proposed in
Creusere and Mitra [11] with certain modifications to
optimize the objective function. The combinational
window functions [19]-[21] with large SLFOR have
been devised and used for designing FIR prototype
filters. Due to the closed form expressions of the
window functions, the optimization procedure gets
simplified. Finally, a comparative evaluation has
been done with reconstruction error and far-end
attenuation being selected as the main figure of
merit.
Fig. 1 Two-channel quadrature mirror filter bank
2. FILTER DESIGN USING WINDOW
TECHNIQUE
The most straightforward technique for designing
FIR filters is to use a window function to truncate
and smooth the impulse response of an ideal zero-
phase infinite-impulse-response filter. This impulse
response can be obtained by using the Fourier series
expansion.
The impulse response of the ideal low-pass filter with cutoff frequency ωc is given as

h_id(n) = sin(ωc n) / (π n),  -∞ < n < ∞   (1)

h_id(n) is doubly infinite, not absolutely summable, and therefore unrealizable [15]. Hence the shifted impulse response of h_id(n) is

h_id(n) = sin(ωc (n - 0.5(N - 1))) / (π (n - 0.5(N - 1))),  0 ≤ n ≤ N - 1   (2)
Direct truncation of the infinite-duration impulse response, needed to make a causal filter, results in large pass-band and stop-band ripples near the transition band. These undesired effects are the well-known Gibbs phenomenon. However, they can be significantly reduced by an appropriate choice of smoothing function w(n). Hence a filter p(n) of order N has the form [15-17]

p(n) = h_id(n) w(n)   (3)

where w(n) is the time-domain weighting function, or window function. Window functions are of limited duration in the time domain, while approximating a band-limited function in the frequency domain.
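Equations (1)-(3) translate directly into code. The sketch below (Python, illustrative names) builds p(n) by multiplying the shifted ideal low-pass response by a window; a Hamming window stands in here for the windows of Table 1.

```python
import numpy as np

def fir_lowpass(N, wc, window):
    # p(n) = h_id(n) * w(n): shifted ideal low-pass impulse response
    # (cutoff wc in rad/sample) multiplied by a length-N window.
    n = np.arange(N)
    m = n - (N - 1) / 2.0                          # linear-phase delay
    h_id = (wc / np.pi) * np.sinc(wc * m / np.pi)  # sin(wc*m)/(pi*m), safe at m = 0
    return h_id * window

h = fir_lowpass(64, np.pi / 2, np.hamming(64))     # half-band prototype
```

An even filter length keeps the design Type II linear phase, which forces a transmission zero at ω = π, exactly the property the QMF prototype needs.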
3. WINDOW FUNCTIONS
The window functions used in designing the prototype FIR filter for the QMF banks are given in Table 1. The table includes the expressions of the variable window functions, the expressions of the variables (β, γ) that define the window families, and the expressions of the window shape parameter (D) for the Kaiser, PC6 and Blackman windows. A filter designed using one of these window functions is specified by three parameters: cutoff frequency (fc), filter order (N), and window shape parameter (β or γ). For a desired stop-band attenuation (ATT) and transition bandwidth, the order of the filter N can be estimated by

N = D / ΔFs + 1   (4)

where D is the window shape parameter, ΔFs = (fs - fp)/Fs is the normalized transition width, and Fs is the sampling frequency in hertz. The window shape parameter is determined by the desired stop-band attenuation.
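As a concrete instance of (4), the sketch below evaluates the published Kaiser-window formulas for the shape parameter β, the parameter D, and the resulting order estimate (Python; the function name is illustrative).

```python
import numpy as np

def kaiser_params(att, delta_fs):
    # Kaiser-window design: shape parameter beta and order estimate
    # N = D/dFs + 1 for a desired stop-band attenuation ATT (in dB)
    # and normalized transition width delta_fs = (fs - fp)/Fs.
    if att > 50:
        beta = 0.1102 * (att - 8.7)
    elif att > 21:
        beta = 0.5842 * (att - 21) ** 0.4 + 0.07886 * (att - 21)
    else:
        beta = 0.0
    D = 0.9222 if att <= 21 else (att - 7.95) / 14.36
    N = int(np.ceil(D / delta_fs + 1))
    return beta, N

beta, N = kaiser_params(50, 0.1)   # e.g. 50 dB attenuation, 10% transition width
```
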
Table 1: Window Functions with Filter Design Equations

1. Blackman window:
   w(n) = 0.42 + 0.5 cos(πn/M) + 0.08 cos(2πn/M), for -M ≤ n ≤ M

2. Kaiser window:
   w(n) = I0(β √(1 - (n/M)²)) / I0(β), for -M ≤ n ≤ M
   Window variable:
   β = 0, for ATT ≤ 21
   β = 0.5842 (ATT - 21)^0.4 + 0.07886 (ATT - 21), for 21 < ATT ≤ 50
   β = 0.1102 (ATT - 8.7), for ATT > 50
   Window shape parameter:
   D = 0.9222, for ATT ≤ 21
   D = (ATT - 7.95) / 14.36, for ATT > 21

3. Parzen-cos6(πt) combinational window (PC6):
   w_PC6(n) = l(n) d(n), for |n| ≤ N/2; 0 otherwise
   where
   l(n) = 1 - 24 (n/N)² (1 - 2|n|/N), for |n| ≤ N/4
   l(n) = 2 (1 - 2|n|/N)³, for N/4 < |n| ≤ N/2
   d(n) = cos⁶(πn/N), for |n| ≤ N/2
   Window variable: γ = a + b·ATT + c·ATT², with
   a = 8.15414, b = -0.236709, c = 0.00218617, for 30.32 ≤ ATT ≤ 51.25
   a = 21.269, b = -0.605789, c = 0.00434808, for 51.25 < ATT ≤ 68.69
   Window shape parameter: D = a + b·ATT + c·ATT², with
   a = 1.82892, b = -0.027548, c = 0.00157699, for 30.32 ≤ ATT ≤ 43.60
   a = 1.67702, b = 0.0450505, c = 0, for 43.60 < ATT ≤ 49.44
   a = 85.4733, b = -3.41969, c = 0.035784, for 49.44 < ATT ≤ 57.48
   a = -8.60006, b = 0.477004, c = -0.00355655, for 57.48 < ATT ≤ 68.69
4. OPTIMIZATION ALGORITHM
The amplitude distortion in reconstructed
signal can be minimized by optimization
techniques. The gradient based iterative
optimization algorithm is described in this section.
a. Objective Function
To get a high-quality reconstructed output y(n), the frequency response of the low-pass prototype filter, H(e^{j2πf/Fs}), must satisfy the following [13], assuming filters with an even number of coefficients:

|H(e^{j2πf/Fs})|² + |H(e^{j2π(f - Fs/2)/Fs})|² = 1, for 0 ≤ f ≤ Fs/2   (5)

|H(e^{j2πf/Fs})| = 0, for f > Fs/4   (6)

By satisfying (6) exactly, the aliasing error between nonadjacent bands is eliminated. Similarly, the amplitude distortion is eliminated by satisfying (5) [11]. Phase distortion is removed by selecting an even-order FIR prototype filter [1, 4]. Constraints (5) and (6) cannot be satisfied exactly for a finite filter order, so it is necessary to design a
Johnston [1] combined the pass band ripple energy
and out-of-band energies into a single cost function
having nonlinear nature and then minimized it using
Hooke and Jeaves algorithm [25]. Creusere and
Mitra [11] designed filters using Parks–McClellan
algorithm that approximately satisfied (5) and (6).
The filter length, relative error weighting, and stop
band edge were fixed before optimization procedure
started, while the pass-band edge was adjusted to
minimize the objective function ε.
ε = max | |H(e^{j2πf/Fs})|² + |H(e^{j2π(f - Fs/2)/Fs})|² - 1 |, for 0 ≤ f ≤ Fs/4   (7)
b. Algorithm
A gradient-based linear optimization algorithm (given in Annexure 1) is used to adjust the cutoff frequency. Filter design parameters and optimization control parameters such as step size (step), target error (t-error), direction (dir) and previous error (prev-error) are initialized. The prototype filter is designed using the windowing technique. In each iteration, fc of p(n) and the reconstruction error (error), which is also the objective function, are computed. If the error increases compared with the previous iteration (prev-error), the step size (step) is halved and the search direction (dir) is reversed. This step size and direction are used to re-compute fc for a new prototype filter. The optimization process halts when the error of the current iteration is within the specified tolerance (t-error), initialized before the optimization begins, or when prev-error equals error [26].
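The step-halving search just described can be sketched as follows. This is a minimal Python illustration, assuming a Kaiser-window prototype (in place of the Table 1 windows) and the power-complementarity deviation of (7) as the reconstruction-error objective; names and defaults are illustrative.

```python
import numpy as np

def prototype(fc, N, beta):
    # Kaiser-windowed low-pass prototype; fc is the cutoff in
    # cycles/sample (0.25 = a quarter of the sampling rate).
    m = np.arange(N) - (N - 1) / 2.0
    return 2 * fc * np.sinc(2 * fc * m) * np.kaiser(N, beta)

def recon_error(h, nfft=1024):
    # Peak deviation from |H(f)|^2 + |H(f - Fs/2)|^2 = 1 over 0..Fs/4.
    H2 = np.abs(np.fft.fft(h, nfft)) ** 2
    k = np.arange(nfft // 4 + 1)
    return np.max(np.abs(H2[k] + H2[nfft // 2 - k] - 1.0))

def optimize_fc(N, beta, fc0=0.25, step=0.01, t_error=1e-4, iters=200):
    # Step-halving search: accept a move that reduces the error,
    # otherwise halve the step and reverse the direction.
    fc, dirn = fc0, 1.0
    prev = recon_error(prototype(fc, N, beta))
    for _ in range(iters):
        cand = fc + step * dirn
        err = recon_error(prototype(cand, N, beta))
        if err > prev:
            step /= 2.0
            dirn = -dirn
        else:
            fc, prev = cand, err
        if prev <= t_error or step < 1e-9:
            break
    return fc, prev
```

`optimize_fc(32, 4.53)` nudges the cutoff away from 0.25 until the band crossover sits near the -3 dB point, which is what the flowchart in Annexure 1 does.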
5. PERFORMANCE ANALYSIS OF QMF
QMF banks were designed by using window
functions described in Table-1 and optimization
algorithm in Annexure-1. In these design examples
the stop-band edge frequency and pass-band edge
frequency are taken as Fs/4 and Fs/6 respectively. In
Table 2, the value of stop band attenuation was kept
at 50 dB, resulting in different filter orders for
different window functions, which clearly indicates
the improvement in reconstruction error is obtained
with PC6 window function.
In Table-3, the results corresponding to filter
order are shown. In Table-4, a comparison is made
of the optimum performance that can be attained
with the three window functions. Apart from the
reconstruction error, the far-end attenuation
(amplitude of the last ripple in the stop band) is also
selected as one of the figures of merits for the
comparative study. This parameter is of significance
when the signal to be filtered has great
concentration of spectral energy. In a sub-band
coding, the filter is intended to separate out various
frequency bands for independent processing. In the
case of speech, e.g. the far-end rejection of the
energy in the stop band should be more so that the
energy leak from one band to other will be
minimum. As the stop band attenuation increases
the value of the reconstruction error decreases, as is evident from Tables 2 and 3. The PC6 window-designed FIR filter gives better performance than the other window functions.
Table 2: Performance of QMF filter at 50 dB stop-band attenuation

Window function | Reconstruction error (dB) | Filter order (N) | Far-end attenuation (dB)
Blackman window | 0.6509 | 105 | 85
Kaiser window   | 0.3208 | 90  | 107
PC6 window      | 0.1060 | 22  | 72

Table 3: Optimum performance in terms of order

Window function | Reconstruction error (dB) | Stop-band attenuation (dB) | Filter order (N) | Far-end attenuation (dB)
Blackman window | 0.0049 | 108   | 86 | 102
Kaiser window   | 0.0097 | 88.00 | 90 | 107
PC6 window      | 0.0120 | 55.00 | 22 | 72

Table 4: Performance in terms of far-end attenuation

Window function | Reconstruction error (dB) | Stop-band attenuation (dB) | Filter order (N) | Far-end attenuation (dB)
Blackman window | 0.0785 | 52.168 | 45 | 56
Kaiser window   | 0.0473 | 52.168 | 37 | 66
PC6 window      | 0.0290 | 52.168 | 73 | 73
6. CONCLUSION
A simple algorithm for designing the low-pass prototype filters for QMF banks has been used to optimize the reconstruction error by varying the filter cutoff frequency. Prototype filters designed using the high-SLFOR combinational window, Kaiser window, and Blackman window functions have been compared. The combinational window function provides better far-end rejection of the stop-band energy. This feature helps to reduce the aliasing energy leaking into a sub-band from the signal in the other sub-band.
References
1. Johnston, J. D.: A filter family designed for
use in quadrature mirror filter banks. In:
Proceedings of IEEE International
Conference Acoustics, Speech and Signal
Processing, Denver, 291–294(1980)
2. Bellanger, M.G., Daguet, J.L.: TDM-FDM
transmultiplexer: Digital Poly phase and
FFT. IEEE Trans. Commun. 22(9) ,1199-
1204 (1974)
3. Vetterli, M.: Perfect transmultiplexers. In: Proceedings of IEEE International
6
Conference on Acoustics, Speech, and Signal
Processing, vol. 4, 2567- 2570 (1986).
4. Gu, G., Badran, E.F.: Optimal design for
channel equalization via the filter bank
approach. IEEE Trans. Signal Process.52
(2),536-544 (2004)
5. Esteban, D., Galand, C.: Application of
quadrature mirror filter to split band voice
coding schemes. In: Proceedings of IEEE
International Conference on Acoustics,
Speech, and Signal Processing (ASSP), 191-
195(1977)
6. Crochiere, R.E.: Sub–band coding. Bell Syst.
Tech. J., 9, 1633-1654(1981)
7. Vetterli, M.: Multidimensional sub-band coding: Some theory and algorithms. Signal Process. 6, 97-112 (1984)
8. Woods,J.W.,Neil,S.D.O.:Sub-band coding of
images. IEEE Trans Acoustic. Speech and
Signal Process. (ASSP)-34 (10), 1278-
1288(1986)
9. Liu,Q.G.,Champagne,B.,Ho,D.K.C.:Simple
design of over sampled uniform DFT filter
banks with application to sub-band acoustic
echo cancellation. Signal Process, 80(5),831-
847(2000)
10. Crochiere,R.E., Rabiner , L. R.: Multirate
digital signal processing. Prentice–
Hall(1983)
11. Creusere, C.D., Mitra, S.K.: A simple
method for designing highquality prototype
filters for M band pseudo QMF banks. IEEE
Trans. Signal Process. 43(4), 1005–1007
(1995)
12. Mitra, S.K.: Digital signal processing: A
computer based Approach, TMH,
ch.7&10(2001)
13. Vaidyanathan, P.P.: Multirate systems and
filter banks. Prentice- Hall, Englewood
Cliffs, NJ (1993)
14. Jain, V.K., Crochiere, R.E.: Quadrature mirror filter design in the time domain. IEEE Trans. Acoustics, Speech and Signal Process. ASSP-32 (4), 353-361 (1984)
15. Xu, H., Lu, W.S., Antoniou, A.: An improved method for the design of FIR quadrature mirror-image filter banks. IEEE Trans. Signal Process. 46 (6), 1275-1281 (1998)
16. Goh, C. K., Lim, Y. C.: An efficient
algorithm to design weighted minimax PR
QMF banks. IEEE Trans. Signal
Process.47(12), 3303-3314)(1999)
17. Chen, C.K., Lee J.H.: Design of quadrature
mirror filters with linear phase in frequency
domain. IEEE Trans Circuits System, 39 (9),
593-605(1992)
18. Lin, Yuan-Pei, Vaidyanathan, P. P.: A Kaiser
window approach for the design of prototype
filters of cosine modulated filterbanks. IEEE
Signal Processing Lett., 5, 132–134 (1998).
19. Saxena, R.: Synthesis and characterization of
new window families with their applications,
Ph. D. Thesis, Electronics and Computer
Engineering Department, University of
Roorkee, Roorkee, India (1997).
20. Sharma, S. N., Saxena, R., Jain, A.: FIR
digital filter design with Parzen and cos6 (πt)
combinational window family, Proc. Int.
Conf. Signal Processing, Beijing, China,
IEEE Press, 92–95 (2002).
21. Sharma, S. N., Saxena, R., Saxena, S. C.:
Design of FIR filters using variable window
families – A comparative study, J. Indian
Inst. Sci., 84, 155–161 (2004).
22. DeFatta, D. J., Lucas J. G., Hodgkiss, W. S.
Digital signal processing: A system design
approach, Wiley (1988).
23. Gautam, J. K., Kumar, A., Saxena, S.C.:
WINDOWS: A tool in signal processing.
IETE Tech. Rev., vol. 12(3), 217-226
(1995).
24. Paulo, S. R. Diniz, Eduardo A. B. da Silva
and Sergio L. Netto.: Digital signal
processing: System, analysis and design,
Cambridge University Press (2003).
25. Hooke, R., Jeaves, T.: Direct search solution
of numerical and statistical problems, J.
Assoc. Comp. Machines, 8, 212–229 (1961).
26. Jain, A., Saxena, R., Saxena, S.C.: An
improved and simplified design of cosine
modulated pseudo-QMF filter banks. Digit.
Signal Process. 16(3), 225–232 (2006).
Annexure 1
Flowchart for the gradient-based optimization technique, rendered as steps:

1. Initialize: pass-band and stop-band frequencies, m-error, step, dir, cutoff frequency (ωc), filter order, and window coefficients.
2. Specify the desired stop-band and pass-band ripple.
3. Design the prototype filter and determine the reconstruction error (|error|).
4. If |error| > |m-error|: set |prev-error| = |error| and ωc* = ωc + (step × dir); redesign the prototype filter using ωc* and determine the new |error|; if |error| > |prev-error|, set step = step/2 and dir = -dir.
5. Repeat step 4 until |error| ≤ |m-error| or |prev-error| = |error|, then stop.
Performance Analysis of Sub Carrier Spacing Offset in Orthogonal Frequency Division Multiplexing System
Shivaji Sinha, Member IETE, Rachna Bhati, Dinesh Chandra, Member IEEE & IETE
email: [email protected], [email protected], [email protected]
Department of Electronics & Communication Engineering, JSSATE Noida
Abstract — A very important aspect in OFDM is time
and frequency synchronization. In particular, frequency
synchronization is the basis of the orthogonality between
frequencies. Loss in frequency synchronization is caused
due to Doppler shift because of large number of
frequencies closely spaced next to each other in OFDM
frame. So the intersymbol interference (ISI) and Inter
Carrier Interference(ICI) are also produced. This paper
presents the effects of frequency offset error in OFDM
system introduced by the fading sensitive channel.
Performance of the OFDM system is evaluated using r.m.s.
value of error across all subcarriers for different values of
the subcarrier spacing, SNR degradation and received
signal constellation in Matlab environment. The
performance is compared under various conditions of
noise variance and frequency Offset.
Index Terms— Cyclic Prefix, FFT, Frequency Offset,
ICI, IFFT , OFDM, SNR
I. INTRODUCTION
High data rate transmission is one of the major
challenges in modern communications. OFDM, which is seen as the future technology for wireless local area systems and is used as part of the IEEE 802.11a standard, provides high-data-rate transmission [1]. The need for
OFDM (Orthogonal Frequency Division Multiplexing)
system came from the idea of efficient use of spectrum
as well as bandwidth where the data transmission
becomes four times faster than the present one. OFDM
supports the technologies like DAB (Digital Audio
Broadcasting) or DVB (Digital Video broadcasting). It
is a special case of multicarrier transmission, where a
single data stream is transmitted over a number of lower
rate subcarriers. All the subcarriers within the OFDM
signal are time and frequency synchronized to each
other, allowing the interference between subcarriers to
be carefully controlled [2] [3]. In systems based on the
IEEE 802.11a standard, the Doppler effects are
negligible when compared to the frequency spacing of
more than 300 kHz. What is more important in this
situation is the frequency error caused by imperfections
in oscillators at the modulator and the demodulator.
These frequency errors cause a frequency offset
comparable to the frequency spacing, thus lowering the
overall SNR [3].
II. OFDM SYSTEM IMPLEMENTATION
In OFDM, a frequency-selective channel is subdivided
into narrower flat fading channels. Although the
frequency responses of the channels overlap with each
other as shown in Figure 1, the impulse responses are
orthogonal at the carriers, because the nulls of each impulse response coincide with the maximum values of
another impulse response and thus the channels can be
separated [3].
Fig 1. Orthogonality Principle
In OFDM the data are transmitted in blocks of length N. The nth data block {Xn[0], …, Xn[N-1]} is transformed into the signal block {xn[0], …, xn[N-1]} by the IFFT:

xn[m] = (1/N) Σ_{k=0}^{N-1} Xn[k] e^{j2πkm/N}, m = 0, …, N-1   (1)

Each frequency 2πk/N, k = 0, …, N-1, represents a carrier.
A basic OFDM implementation scheme is
shown in Figure 2. Data at each sub-carrier (Xm) are
input into the inverse fast Fourier transform (IFFT) to
be converted to time-domain data (xm) and after parallel
to serial conversion (P/S), a cyclic prefix is added to
prevent ISI. At the receiver, the cyclic prefix is removed, because it contains no information symbols.
After the serial-to-parallel (S/P) conversion, the
received data in the time domain (ym) are converted to
the frequency domain (Ym) using the fast Fourier
transform (FFT) algorithm.
Fig 2. OFDM System
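The IFFT/CP transmitter and FFT receiver of Figure 2 reduce to a few lines. A minimal Python sketch (illustrative names; the serial/parallel conversion and the channel are omitted):

```python
import numpy as np

def ofdm_mod(X, cp_len):
    # IFFT of one block of subcarrier symbols, then prepend the cyclic
    # prefix (the last cp_len time-domain samples).
    x = np.fft.ifft(X)
    return np.concatenate([x[-cp_len:], x])

def ofdm_demod(y, cp_len):
    # Strip the cyclic prefix and FFT back to subcarrier symbols.
    return np.fft.fft(y[cp_len:])

# Round trip over an ideal channel: 64 QPSK subcarriers, 16-sample CP.
rng = np.random.default_rng(0)
X = (2 * rng.integers(0, 2, 64) - 1) + 1j * (2 * rng.integers(0, 2, 64) - 1)
y = ofdm_mod(X, 16)
```

Over an ideal channel `ofdm_demod(y, 16)` returns `X` exactly, since the FFT inverts the IFFT once the prefix is discarded.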
III. FREQUENCY OFFSET & FREQUENCY SYNCHRONIZATION ALGORITHM
The first source of frequency offset is relative motion between transmitter and receiver (Doppler shift, or frequency drift), given by

Δf = (v/c) fc   (2)

where fc is the carrier frequency, v is the relative velocity between transmitter and receiver, and c is the speed of light. The second source is frequency error in the oscillators. Single-carrier systems
are more sensitive to timing offset errors while OFDM
generally exhibits good performance in the presence of
timing errors. In practice, the frequency, which is the
time derivative of the phase, is never perfectly constant,
thereby causing ICI in OFDM receivers. One of the
destructive effects of frequency offset is loss of
orthogonality. The loss of orthogonality causes the ICI
as shown in Figure 3.
Fig 3. ICI in OFDM
The areas, colored with yellow, show the ICI. When the
centers of adjacent subcarriers are shifted because of the
frequency offset, the adjacent subcarriers nulls are also
shifted from the center of the other subcarrier. The
received signal contains samples from this shifted
subcarrier, leading to ICI [6]. The destructive effects of
the frequency offset can be corrected by estimating the
frequency offset itself and applying proper correction.
This calls for the development of a frequency
synchronization algorithm. Three types of algorithms
are used for frequency synchronization: algorithms that
use pilot tones for estimation (data-aided), algorithms
that process the data at the receiver (blind), and
algorithms that use the cyclic prefix for estimation
[4 ][5].
Among these algorithms, blind techniques are
attractive because they do not waste bandwidth to
transmit pilot tones. However, they use less information
at the expense of added complexity and degraded
performance [6]. The degradation of the SNR, Dfreq, caused by the frequency offset, is approximated as

Dfreq ≈ (10 / (3 ln 10)) (π Δf T)² (Eb/N0)   (3)

where Δf is the frequency offset, T is the symbol duration in seconds, Eb is the energy per bit of the OFDM signal, and N0 is the one-sided noise power spectral density (PSD) [6][7].
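Approximation (3) is straightforward to evaluate. The sketch below assumes the commonly quoted small-offset form with Eb/N0 given in dB; the function name is illustrative.

```python
import numpy as np

def snr_degradation_db(delta_f, T, ebno_db):
    # D_freq ~ (10 / (3 ln 10)) * (pi * delta_f * T)^2 * Eb/N0, in dB;
    # valid for frequency offsets delta_f small relative to 1/T.
    ebno = 10.0 ** (ebno_db / 10.0)
    return (10.0 / (3.0 * np.log(10.0))) * (np.pi * delta_f * T) ** 2 * ebno

# e.g. a 1% offset of a 312.5 kHz subcarrier spacing, T = 3.2 us
d = snr_degradation_db(0.01 * 312.5e3, 3.2e-6, 20)
```
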
IV. SIMULATION PARAMETERS
First we have analyzed the impact of frequency offset
resulting in Inter Carrier Interference (ICI) while
receiving an OFDM modulated symbol. The analysis is
accompanied by Matlab simulation.
TABLE 1
R.M.S. ERROR RELATED PARAMETERS

Parameter | Value
FFT size | 64
No. of data subcarriers | 52
No. of bits per OFDM symbol | 52
No. of symbols | 1
Modulation scheme | BPSK
We generated an OFDM symbol with all subcarriers BPSK modulated, then added a frequency offset together with Gaussian noise of unit variance and zero mean to give Eb/N0 = 30 dB. We then found the difference between the desired and actual constellations and computed the r.m.s. value of the error across all subcarriers. This was repeated for different values of frequency offset. The parameters are listed in Table 1.
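The experiment just described can be sketched as follows. This Python illustration assumes a hypothetical 52-of-64 subcarrier mapping (DC unused) and models the frequency offset as a complex-exponential rotation of the time-domain samples; the noise term is omitted to isolate the ICI.

```python
import numpy as np

rng = np.random.default_rng(0)
nfft, used = 64, np.r_[1:27, 38:64]       # 52 data carriers, DC and band edges empty
X = np.zeros(nfft, dtype=complex)
X[used] = 2.0 * rng.integers(0, 2, len(used)) - 1.0   # BPSK
x = np.fft.ifft(X)

def rms_error(offset_frac):
    # Apply a carrier frequency offset expressed as a fraction of the
    # subcarrier spacing, demodulate, and measure the r.m.s.
    # constellation error across the data subcarriers.
    n = np.arange(nfft)
    y = x * np.exp(1j * 2 * np.pi * offset_frac * n / nfft)
    Y = np.fft.fft(y)
    return np.sqrt(np.mean(np.abs(Y[used] - X[used]) ** 2))
```

Sweeping `offset_frac` and plotting 20·log10 of the result reproduces the shape of the simulated curves in Figures 4 and 5.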
The parameters taken for the SNR degradation and received-signal calculations are listed in Table 2.
TABLE 2
OFDM TIMING RELATED PARAMETERS

Parameter | Value
No. of data subcarriers | 48
No. of pilot carriers | 4
Total number of subcarriers | 52
Subcarrier frequency spacing (Δf) | 0.3125 MHz
IFFT/FFT period (TFFT) | 3.2 µs (1/Δf)
Preamble duration | 16 µs
Signal duration, BPSK-OFDM symbol | 4 µs (TGI + TFFT)
Guard interval (GI) duration (TGI) | 0.8 µs (TFFT/4)
Modulation scheme | QPSK
V. RESULTS ANALYSIS
In Figures 4 and 5 we have calculated the SNR loss for different values of subcarrier spacing. The simulated results are slightly better than the theoretical ones because the simulated results are computed using the average error over all subcarriers (and the subcarriers at the edges undergo lower distortion). From Figure 5, for Eb/N0 = 30 dB, the theoretical and simulated results overlap at zero frequency offset at -30 dB r.m.s. error.
Fig 4. Error magnitude with frequency offset at Eb/N0 = 20 dB (theory and simulation)
Fig 5. Error magnitude with frequency offset at Eb/N0 = 30 dB (theory and simulation)
Figure 6 shows the calculated SNR degradation due to the frequency offset.

Fig 6. SNR degradation due to frequency offset for different Eb/N0 values (5, 10, 15, and 17 dB)

For smaller SNR values, the
degradation is less than for bigger SNR values as shown
in Figure 6. In order to study the SNR degradation in
OFDM systems we have examined the received signal
with no frequency offset. In this case, the data were sent
by two of the carriers. We have generated 512 random
QPSK signals as data. We send data using only two of
the subcarriers, and the other subcarriers have no data.
Figure 7 shows that for no frequency offset & noise
variance (ideal condition), there is no ICI and no
interference between the data and the other zeros
Fig 7. Received signal constellation (real versus imaginary) with 0% frequency offset
Fig 8. Received signal constellation (real versus imaginary) with 0.3% frequency offset
When a 0.3% frequency offset and a noise variance of 0.002 are introduced in the carrier, their effects are observed in terms of ICI. The result with a 0.3% frequency offset is shown in Figure 8: the signal from neighbouring carriers causes interference, and we obtain a distorted signal constellation at the receiver.
Comparing Figure 9 with Figure 8, it can be seen that the received signal with a 0.5% frequency offset, for the same 0.002 noise variance, is more distorted than the received signal with a 0.3% frequency offset. The simulation results reveal that the distortion in the received signal increases with the offset, as shown in Figures 9 and 10. The effects of the frequency offset can also be observed when data are sent on every subcarrier except one, which is set to zero.
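The mechanism behind these constellation plots can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the FFT size, seed, and offset values are assumptions, and the offset is expressed as a fraction of the subcarrier spacing.

```python
import numpy as np

def ofdm_rms_error(eps, n=64, seed=0):
    """r.m.s. constellation error of one OFDM symbol after a carrier
    frequency offset of eps subcarrier spacings (no noise). Illustrative
    sketch with assumed parameters, not the authors' simulation."""
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, (n, 2))
    data = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
    tx = np.fft.ifft(data) * np.sqrt(n)                 # OFDM modulator (IFFT)
    cfo = np.exp(2j * np.pi * eps * np.arange(n) / n)   # offset phase ramp
    rx = np.fft.fft(tx * cfo) / np.sqrt(n)              # demodulator (FFT)
    return np.sqrt(np.mean(np.abs(rx - data) ** 2))

print(ofdm_rms_error(0.0))    # essentially zero: no offset, no ICI
print(ofdm_rms_error(0.05))   # nonzero: ICI grows with the offset
```

With zero offset the FFT undoes the IFFT exactly; any nonzero offset rotates and couples the subcarriers, which is the ICI seen in the constellations above.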
Fig 9. Received signal constellation (real versus imaginary) with 0.5% frequency offset
Fig 10. Received signal at the zero subcarrier with no frequency offset
If there is a frequency offset in the channel, we cannot receive a zero (no data) at the subcarrier that was set to zero. Figure 10 shows the zero subcarrier with no frequency offset and zero data from all other subcarriers: in this ideal case the demodulated value is zero for the whole time. When a frequency offset is present, its effect is like random noise, which increases with the frequency offset. As shown in Figure 11, the effect of ICI increases considerably when the frequency offset is on the order of 0.4%-0.6%.
Comparing the results in Figures 10 and 11, we can see that as the frequency offset increases, the received signal is distorted more, and for frequency offset values greater than 0.6% the received data are unreadable.
Fig 11. Received signal at the zero subcarrier with 0.4% and 0.6% frequency offset
VI. CONCLUSION
Simulation results demonstrated the distortive effects of frequency offset on OFDM signals; frequency offset affects symbol groups equally. It was also seen that an increase in frequency offset results in a corresponding increase in these distortive effects and causes degradation of the SNR of individual OFDM symbols.
VII. FUTURE WORK
For the system developed above we can implement
three methods for frequency offset estimation: data-
driven, blind and semi-blind. The data-driven and semi-
blind rely on the repetition of data, while the blind
technique determines the frequency offset from the
QPSK data. The use of preambles & cyclic Prefix in
frequency offset estimation can also be implemented.
IX. AUTHOR'S BIOGRAPHY
1. Shivaji Sinha has been Asst. Prof. at J.S.S. Academy of Technical Education, Noida, since Oct. 2003. He is a member of IETE. He did his B.Tech at G.B. Pant Engg. College, Pauri Garhwal, in Electronics & Communication Engineering and his M.Tech in VLSI Design at U.P. Technical University.
2. Rachan Bhati is a B.Tech final-year student at J.S.S. Academy of Technical Education.
3. Dinesh Chandra has been Head & Professor in the Dept. of Electronics & Communication Engineering, J.S.S. Academy of Technical Education, Noida, since April 2001. He is a Fellow Member of IETE and a Member of IEEE. He did his B.Tech at the University of Roorkee (now I.I.T. Roorkee) in Electrical Engineering and his M.Tech at I.I.T. Kharagpur in Microwave & Optical Communication Engineering in 1987. He is also Coordinator of the M.Tech Program of G.B. Technical University and a Member of the Board of Studies (BOS) of G.B. Technical University for revision of the syllabus for Electronics & Communication and Instrumentation & Control Engineering.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27
2011
SIP0301-1
A Comparative Analysis of ECG Data Compression Techniques Sugandha Agarwal
Amity School of Engineering and Technology
Amity University Uttar Pradesh,
Lucknow [email protected]
Abstract  Computerized electrocardiogram (ECG), electroencephalogram (EEG), and magnetoencephalogram (MEG) processing systems have been widely used in clinical practice, and they are capable of recording and processing long records of biomedical signals. The need to send electrocardiogram records over telephone lines for remote analysis is increasing, and so the need for effective electrocardiogram compression techniques is great. The aim of any biomedical signal compression scheme is to minimize the storage space without losing any clinically significant information, which can be achieved by eliminating redundancies in the signal in a reasonable manner. Algorithms that produce better compression ratios with less loss of data are needed. Various data compression techniques have been proposed for reducing the digital ECG volume for storage and transmission. Because of the diverse procedures that have been employed, comparison of ECG compression methods is a major problem. The main purpose of this paper is to review various ECG compression algorithms and determine which is most efficient. ECG data compression techniques are broadly divided into two major groups: direct data compression and transformation methods. Direct data reduction techniques include the turning point, AZTEC, CORTES, DPCM and entropy coding, Fan and SAPA, peak-picking, and cycle-to-cycle compression methods. The transformation methods include the Fourier, cosine, and Karhunen-Loeve (K-L) transforms. The paper concludes with a comparison of some important data compression techniques. From a comparison of ECG compression techniques such as turning point, AZTEC, CORTES, FFT, and DCT, it was found that the DCT is the most suitable compression technique, with a compression ratio of about 100:1.
Keywords: ECG Compression;
I. INTRODUCTION
An electrocardiogram (ECG or EKG) is a graphic representation of the heart's electrical activity, formed as the cardiac cells depolarize and repolarize. Electrical impulses in the heart originate in the sinoatrial node and travel through the heart muscle, where they impart the electrical initiation of systole, or contraction of the heart. The electrical waves can be measured at selectively placed electrodes (electrical contacts) on the skin. Electrodes on different sides of the heart measure the activity of different parts of the heart muscle. An ECG displays the voltage between pairs of these electrodes, and the muscle activity that they measure, from different directions.
A typical ECG cycle is defined by the various features (P, Q, R, S, and T) of the electrical wave, as shown in Figure 1. The P wave marks the activation of the atria, the chambers of the heart that receive blood from the body. Next in the ECG cycle comes the QRS complex, which represents the activation of the left ventricle, which sends oxygen-rich blood to the body, and the right ventricle, which sends oxygen-deficient blood to the lungs. During the QRS complex, which lasts about 80 ms, the atria prepare for the next beat, and the ventricles relax in the long T wave [1,2]. It is these features of the ECG signal that a cardiologist uses to analyze the health of the heart and note various disorders.
Figure 1. A typical representation of the ECG waves.
Digital analysis of the electrocardiogram (ECG) signal imposes a practical requirement that the digitized data be selectively compressed to minimize analysis effort and data storage space. Therefore, it is desirable to carry out data reduction or data compression. The main goal of any compression technique is to achieve maximum data volume reduction while preserving the significant signal features upon reconstruction. Conceptually, data compression is the process of detecting and eliminating redundancies in a given data set. Shannon defined redundancy as "that fraction of a message or datum which is unnecessary and hence repetitive in the sense that if it were missing the message would still be essentially complete, or at least could be completed." ECG data compression is broadly classified into two major groups: direct data compression and transformation methods. Direct data compression methods base their detection of redundancies on direct analysis of the actual signal samples, whereas transformation methods utilize spectral and energy distribution analysis for detecting redundancies [2,7]. Data compression is achieved by discarding digitized samples that are not important for subsequent pattern analysis and rhythm interpretation. Examples of such data compression algorithms are AZTEC and the turning point (TP) algorithm. AZTEC retains only the samples for which there is sufficient amplitude change; TP retains points where the signal curves (such as at the QRS peak) and discards every alternate sample [6]. These data reduction algorithms are empirically designed to achieve good reduction without causing significant distortion error.
II. SYSTEM DESCRIPTION
The acquired signal is fed to an instrumentation amplifier. The amplifier sets the gain and brings the very low amplitude ECG signal into a perceptible range; the acquisition of a pure ECG signal is of high importance. Since the ECG signal is in the range of millivolts, it is difficult to analyze, so the prior requirement is to amplify the acquired signal. The amplified output is then fed to an analog-to-digital converter (ADC), which digitizes the ECG data under the control of a microcontroller. The microcontroller sets the clocks for picking up the summation of the signals generated by the heart, which produces different signals at various nodes [3]. The summation of these signals is taken and then sent for filtering. The digital output of the ECG is then displayed on an LCD.

Figure 2. Basic block diagram of the ECG module.

After the filtering process, the signal is ready for transmission, but it is important to compress it so that it can be transmitted at a faster rate, as shown in Figure 2.
III. COMPRESSION TECHNIQUES
Data compression techniques are categorized as those in which the compressed data can be reconstructed to form the original signal exactly, and those in which higher compression ratios are achieved by introducing some error into the reconstructed signal. The effectiveness of an ECG compression technique is described in terms of the compression ratio (CR), the ratio of the size of the original data to that of the compressed data; the execution time, i.e., the computer processing time required for compression and reconstruction of the ECG data; and a measure of error, often the percent root-mean-square difference (PRD) [5]. The PRD is calculated as

PRD = 100 * sqrt( sum_i (ORG(i) - REC(i))^2 / sum_i ORG(i)^2 )

where ORG is the original signal and REC is the reconstructed signal. The lower the PRD, the closer the reconstructed signal is to the original ECG data [10,11]. The various compression techniques (the AZTEC, TP, CORTES, DFT, and FFT algorithms) are compared in terms of PRD and compression ratio, and the most suitable one is identified.
The amplitude zone time epoch coding (AZTEC) algorithm converts the original ECG data into horizontal lines (plateaus) and slopes [4]. Slopes are formed when the length of a plateau is less than three; the information saved for a slope is its length and its final amplitude. The turning point (TP) technique always produces a 2:1 compression ratio. It accomplishes this by replacing every three data points with the two that best represent the slope of the original three points. The coordinate reduction time encoding system (CORTES) combines the high compression ratios of AZTEC with the high accuracy of the TP algorithm.
A. Direct Data Compression Methods
1. Turning Point Algorithm
1) Acquire the ECG signal.
2) Take three consecutive samples x0, x1, x2 and evaluate the sign of (x1 - x0) * (x2 - x1).
3) If (x1 - x0) * (x2 - x1) < 0 (a turning point), x1 is stored; otherwise x2 is stored.
4) Reconstruct the compressed signal.
The compression ratio of the turning point algorithm is 2:1. If higher compression is required, the same algorithm can be applied to the already compressed signal so that it is further compressed to a ratio of 4:1 [7, 13]. However, after the second compression the required data in the signal may be lost, since samples are overlapped on one another; therefore the TP algorithm is in practice limited to a compression ratio of 2:1. As shown in Figure 3, the turning point method is basically an adaptive down-sampling method developed especially for ECGs; it reduces the sampling frequency of an ECG signal by a factor of two.
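The TP steps above can be sketched as follows; this is an illustrative implementation, not a clinical one, and the sample signal is invented:

```python
def turning_point(x):
    """Turning point (TP) compression sketch: 2:1 adaptive downsampling
    that keeps local extrema. Illustrative, not a clinical implementation."""
    out = [x[0]]                      # keep the first (reference) sample
    i = 0
    while i + 2 < len(x):
        x0, x1, x2 = x[i], x[i + 1], x[i + 2]
        if (x1 - x0) * (x2 - x1) < 0:
            out.append(x1)            # turning point: the slope changed sign
        else:
            out.append(x2)            # monotone stretch: keep the later sample
        i += 2
    return out

print(turning_point([0, 1, 2, 1, 0, 1, 2, 3]))  # [0, 2, 0, 2]
```

Each pair of input samples yields one output sample, giving the 2:1 ratio described above.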
Figure 3. Turning point compression analysis.
2. AZTEC Algorithm
Another commonly used technique is known as AZTEC (Amplitude Zone Time Epoch Coding). It converts the ECG waveform into plateaus (flat line segments) and sloping lines. As there may be two consecutive plateaus at different heights, the reconstructed waveform shows discontinuities. Even though AZTEC provides a high data reduction ratio, the fidelity of the reconstructed signal is not acceptable to the cardiologist because of the discontinuity (step-like quantization) that occurs in the reconstructed ECG waveform [12,13], as shown in Figure 4. The AZTEC algorithm is implemented in two phases:
2.1. Horizontal Mode
1) Acquire the ECG signal.
2) Assign the first sample to Vmax and Vmin, which represent the highest and lowest elevations of the current line.
3) For each new sample Xi, if Xi > Vmax then Vmax = Xi, and if Xi < Vmin then Vmin = Xi. Repeat until one of two stopping conditions is satisfied: the difference between Vmax and Vmin exceeds a predetermined threshold, or the line length exceeds 50 samples.
4) The values stored for the plateau are its length L = S - 1, where S is the number of samples, and its average amplitude (Vmax + Vmin)/2.
5) The algorithm then starts again, assigning the next sample to Vmax and Vmin.
2.2. Slope Mode
1) If the number of samples in a line is <= 3, the line parameters are not saved; instead the algorithm begins to produce slopes.
2) The direction of a slope is determined by the following conditions:
a) If (X2 - X1) * (X1 - X0) is positive, the slope is positive.
b) If (X2 - X1) * (X1 - X0) is negative, the slope is negative.
3) A slope is terminated if the number of samples is >= 3 and the direction of the slope changes.
Figure 4. AZTEC compression analysis.
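The horizontal (plateau) mode described above can be sketched as follows. This is a simplified illustration that omits slope mode; the exact handling of the thresholds is an assumption based on the listed steps:

```python
def aztec_plateaus(x, vth, max_len=50):
    """Simplified AZTEC horizontal mode (slope mode omitted): emit
    (length, amplitude) line segments. Illustrative sketch of the steps
    above, not a complete AZTEC codec."""
    lines = []
    start = 0
    vmax = vmin = x[0]
    for i in range(1, len(x)):
        nmax, nmin = max(vmax, x[i]), min(vmin, x[i])
        # Close the current plateau if adding x[i] would exceed the
        # amplitude threshold, or if the line has reached max_len samples.
        if nmax - nmin > vth or i - start >= max_len:
            lines.append((i - start, (vmax + vmin) / 2))
            start = i
            vmax = vmin = x[i]
        else:
            vmax, vmin = nmax, nmin
    lines.append((len(x) - start, (vmax + vmin) / 2))
    return lines

print(aztec_plateaus([0, 0, 0, 5, 5, 5], vth=1))  # [(3, 0.0), (3, 5.0)]
```

The abrupt jump between the two emitted plateaus illustrates the step-like discontinuity discussed above.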
3. CORTES Algorithm
An enhanced method known as CORTES (Coordinate Reduction Time Encoding System) applies TP to some portions of the waveform and AZTEC to others, and does not suffer from discontinuities. If an AZTEC line is longer than a line-length threshold Lth, CORTES saves the AZTEC line; otherwise it saves the TP data, as shown in Figure 5.
1) Acquire the ECG signal.
2) Define Vth and Lth.
3) Find the current maximum and minimum.
4) If the sample spread is greater than the threshold Vth, compare the line length with Lth.
5) If len > Lth, apply AZTEC; else apply TP.
6) Plot the compressed signal.
Figure 5. CORTES compression analysis.
B. Transformation Methods
1. FFT Compression
1) Separate the ECG record into its three components x, y, z.
2) Find the frequency and the time between two samples.
3) Compute the FFT of the ECG signal; count the FFT coefficients equal to zero before compression (counter A), and set to zero (Index = 0) every coefficient lying between +25 and -25.
4) Count the FFT coefficients equal to zero after compression (counter B).
5) Compute the inverse FFT and plot the decompressed signal and the error.
6) Calculate the compression ratio and PRD, as shown in Figure 6.
Figure 6. FFT compression analysis.
2. DCT Compression
1) Separate the ECG record into its three components x, y, z.
2) Find the frequency and the time between two samples.
3) Compute the DCT of the ECG signal; count the DCT coefficients equal to zero before compression (counter A), and set to zero (Index = 0) every coefficient lying between +0.22 and -0.22.
4) Count the DCT coefficients equal to zero after compression (counter B).
5) Compute the inverse DCT and plot the decompressed signal and the error.
6) Calculate the compression ratio and PRD, as shown in Figure 7.
Figure 7. DCT compression analysis.
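The DCT thresholding and the PRD measure can be sketched together as follows. The 0.05 threshold and the synthetic test signal are assumptions for illustration, not the paper's values, and the compression-ratio estimate simply counts surviving coefficients:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: dct_matrix(n) @ dct_matrix(n).T == I."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    m[0, :] /= np.sqrt(2.0)
    return m

def dct_compress(sig, thresh):
    m = dct_matrix(len(sig))
    c = m @ sig                       # forward DCT
    c[np.abs(c) < thresh] = 0.0       # discard small coefficients (step 3)
    rec = m.T @ c                     # inverse DCT (step 5)
    cr = len(sig) / max(np.count_nonzero(c), 1)   # crude CR estimate
    prd = 100 * np.sqrt(np.sum((sig - rec) ** 2) / np.sum(sig ** 2))
    return rec, cr, prd

t = np.linspace(0.0, 1.0, 256)
sig = np.sin(2 * np.pi * 5 * t) + 0.1 * np.sin(2 * np.pi * 40 * t)
rec, cr, prd = dct_compress(sig, 0.05)
print(cr, prd)   # high CR at low PRD for this smooth test signal
```

Because the DCT concentrates the energy of smooth quasi-periodic signals in few coefficients, aggressive thresholding costs little reconstruction error, which is why the DCT leads the comparison table below.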
IV. SUMMARY
The comparison in Table 1 summarizes the resultant compression techniques and gives the choice to select the most suitable compression method. From the table we conclude that the DCT, with a compression ratio of 90.43 and a PRD of 0.93, is the most efficient algorithm for ECG data compression.

Table 1. Comparison of compression techniques.

METHOD          COMPRESSION RATIO   PRD
CORTES          4.8                 3.75
TURNING POINT   5                   3.20
AZTEC           10.37               2.42
FFT             89.57               1.16
DCT             90.43               0.93

Graph showing compression ratio and PRD.
CONCLUSION
Compression techniques have been around for many years. However, there is still a continual need for the advancement of algorithms adapted for ECG data compression. The necessity of better ECG data compression methods is even greater today than just a few years ago, for several reasons: the quantity of ECG records is increasing by the millions each year, and previous records cannot be deleted, since one of the most important uses of ECG data is in the comparison of records obtained over a long period of time. ECG data compression techniques are limited by the amount of time required for compression and reconstruction, the noise embedded in the raw ECG signal, and the need for accurate reconstruction of the P, Q, R, S, and T waves.
In this paper the author has tried to unify various data compression techniques used for ECG data compression [8,9]. The results of this research will likely provide an improvement on existing compression techniques.
REFERENCES
[1] G. Held, Data Compression: Techniques and Applications, Hardware and Software Considerations, John Wiley & Sons Ltd., 1987.
[2] T. J. Lynch, Data Compression: Techniques and Applications, Van Nostrand Reinhold Company, 1985.
[3] D. C. Reddy, Biomedical Signal Processing: Principles and Techniques, Tata McGraw-Hill, third reprint, 2007, pp. 254-300.
[4] P. Abenstein and W. J. Tompkins, "New Data Reduction Algorithm for Real-Time ECG Analysis."
[5] H. A. M. Al-Nashash, "ECG data compression using adaptive Fourier coefficients estimation," Med. Eng. Phys., vol. 16, pp. 62-67, 1994.
[6] B. R. S. Reddy and I. S. N. Murthy, "ECG data compression using Fourier descriptors," IEEE Trans. Biomed. Eng., vol. BME-33, pp. 428-433, 1986.
[7] V. Kumar, S. C. Saxena, and V. K. Giri, "Direct data compression of ECG signal for telemedicine," ICSS, vol. 10, pp. 45-63, 2006.
[8] Jalaleddine, C. Hutchens, R. Stratan, and W. A. Coberly, "ECG data compression techniques: a unified approach," IEEE Trans. Biomed. Eng., vol. 37, pp. 329-343, 1990.
[9] Trans. Biomed. Eng., vol. 15, pp. 128-129, 1968.
[10] P. S. Hamilton, "Compression of the ambulatory ECG by average beat subtraction and residual differencing," IEEE Trans. Biomed. Eng., vol. 38, no. 3, pp. 253-259, 1991.
[11] K. Grauer, A Practical Guide to ECG Interpretation, Mosby-Year Book, Inc., 1992.
[12] J. R. Cox, F. M. Nolle, H. A. Fozzard, and G. C. Oliver, "AZTEC, a preprocessing program for real-time ECG rhythm analysis," IEEE Trans. Biomed. Eng., vol. BME-15, pp. 128-129, 1968.
[13] J. L. Simmlow, Biosignal and Biomedical Image Processing: MATLAB-Based Applications, pp. 4-29.
[14] N. S. Jayant and P. Noll, Digital Coding of Waveforms, Englewood Cliffs, NJ: Prentice-Hall, 1984.
Biologically Inspired Cryptanalysis: A Review
Ashutosh Mishra*, Dr. Harsh Vikram Singh**, S. P. Gangwar**
*Student (M.Tech), KNIT Sultanpur; **Asst. Prof., Dept. of Electronics Engineering, KNIT Sultanpur
Abstract:- Data security, to ensure authorized access to information and fast delivery to a variety of end users with guaranteed Quality of Service (QoS), is an important topic of current relevance. In data security, cryptology is introduced to guarantee the safety of data; it is divided into cryptography and cryptanalysis. Cryptography is a technique to conceal information by means of encryption and decryption, while cryptanalysis is used to break the encrypted information using various methods. Biologically inspired techniques (BIT) take ideas from biology for use in cryptography. BIT is a field that has been widely used in many computer applications such as pattern recognition, computer and network security, and optimization. Some examples of BIT approaches are the genetic algorithm (GA), ant colony optimization, and artificial neural networks (ANN). GA and ant colony optimization have been successfully applied in the cryptanalysis of classical ciphers. This paper therefore reviews these techniques and explores the potential of using BIT in cryptanalysis.
Keywords: Cryptanalysis, Genetic Algorithm, Artificial Neural Network, Ant Colony.
1 Introduction
Many cryptographic algorithms (ciphers) have been developed for information security purposes, such as the Data Encryption Standard (DES), the Advanced Encryption Standard (AES), and Rivest-Shamir-Adleman (RSA); these are examples of modern ciphers. The foundation of these algorithms, especially block ciphers, is mainly based on the concepts of classical ciphers, such as substitution and transposition. For instance, DES uses only three simple operators, namely substitution, permutation (transposition), and bit-wise exclusive-OR (XOR) [2]. BIT is a field that has caught the interest of many researchers, and the ability of BIT approaches in various fields has been proven. Clark [6] hopes that those who do research in BIT, especially related to ants, swarms, and artificial neural networks, will examine the application of those techniques in cryptology. He also states that a good place to start is classical cipher cryptanalysis or Boolean function design. This paper is organized as follows: first, we review the simple substitution cipher, the columnar transposition cipher, and the permutation cipher, which are types of classical cipher, in Section 2. In Section 3, some biologically inspired techniques are explained, and the use of these approaches in cryptanalysis is reviewed in Section 4. Finally, conclusions are given in Section 5.
2 Classical Ciphers
Classical ciphers are often divided into substitution ciphers and transposition ciphers, of which there are many types. In this paper we focus on the simple substitution cipher and on two types of transposition cipher, namely the columnar transposition cipher and the permutation cipher. These ciphers are vulnerable to ciphertext-only attacks using frequency analysis.
Basically, a simple substitution cipher is a technique of replacing each character with another character. The mapping function for replacing the characters is represented by the key used. For the purposes of this study, white space is ignored, while other special characters such as commas and apostrophes are removed. Example 1 shows a simple substitution cipher:
Alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Key: M N F Q Y A J G R Z K B H S L C I V U D O W T E P X

Example 1
Plain text: KAMLA NEHRU INSTITUTE OF TECHNOLOGY
Cipher text: KMHBM SYGVO RSUDRDODY LA DYFGSLBLJP
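The mapping in Example 1 can be reproduced with a short sketch (assuming, as in the text, that white space and special characters are dropped):

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
KEY      = "MNFQYAJGRZKBHSLCIVUDOWTEPX"   # the key from Example 1

def substitute(plaintext, key=KEY):
    """Simple substitution cipher from Example 1; white space and other
    non-alphabetic characters are dropped, as in the text."""
    table = dict(zip(ALPHABET, key))
    return "".join(table[ch] for ch in plaintext.upper() if ch in table)

print(substitute("KAMLA NEHRU INSTITUTE OF TECHNOLOGY"))
# KMHBMSYGVORSUDRDODYLADYFGSLBLJP
```

Decryption is the same operation with the inverse mapping, i.e. `dict(zip(KEY, ALPHABET))`.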
The idea of a transposition cipher is to move each character to another position. In the columnar transposition cipher, the plaintext is written into a table with a fixed number of columns. The number of columns equals the length of the key, and the key represents the order of the columns that will form the cipher text. We consider only the 26 characters of the alphabet, so all special characters are removed. For example, the plaintext "KAMLA NEHRU INSTITUTE OF TECHNOLOGY" with the key "4726135" is transformed into cipher text by inserting it into a table, as shown in Example 2.

4 7 2 6 1 3 5
K A M L A N E
H R U I N S T
I T U T E O F
T E C H N O L
O G Y P Q R S

Example 2
Four dummy letters (here P, Q, R, and S) are added to complete the rectangle, and the cipher text can be written in groups of five characters [4]. The cipher text of this cipher is "KHITO ARTEG MUUCY LITHP ANENQ NSOOR ETFLS".
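A columnar transposition encryptor can be sketched as follows. Note that this sketch reads the columns in ascending order of the key digits, which is a common convention, whereas the worked example above emits the columns in a different order; the padding letters are taken from the example.

```python
def columnar_encrypt(plaintext, key, pad="PQRS"):
    """Columnar transposition sketch. Columns are read in ascending order
    of the key digits (a common convention; the worked example in the
    text emits the columns in a different order)."""
    text = "".join(ch for ch in plaintext.upper() if ch.isalpha())
    ncols = len(key)
    pad_iter = iter(pad * ncols)      # dummy letters, as in Example 2
    while len(text) % ncols:
        text += next(pad_iter)
    rows = [text[i:i + ncols] for i in range(0, len(text), ncols)]
    order = sorted(range(ncols), key=lambda c: key[c])
    return "".join(row[c] for c in order for row in rows)

print(columnar_encrypt("KAMLA NEHRU INSTITUTE OF TECHNOLOGY", "4726135"))
# ANENQMUUCYNSOORKHITOETFLSLITHPARTEG
```

Either convention preserves the same security properties, since both are fixed permutations of the columns.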
The permutation cipher operates by rearranging the characters of the plaintext block by block based on a key. The size of the block is the same as the length of the key, and the cipher text can also be written in groups of five characters. Using the same plaintext and key as in the previous example, the cipher text of the permutation cipher is produced as depicted in Example 3:

Plain text order:  1 2 3 4 5 6 7
Cipher text order: 4 7 2 6 1 3 5

Example 3
Plain text: KAMLANE HRUINST ITUTEOF TECHNOL GYPQRSX
Cipher text: LEANK MAITR SHUNT FTOIU EHLEO TCNQX YSGPR (P, Q, R, S, and X are dummy letters)
Both the simple substitution cipher and the transposition ciphers share the same disadvantage with regard to the frequency of characters. In Example 1, the character K is replaced with K, A with M, and so forth. Therefore, the frequency of each character in the plaintext will be exactly the same as the frequency of its corresponding cipher text character. Hence, the encryption algorithm preserves the character frequencies of the plaintext in the cipher text, because it merely replaces one character with another. Still, the frequency of characters depends on the length of the text, and some characters may not even be used in the plaintext; in the example above, P, Q, and R are characters that do not appear in the plaintext. Therefore, many researchers use frequency analysis for the cryptanalysis of the simple substitution cipher. Analyses are done using the frequencies of single characters (unigrams), double characters (bigrams), triple characters (trigrams), and so on (n-grams). The technique used to compare candidate keys for the simple substitution cipher is to compare the n-gram frequencies of the cipher text with those of the language of the text. In attacking the transposition cipher, the multiple anagramming attack can be used. The cipher text is written into a table in which the number of columns represents the length of the key. For the columnar cipher, the cipher text is written into the table column by column from left to right, while for the permutation cipher the cipher text is written row by row from top to bottom. After that, the columns are rearranged to form readable plaintext in every row.
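The n-gram counting used in frequency analysis can be sketched as:

```python
from collections import Counter

def ngram_counts(text, n):
    """Count n-grams of a text, ignoring non-alphabetic characters; the
    raw material for the frequency analysis described above."""
    text = "".join(ch for ch in text.upper() if ch.isalpha())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

cipher = "KMHBM SYGVO RSUDRDODY LA DYFGSLBLJP"   # cipher text of Example 1
print(ngram_counts(cipher, 1).most_common(3))
```

Comparing these counts with published English letter frequencies suggests candidate mappings; here the most frequent cipher letter, D, indeed corresponds under the Example 1 key to the common plaintext letter T.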
3 Biologically Inspired Techniques
BIT comprises methods that take ideas from biology for use in computing, relying heavily on the fields of biology, computer science, and mathematics. Some BIT approaches are the GA, artificial neural networks (ANN), DNA computing, cellular automata, ant colony optimization, particle swarm optimization, and membrane computing. Four of these techniques, namely GA, ant colony optimization, ANN, and cellular automata, are described later in this section.
3.1 Genetic Algorithm
The genetic algorithm (GA) is a technique used to optimize a searching process; it was introduced by Holland in 1975 [5] and is based on natural selection in the biological sciences [7]. There are several processes in a GA, namely selection, mating, and mutation. At the beginning of the cycle, a random population is created as the first generation; the elements that make up the population are potential solutions to the problem, represented by strings. Then, pairs of strings are selected based on a criterion called a fitness function. These pairs, known as parents, are mated to produce children. The children are then mutated according to a mutation rate, because not all children are mutated. After the mutation process, a new population is formed (the next generation). The cycle continues until some stopping condition is met, such as a maximum number of generations. This algorithm has been successfully applied in the cryptanalysis of classical and modern ciphers such as the simple substitution, polyalphabetic, and transposition ciphers, the knapsack cipher, rotor machines, RSA, and TEA. We further explore the usage of this algorithm in cryptanalysis in Section 4.
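The GA cycle described above (random initial population, fitness-based selection, mating, mutation, stopping condition) can be sketched as follows. The fitness function here is a toy stand-in: a real cryptanalytic GA would instead score candidate keys by comparing decrypted-text n-gram statistics with language statistics.

```python
import random

TARGET = "GENETIC"                    # toy goal string (illustration only)
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def fitness(s):
    # Number of positions matching the target string.
    return sum(a == b for a, b in zip(s, TARGET))

def evolve(pop_size=50, generations=200, mutation_rate=0.1, seed=1):
    rng = random.Random(seed)
    pop = ["".join(rng.choice(LETTERS) for _ in TARGET)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == len(TARGET):
            break                                  # stopping condition met
        parents = pop[: pop_size // 2]             # truncation selection
        children = pop[:2]                         # elitism: keep the best two
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)        # mating pair
            cut = rng.randrange(1, len(TARGET))    # one-point crossover
            child = [rng.choice(LETTERS) if rng.random() < mutation_rate
                     else ch for ch in p1[:cut] + p2[cut:]]
            children.append("".join(child))
        pop = children
    return max(pop, key=fitness)

print(evolve())
```

The same loop structure carries over to cipher attacks: only the string encoding (a candidate key) and the fitness function change.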
3.2 Ant Colony Optimization
Ant colony optimization is inspired by the pheromone-trail laying and following behavior of real ants, which use pheromones as a communication medium. This approach was proposed for solving hard combinatorial optimization problems [9]. An important aspect of ant colonies is that the collective action of many ants results in the location of the shortest path between a food source and a nest. The standard ant colony optimization (ACO) algorithm contains a probabilistic transition rule, goodness evaluation, and pheromone updating [6]. In cryptanalysis, the ACO algorithm has been applied to breaking transposition ciphers and block ciphers. The cryptanalysis of the transposition cipher published in [6] is reviewed in Section 4 of this paper.
3.3 Artificial Immune Systems
Artificial Immune Systems (AIS) can be defined as
computational systems inspired by theoretical immunology,
observed immune functions, principles and mechanisms, in
order to solve problems [8]. AIS can be divided into
population-based algorithms, such as the negative selection and
clonal selection algorithms, and network-based algorithms, such
as continuous and discrete immune networks. AIS have been
applied to a wide variety of application areas such as pattern
recognition and classification, optimization, data analysis,
computer security and robotics [8]. Hart and Timmis
categorized these application areas and some others into three
major categories, namely learning, anomaly detection and
optimization. In optimization, most of the published papers are
based on the application of the clonal selection principle, using
algorithms such as Clonalg, opt-AINET and the B-cell algorithm.
De Castro and Von Zuben [8] proposed a computational
implementation of the clonal selection algorithm (it is now
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0303-4
called Clonalg). The authors compared their algorithm's
performance with GA for multi-modal optimization and argued
that their algorithm was capable of detecting a high number of
sub-optimal solutions, including the global optimum of the
function being optimized. De Castro [8] extended this work by
using an immune network metaphor for multi-modal
optimization. Clonal selection has also been used in the
optimization of dynamic functions, with the results compared
against the evolution strategies (ES) algorithm. The comparison,
based on time and performance, shows that clonal selection is
better than ES on small-dimension problems; in higher
dimensions, however, ES outperformed clonal selection in both
time and performance. Other authors have applied Clonalg to a
scheduling problem, under the name clonal selection algorithm
for examination timetabling (CSAET). The research shows that
CSAET is successful in solving problems related to scheduling;
in the comparison performed between CSAET, GA and a
memetic algorithm, CSAET produced output of quality as good
as those algorithms. Therefore, the literature shows that these
immune-inspired algorithms are capable of producing good
results in various fields, especially optimization. It is hoped that
they will also find their way into cryptanalysis.
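The clonal selection principle can be illustrated with a minimal Clonalg-style sketch on a toy one-dimensional function. The cloning counts, mutation schedule and elitist reselection here are simplified assumptions for illustration, not the published algorithm:

```python
import random

def clonalg(affinity, init, n=20, clones=5, generations=50, seed=1):
    """Clonalg-style sketch: clone the highest-affinity cells and
    hypermutate the clones, with lower-ranked (weaker) parents mutating
    more strongly; keep the best cells of parents plus clones."""
    rng = random.Random(seed)
    pop = [init(rng) for _ in range(n)]
    for _ in range(generations):
        pop.sort(key=affinity, reverse=True)
        new = []
        for rank, cell in enumerate(pop[:n // 2]):
            for _ in range(clones):
                step = 0.1 * (rank + 1)     # better cells mutate less
                new.append(cell + rng.gauss(0.0, step))
        pop = sorted(pop + new, key=affinity, reverse=True)[:n]
    return pop[0]

# Toy problem: maximise affinity -(x - 3)^2, i.e. find x near 3.
best = clonalg(lambda x: -(x - 3.0) ** 2, lambda rng: rng.uniform(-10, 10))
```

For cryptanalysis, the real-valued cell would be replaced by a candidate key and the affinity by an n-gram-based fitness of the trial decryption.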
3.4 Cellular Automata
A cellular automaton is a decentralized computing model
providing an excellent platform for performing complex
computation with the help of only local information. Nandi et
al. [3] presented an elegant low-cost scheme for CA-based cipher
system design. Both block ciphering and stream ciphering
strategies designed with programmable cellular automata
(PCA) have been reported. Recently, an improved version of
the cipher system has been proposed.
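As a toy illustration of CA-based stream ciphering (a simple elementary-CA keystream generator, not Nandi et al.'s PCA scheme), a rule-30 automaton can supply a keystream whose XOR with the plaintext is its own inverse:

```python
def ca_step(cells, rule=30):
    """One synchronous update of an elementary CA with periodic boundary:
    each cell's next state is looked up from the rule number using the
    (left, centre, right) neighbourhood as a 3-bit index."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1
                      | cells[(i + 1) % n])) & 1 for i in range(n)]

def keystream(seed_cells, nbits, tap=None):
    """Collect one bit per CA step from a fixed (centre) cell."""
    tap = len(seed_cells) // 2 if tap is None else tap
    cells, out = list(seed_cells), []
    for _ in range(nbits):
        cells = ca_step(cells)
        out.append(cells[tap])
    return out

# Toy stream cipher: XOR the keystream with the plaintext bits; applying
# the same keystream again recovers the plaintext.
ks = keystream([0] * 7 + [1] + [0] * 7, 16)
pt = [1, 0, 1, 1, 0, 0, 1, 0] * 2
ct = [p ^ k for p, k in zip(pt, ks)]
```

Rule 30 is the classic example of a chaotic elementary CA; practical CA ciphers use richer, programmable rule configurations, as in the PCA designs cited above.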
4 BIT in cryptanalysis
Classical ciphers have been successfully attacked using various
metaheuristic techniques. A metaheuristic is a heuristic method
for solving a very general class of computational problems;
such techniques are therefore commonly used in combinatorial
optimization problems. Some of the metaheuristic techniques that
were successfully applied in the cryptanalysis of classical
ciphers are genetic algorithms, simulated annealing, tabu search,
ant colony optimization and hill climbing. In this paper, we
will review BIT techniques that have been successfully
applied in cryptanalysis of classical ciphers (simple
substitution and transposition ciphers). Spillman et al.
published their paper on the cryptanalysis of the simple
substitution cipher using a genetic algorithm in 1993. The paper
is an early work done by using GA in cryptanalysis and it is a
good choice for re-implementation and comparison [4]. In [4],
the authors review some idea about genetic algorithm before
they show the steps on how the algorithm is applied in the
cryptanalysis. The aim of the attack is to find the possible key
values based on frequency of characters in the cipher text. The
key is sorted from the most frequent to the least frequent
characters in the English language. In the selection process,
pairs of keys (parents) are randomly selected from the
population (contains a set of keys that is randomly generated
for the first generation) based on fitness function. The fitness
function compares unigram and bigram frequencies characters
in the known language with the corresponding frequencies in
the cipher text. Keys with higher fitness value have more
chance of being selected. Mating is done by combining each
of the pairs of parents to produce a pair of children. The
children are formed by comparing every element (character) in
each pair of parents. After that, one character in the key can be
changed to a randomly selected character, based on a
mutation rate, in the mutation process. The selection, mating
and mutation processes continue until a stopping criterion is
met. Another paper published in 1993 utilizing a genetic
algorithm in cryptanalysis was by Matthews; this paper,
however, focuses on the transposition cipher. The attack is
known as GENALYST. It finds the correct key length and the
correct permutation of the key of a transposition cipher.
Matthews uses a list of ten bigrams and trigrams that have been
given weight values to calculate the fitness. For instance, the
trigrams 'THE' and 'AND' are given a score of '+5', while the
bigrams 'HE' and 'IN' are given a score of '+1'. Matthews also
gives a score of '-5' to the trigram 'EEE': although 'E' is very
common in English, a sequence of three 'E's is very uncommon
in normal English text. Candidates with higher fitness values
have more chance of being selected. After the selection process,
mating is performed using a position-based crossover method.
Then the mutation process is applied. Two mutation types are
possible: first, randomly swap two elements; second, shift all
elements forward by a random number of places. The experiment
used a population size of 20, 25 generations and crossover
decreasing from 8.0 to 0.5. The results show that GENALYST is
successful in breaking the cipher with key lengths of 7 and 9.
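The weighted n-gram fitness can be sketched as follows. Only the five example weights quoted above are used, whereas Matthews' actual list contains ten entries:

```python
# Example weights quoted in the text; Matthews' full list has ten entries.
WEIGHTS = {"THE": 5, "AND": 5, "HE": 1, "IN": 1, "EEE": -5}

def ngram_fitness(text, weights=WEIGHTS):
    """Score a candidate decryption by counting weighted occurrences of
    the listed bigrams and trigrams (overlapping matches included)."""
    score = 0
    for gram, w in weights.items():
        count = sum(1 for i in range(len(text) - len(gram) + 1)
                    if text[i:i + len(gram)] == gram)
        score += w * count
    return score
```

A candidate key whose trial decryption contains many common n-grams (and no improbable runs such as 'EEE') thus scores higher and is more likely to be selected as a parent.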
Ant colony optimization has also been successfully
implemented in the cryptanalysis of the transposition cipher, as
published in [6]. The paper uses a specific ant algorithm, the
Ant Colony System (ACS), with known success on the
Traveling Salesman Problem (TSP), to break the cipher. The
authors used a bigram adjacency score, Adj(I,J), defined as the
average probability of the bigrams created by juxtaposing
columns I and J; the score is higher for two correctly
aligned columns. In addition, they used a dictionary
heuristic, Dict(M), for the recognition of plaintext. The authors
also compared the results produced by ACS with those of
previous metaheuristic attacks on the transposition cipher,
which involve differing heuristics, processing time and success
criteria. The comparison shows that the ACS algorithm can
decrypt cryptograms which are significantly shorter than other
methods can handle, due to the use of the dictionary heuristic
in addition to bigrams.
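A plausible form of the Adj(I,J) score is the average frequency of the bigrams formed row by row when column I is placed to the left of column J. The tiny frequency table below is purely illustrative; a real attack would estimate a full bigram table from a reference corpus:

```python
# Illustrative English-like bigram frequencies (assumed values, not from
# the reviewed paper); a real attack uses a full corpus-derived table.
BIGRAM_FREQ = {"TH": 0.027, "HE": 0.023, "IN": 0.020, "ER": 0.018}

def adjacency_score(col_i, col_j, freq=BIGRAM_FREQ):
    """Average bigram probability obtained by juxtaposing column I with
    column J row by row; higher when the columns are correctly aligned."""
    pairs = [a + b for a, b in zip(col_i, col_j)]
    return sum(freq.get(p, 0.0) for p in pairs) / len(pairs)
```

In the ACS formulation, this score plays the role of the heuristic value eta on the "edge" from column I to column J when the ants build a candidate column ordering.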
5 Conclusion
This paper reviews works on the cryptanalysis of classical
ciphers using BIT approaches. The classical ciphers involved
are the simple substitution and transposition ciphers, while GA
and ant colony optimization are the techniques used. GA has
been applied to both ciphers, but only the transposition cipher
was found to have been attacked using ant colony optimization.
The immune-inspired clonal selection approach reviewed in
Section 3.3 also appears promising for cryptanalysis, given its
ability to solve optimization problems; its application in
cryptanalysis should therefore be further studied.
References
[1] RSA. Wikipedia. http://en.wikipedia.org/wiki/RSA.
[2] A. Menezes, P. van Oorschot, and S. Vanstone. Handbook
of Applied Cryptography. CRC Press, New York, NY, 1997.
[3] S. Nandi, B. K. Kar, and P. Pal Chaudhuri. Theory and
applications of cellular automata in cryptography. IEEE
Transactions on Computers, 43(12):1346–1357,1994.
[4] Lin, Feng-Tse, & Kao, Cheng-Yan. (1995). A genetic
algorithm for ciphertext-only attack in cryptanalysis. In IEEE
International Conference on Systems, Man and Cybernetics,
1995, (pp. 650-654, vol. 1).
[5] Holland, J. H. (1975). Adaptation in natural and artificial
systems. Ann Arbor: The University of Michigan Press.
[6] Clark, J. A. (2003). Invited Paper: Nature-Inspired
Cryptography: Past, Present and Future. IEEE Conference on
Evolutionary Computation 2003. Special Session on
Evolutionary Computation and Computer Security. Canberra.
[7]Goldberg, D., (1989) Genetic Algorithms in Search,
Optimization, and Machine Learning. Reading MA: Addison-
Wesley.
[8] de Castro, L. N. (2002). Immune, Swarm and Evolutionary
Algorithms Part I: Basic Models. International Conference on
Neural Information Processing Vol. 3 pp 1464-1468.
[9] S.N. Sivanandam and S.N. Deepa, "Introduction to Genetic
Algorithms", Springer-Verlag Berlin Heidelberg, 2008.
[10] Xu Xiangyang, The block cipher for construction of S-
boxes based on particle swarm optimization, 2nd International
Conference on Networking and Digital Society (ICNDS),
2010 , Page(s): 612 - 615
[11] Uddin, M.F.; Youssef, A.M., "Cryptanalysis of Simple
Substitution Ciphers Using Particle Swarm Optimization",
IEEE Congress on Evolutionary Computation, 2006, Page(s):
677 - 680
[12] Mohammad Faisal Uddin; Amr M. Youssef , An
Artificial Life Technique for the Cryptanalysis of Simple
Substitution Ciphers , Canadian Conference on Electrical and
Computer Engineering, 2006, Page(s): 1582 - 1585
[13] Khan, S.; Shahzad, W.; Khan, F.A. , Cryptanalysis of
Four-Rounded DES Using Ant Colony Optimization
,International Conference on Information Science and
Applications (ICISA), 2010 , Page(s): 1 - 7
[14] Ghnaim, W.A.-E.; Ghali, N.I.; Hassanien, A.E., Known-
ciphertext cryptanalysis approach for the Data Encryption
Standard technique, International Conference on Computer
Information Systems and Industrial Management Applications
(CISIM), 2010 , Page(s): 600 - 603
[14] AbdulHalim, M.F.; Attea, B.A.; Hameed, S.M., A binary
Particle Swarm Optimization for attacking knapsacks Cipher
Algorithm ,International Conference on Computer and
Communication Engineering ,2008. Page(s): 77 - 81
[15] Schmidt, T.; Rahnama, H.; Sadeghian, A. , A review of
applications of artificial neural networks in cryptosystems ,
Automation Congress, 2008. WAC 2008. World , Page(s): 1 –
6
[16] Godhavari, T.; Alamelu, N.R.; Soundararajan,
R.,Cryptography Using Neural Network ,INDICON, 2005
Annual IEEE , Page(s): 258 - 261
[17] R. Spillman, M. Janssen, B. Nelson, and M. Kepner. Use
of a genetic algorithm in the cryptanalysis of simple
substitution ciphers. Cryptologia, 1993,17(1):31–44.
[18] Diffie, W. and Hellman, M. (1976). New Directions in
Cryptography. IEEE Transactions on Information Theory,
22(6): 644-654.
[19] Tarek Tadros, Abd El Fatah Hegazy, and Amr Badr
,Genetic Algorithm for DES Cryptanalysis,IJCSNS
International Journal of Computer Science and Network
Security, VOL.10 No.5, May 2010
[20]Forrest, S., Perelson, A. S. Allen, L. and Cherukuri, R.
(1994). Self-nonself Discrimination in A Computer.
Proceedings of IEEE Symposium on Research in Security and
Privacy, Los Alamos, CA. IEEE Computer Society Press.
[21] Stallings, W. (2003). Cryptography and Network
Security: Principles and Practices, 3rd Edition. Upper Saddle
River, New Jersey: Prentice Hall.
[22] Spillman, R. (1993). Cryptanalysis of Knapsack Ciphers
Using Genetic Algorithms. Cryptologia, XVII(4):367-377.
[23] Clark, J.A. (2003). Nature-Inspired Cryptography: Past,
Present and Future. In Proceedings of Conference on
Evolutionary Computation, 8-12 December. Canberra,
Australia.
[24] Clark, A. (1998). Optimization Heuristics for Cryptology.
Ph.D. Dissertation, Faculty of Information Technology,
Queensland University of Technology, Australia.
[25] Bagnall, A.J. (1996). The Applications of Genetic
Algorithms in Cryptanalysis. M.Sc. Thesis. School of
Information System, University of East Anglia.
[26] Dimovski, A., Gligoroski, D. (2003). Attack on the
Polyalphabetic Substitution Cipher Using a Parellel Genetic
Algorithm. Technical Report, Swiss-Macedonian Scientific
Cooperation through SCOPES Project, March 2003, Ohrid,
Macedonia.
[27] Dimovski, A., Gligoroski, D. (2003). Attacks on
Transposition Cipher Using Optimization Heuristics. In
Proceedings of ICEST 2003, October, Sofia, Bulgaria.
[28] Morelli, R.A. and Walde, R.E. (2003). A Word-Based
Genetic Algorithm for Cryptanalysis of Short Cryptograms.
Proceedings of the 2003 Florida Artificial Intelligence
Research Symposium (FLAIRS – 2003), pp. 229-233.
[29] Morelli, R.A., Walde, R.E., Servos, W. (2004). A Study
of Heuristic Search Algorithms for Breaking Short
Cryptograms. International Journal of Artificial Intelligence
Tools (IJAIT), Vol. 13, No. 1, pp. 45-64, World Scientific
Publishing Company.
[30] Servos, W. (2004). Using Genetic Algorithm to Break
Alberti Cipher. Journal of Computing Science in Colleges,
Vol. 19(5): 294-295.
[31] Hernandez, J.C., Sierra, J.M., Isasi, P., Ribagorda, A.
(2002). Genetic Cryptanalysis of Two Rounds TEA. ICCS
2002, LNCS 2331, 1024 – 1031, Springer-Verlag Berlin
Heidelberg.
[32] Ali, H. and Al-Salami, M. (2004). Timing Attack
Prospect for RSA Cryptanalysis Using Genetic Algorithm
Technique. The International Arab Journal of Information
Technology, 1(1).
[33] Millan, W., Clark, A. and Dawson, E. (1997). Smart Hill
Climbing Finds Better Boolean Functions. Proceedings of. 4th
Annual Workshop on Selected Areas in Cryptography, Aug.
11-12, SAC 1997.
[34] Millan, W., Clark, A. and Dawson, E. (1998). Heuristic
Design of Cryptographically Strong Balanced Boolean
Functions. Advances in Cryptology – EUROCRYPT ‟98,
LNCS 1403, 489-499, Springer-Verlag, Berlin Heidelberg.
[35] Dimovski, A., Gligoroski, D. (2003). Generating Highly
NonLinear Boolean Functions Using a Genetic Algorithm. In
Proceedings of the 1st Balkan Conference on Informatics,
November, Thessaloniki, Greece.
EYE BASED CURSOR MOVEMENT USING EEG IN
BRAIN COMPUTER INTERFACE
Tariq S Khan(#), Mudassir Ali(#), Omar Farooq(#), Yusuf U Khan(*)
(#) Department of Electronics Engineering, Zakir Husain College of Engineering & Technology
(*) Department of Electrical Engineering, Zakir Husain College of Engineering & Technology
Aligarh Muslim University, Aligarh
Abstract— The aim of this study is to
detect eye movement (left to right) from
Electroencephalograph (EEG) signal.
Four EEG electrodes in the frontal
area were used. Statistical features
were extracted from the four frontal
channels. These features were then
fed into a classifier based on the linear
discriminator function. The most
prominent features for the classification
of left and right movements were
identified. These features were then
interfaced with computer so that cursor
movement can be controlled. Electrodes
are placed along the scalp following the
10-20 International System of Electrode
Placement. Recorded data was filtered,
windowed and analysed in order to
extract features. Four different classifiers
were used. The best results were
obtained with the support vector
machine (SVM) and linear classifiers,
each of which gave an average
accuracy of 90%.
Keywords: BCI, Eye movement, EEG.
I. INTRODUCTION
A brain-computer interface (BCI)
provides an alternative communication
channel between the human brain and a
computer by using pattern recognition
methods to convert brain waves into control
signals. Patients who suffer from severe
motor impairments (severe cerebral palsy,
head trauma and spinal injuries) may use
such a BCI system as an alternative form of
communication by mental activity [1]. Using
improved measurement devices, computer
power, and software, multidisciplinary
research teams in medicine,
psychophysiology, medical engineering, and
information technology are investigating and
realizing new noninvasive methods to
monitor and even control human physical
functions.
In the bigger picture, there could be devices
that would allow severely disabled people to
function independently. For a quadriplegic,
something as basic as controlling a computer
cursor via mental commands would
represent a revolutionary improvement in
quality of life. With an EEG or implant in
place, the subject would visualize closing his
or her eyes or moving eyes from left to right
and vice versa [2]. The software can learn
eye movement through training, using
repeated trials. Subsequently, the classifier
may be used to instruct the closing/opening
of the eye. A similar method is used to
manipulate a computer cursor, with the
subject thinking about forward, left, right
and back movements of the cursor [3]. With
enough practice, users can gain enough
control over a cursor to draw a circle, access
computer programs and control a television.
It could theoretically be expanded to allow
users to "type" with their thoughts. This can
be achieved by controlling cursor movement
on a computer screen through EEG signals
from brain, specifically, generated due to
eye movement. The signals can be analysed
by different methods.
Traditional analysis methods, such as the
Fourier Transform and autoregressive
modelling are not suitable for non-stationary
signals. Recently, wavelets have been used
in numerous applications for a variety of
purposes in various fields. It is a logical way
to represent and analyse a non-stationary
signal with variable sized region windows
and to provide local information. In the
Fourier Transform (FT), the time
information is lost and in short Term Fourier
Transform (STFT) there is limited time
frequency resolution. Even though basic
filters can be used for decomposition of
desired bands, ideal filters are never realised
in practice, which results in aliasing effects.
However, wavelet analysis enables perfect
decomposition of the desired bands, which
helps us to obtain better features [4].
In this paper different features are used
for training the classifier for eye movement
in left and right directions. A time-frequency
analysis was applied to the EEG signals
from different channels, to determine
combination of features and channels that
yielded the best classification performance.
II. BACKGROUND RESEARCH
EEG waves are created by the firing of
neurons in the brain and were first measured
by Vladimir Pravdich-Neminsky, who
measured the electrical activity in the brains
of dogs in 1912, although the term he used
was "electrocerebrogram". Ten years later
Hans Berger became the first to measure
EEG waves in humans and, in addition to
giving them their modern name, began what
would become intense research in utilizing
these electrical measurements in the fields of
neuroscience and psychology.
The term "Brain-Computer Interface" first
appeared in the scientific literature in the
1970s, though the idea of hooking the mind
up to computers was nothing new [5].
Current systems are "open loop" and respond
to the user's thoughts only. The aim is to
develop "closed loop" systems that can give
feedback to the user as well.
In order to meet the requirements of the
growing expansion of the technology, some
kind of standardization was required, not
only for the guidance of future researchers
but also for the validation and checking of
new developments against other systems.
Thus a general-purpose system called
BCI2000 was developed, which made the
analysis of brain signal recordings easy by
defining output formats and operating
protocols that facilitate researchers in
developing any type of application. This
made it easier to extract specific features of
brain activity and translate them into device
control signals [7].
III. OUR METHODOLOGY
The procedure in this study was first to
acquire EEG data. The stored data were then
pre-processed to remove artifacts.
Subsequently, features were extracted from
the clean EEG and used for classification.
This methodology is shown in Fig. 1.
Fig. 1: Block diagram for feature extraction and device
control of eye movement (Data acquisition → Data
processing → Feature extraction → Classification →
Device/application control)
A. Experimental Setup and Data Acquisition
The subject was seated on a wooden armchair
with the legs rested on a wooden footrest
(wooden items should be used so as to
reduce interference) and the eyes closed. The
subject was instructed to avoid speaking and
body movement in order to ensure a relaxed
body. EEG data were recorded using a Brain
Tech Clarity™ system [10], with the
electrodes positioned according to the
standard 10-20 system, in the Biomedical
Signal Processing lab, AMU Aligarh.
To ensure the same rate of eye movement in
both directions, a ball was shown on the
screen and the subject was asked to visually
follow the ball. The movement of ball was
set to 60 pixels per second. A series of trials
were recorded.
The subject was instructed to open the eyes
slowly and then to follow the movement of
the ball in the program, on a prompt from the
experimenter. Eye movement was recorded
for two different directions, i.e. left to right
and right to left. A block diagram of the
experimental procedure is shown in Fig. 2.
Fig. 2: Sequence followed during experimental recording
(Relax → left-to-right movement → Relax → right-to-left
movement)
B. Data Processing
26 channels of EEG were recorded. Since
mainly the frontal lobe is involved in eye
movement, only the channels associated with
the frontal lobe, i.e. FP1-F3, FP1-F7, FP2-F4
and FP2-F8, were analysed. The signal values
of these channels were extracted in ASCII
form using the BrainTech software. The EEG
of the frontal lobe channels for subject 1 is
illustrated in Fig. 3.
Fig. 3: Plot of channels associated with frontal lobe
The 50 Hz power supply often causes
interference in the EEG recording. Fig. 4
shows a plot of the PSD of the EEG record of
the FP1-F3 channel. To eliminate these
spikes, the signal was passed through an
Infinite Impulse Response (IIR) notch filter
before analysis.
Fig. 4: Power Spectral Density of FP1-F3
A second-order IIR notch filter with a
quality factor (Q factor) of 3.91 was used
to remove the undesired frequency
components.
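Such a notch filter can be sketched in plain Python, assuming the 256 Hz sampling rate implied by the 256-sample, 1 s frames described below. The constrained-pole design here is a standard textbook construction, not necessarily the exact filter used in the study:

```python
import math

def notch_coeffs(f0, fs, q):
    """Second-order IIR notch: zeros on the unit circle at +/- f0, poles
    just inside (radius set by the bandwidth f0/q), unity gain at DC."""
    w0 = 2.0 * math.pi * f0 / fs
    r = 1.0 - (w0 / q) / 2.0                  # pole radius
    c = math.cos(w0)
    g = (1.0 - 2.0 * r * c + r * r) / (2.0 - 2.0 * c)   # normalise DC gain
    return [g, -2.0 * g * c, g], [1.0, -2.0 * r * c, r * r]

def iir_filter(b, a, x):
    """Direct form II transposed difference equation."""
    y, z1, z2 = [], 0.0, 0.0
    for v in x:
        out = b[0] * v + z1
        z1 = b[1] * v - a[1] * out + z2
        z2 = b[2] * v - a[2] * out
        y.append(out)
    return y

FS, F0, Q = 256.0, 50.0, 3.91   # sampling rate assumed; F0, Q from the text
b, a = notch_coeffs(F0, FS, Q)

# A pure 50 Hz "mains" sinusoid should be almost entirely removed.
mains = [math.sin(2.0 * math.pi * F0 * n / FS) for n in range(512)]
filtered = iir_filter(b, a, mains)
```

Because the zeros sit exactly on the unit circle at 50 Hz, the steady-state response to mains interference is zero, while the unity DC gain leaves slow EEG components untouched.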
The signals of the four channels after artifact
removal, stacked over one another, are shown
in Fig. 5.
Fig. 5: Plot of the filtered frontal-lobe channels
EEG is by nature a non-stationary signal, so
it was fragmented into frames over which it
can be assumed stationary. The EEG data
were divided into frames of 1 s duration, i.e.
a frame size of 256 samples.
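The framing step above can be sketched as:

```python
def frame_signal(samples, frame_len=256):
    """Split a channel into consecutive 1 s frames of 256 samples
    (matching the frame size stated in the text); any trailing partial
    frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

frames = frame_signal(list(range(600)))   # 600 samples -> 2 full frames
```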
C. Feature extraction
Feature extraction is the process of
discarding irrelevant information to the
extent possible and representing the relevant
data in a compact and meaningful form. Two
eye movements were recorded: right to left
(RTL) and left to right (LTR). Standard
statistical parameters such as mean, variance,
skewness and cross-correlation were calculated
for all the channels in each movement type.
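These four statistics can be computed per frame as follows. This is a plain-Python sketch; the exact normalisations used in the study are not specified, so population moments and the zero-lag normalised cross-correlation are assumed:

```python
import math

def mean(x):
    return sum(x) / len(x)

def variance(x):
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def skewness(x):
    m, sd = mean(x), math.sqrt(variance(x))
    return sum(((v - m) / sd) ** 3 for v in x) / len(x)

def cross_correlation(x, y):
    """Zero-lag normalised cross-correlation between two channels."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def frame_features(frame, reference):
    """One feature vector per frame: mean, variance, skewness and the
    cross-correlation with a second (reference) channel's frame."""
    return [mean(frame), variance(frame), skewness(frame),
            cross_correlation(frame, reference)]
```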
D. Classification
The following classifiers were used to classify
the two eye movements:
SVM: a non-probabilistic binary linear
classifier.
Linear: fits a multivariate normal density to
each group, with a pooled estimate of
covariance.
Diaglinear: similar to 'linear', but with a
diagonal covariance matrix estimate (a naive
Bayes classifier).
Quadratic: fits multivariate normal densities
with covariance estimates stratified by
group.
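A minimal sketch of the 'diaglinear' variant described above, i.e. per-class means with a single pooled diagonal covariance (the study presumably used a standard toolbox implementation rather than hand-written code):

```python
def fit_diaglinear(X, y):
    """'Diaglinear' discriminant: per-class feature means plus a pooled,
    diagonal covariance estimate shared by all classes."""
    classes = sorted(set(y))
    d = len(X[0])
    mu = {c: [sum(x[j] for x, t in zip(X, y) if t == c) / y.count(c)
              for j in range(d)] for c in classes}
    # Pooled per-feature variances (floored to avoid division by zero).
    var = [max(sum((x[j] - mu[t][j]) ** 2 for x, t in zip(X, y))
               / (len(X) - len(classes)), 1e-9) for j in range(d)]
    return classes, mu, var

def predict(model, x):
    """Assign x to the class with the highest diagonal-Gaussian score."""
    classes, mu, var = model
    score = lambda c: -0.5 * sum((x[j] - mu[c][j]) ** 2 / var[j]
                                 for j in range(len(x)))
    return max(classes, key=score)
```

Training on 15 labelled feature frames per movement and calling `predict` on the 5 held-out frames reproduces the evaluation protocol described in the results section.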
E. Cursor Control
A program was written that controls the
cursor movement according to the instruction
given. The program is calibrated so that
cursor movement is invoked by these
instructions instead of by the mouse: the
classified eye movement is interfaced to the
same instructions as mouse movement and
thereby controls the movement of the cursor
[9].
IV. RESULTS AND DISCUSSIONS
For each frame of EEG, four features were
calculated, namely variance, mean, skewness
and cross-correlation. The separability
provided by each feature was individually
tested, and the best three features were
subsequently used as input to the classifier.
Four classifiers were used in this work; their
results are illustrated in Table 1. For each
movement, LTR and RTL, 20 seconds (20
frames) of data were collected; of these, 15
frames were used for training and the
remaining 5 for testing.
Table 1: Percentage accuracy of classification for eye
movements
Classifier RTL LTR
SVM 80 100
Linear 80 100
Quad 60 40
Diaglinear 80 60
From the observations in Table 1 it can be
seen that the linear and SVM classifiers give
the best results, with high classification
accuracy for both eye movements.
Fig. 6: Plot of the classifier in signal space
A linear classifier separating the two eye
movements is shown in Fig. 6.
Fig. 7: Variance plot of FP2-F4
Fig. 7, which shows the variance for channel
FP2-F4, clearly indicates that the variance of
LTR is greater than that of RTL most of the
time. Variance reflects the concentration of
the probability density function about the
mean.
V. CONCLUSIONS
EEG data were investigated for two eye
movements using a 4-channel setup on three
subjects. Features, including the variance,
were extracted for both movements. A linear
classifier was used to classify between the
two eye movements. These algorithms can
provide high classification accuracy only
after training for a few sessions. In this work
an accuracy of 90% was achieved in
classifying the two movements (RTL and
LTR).
ACKNOWLEDGEMENT
The authors are indebted to the UGC. This work is a part of the funded major research project C.F. No 32-14/2006(SR)
REFERENCES
1. The "10-20 System of Electrode Placement‖ http://faculty.washington.edu/chudler/1020.html
2. Y. U. Khan,(2010) ‘Imagined wrist movement classification in
single trial EEG for brain computer interface using wavelet
packet‘, Int. J. Biomedical Engineering and Technology, Vol. 4, No. 2, pp169-180.
3. Daniel, J. Szafir (2009-10) ‗Non-Invasive BCI through EEG ―An Exploration of the Utilization of electroencephalography to Create
Thought-Based Brain-Computer Interfaces‖.
4. Wolpaw, J.R., Birbaumer, N., McFarland, D.J., Pfurtscheller, G., Vaughan, T.M. (2002): Brain–computer interfaces for communication and control. Clinical Neurophys. pp767–791
5. Y. U. Khan and O. Farooq(2009), ―Autoregressive features based classification for seizure detection using neural network in scalp
Electroencephalogram‖, International Journal of Biomedical
Engineering and Technology, vol.2, no. 4, pp. 370-381.
6. J. Vidal(1973) "Toward Direct Brain–Computer Communication." Annual Review of Biophysics and Bioengineering. Vol. 2, pp. 157-
180
7. Syed M.Siddique, Laraib Hassan Siddique (2009): EEG based Brain computer Interface: Journal of software, vol.4, no.6, pp.550-
555
8. EEG Channels in Detecting Wrist Movement Direction Intention:
Proceedings of the 2004 IEEE Conference on Cybernetics and Intelligent Systems
9. Fabiani, Georg E. et al. (2004). Conversion of EEG activity into cursor movement by a brain-computer interface.
<http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.128.5914>
10. Clarity Braintech system, Standard edition, Software version 3.4, Hardware version 1.4, Clarity Medical Private Limited
AN INTERNET BASED INTELLIGENT TELEDIAGNOSIS SYSTEM FOR ARRHYTHMIA
K.A. Sunitha(1), N. Senthil Kumar(2), K. Prema(3), Sandeep Kotikalapudi(4)
(1),(3) Assistant Professor, Instrumentation and Control Engineering Department, SRM University
(2) Professor, Mepco Schlenk Engineering College, Sivakasi
(4) Student, Instrumentation and Control Engineering Department, SRM University
Abstract—Due to changing trends, there is an
increasing risk of people developing cardiac
disorders. This is the impetus for developing
a system which can diagnose the cardiac
disorder together with the risk level of the
patient, so that effective medication can be
taken in the initial stages. This paper enables
comprehensive diagnosis of the patient
without the doctor being in the same
geographical location, which will prove
advantageous for implementation in villages
where doctors are not easily accessible. In
this paper, the atrial rate, ventricular rate,
QRS width and PR interval are extracted
from the ECG signal, so that the arrhythmia
disorders sinus tachycardia (ST), supra-
ventricular tachycardia (SVT), ventricular
tachycardia (VT), junctional tachycardia
(JT), and ventricular and atrial fibrillation
(VF & AF) are diagnosed with their
respective risk levels. The system thus acts as
a risk analyzer, which tells how prone the
subject is to arrhythmia. LabVIEW
SignalExpress is used to read the ECG, and
for analysis this information is passed to the
fuzzy module. In the fuzzy module, various
"if-then rules" have been framed to identify
the risk level of the patient. The extracted
information is then published from the server
to the client by using an online publishing
tool. After the report developed by the
system is passed to the doctor, he or she can
pass medical advice back to the server, i.e.
the system where the patient's ECG is
extracted and analyzed.
Index Terms–LabVIEW, arrhythmia (sinus
tachycardia (ST), supra-ventricular tachycardia
(SVT), ventricular tachycardia (VT), junctional
tachycardia (JT), ventricular and atrial
fibrillation (VF & AF)), online publishing tool,
QRS width, atrial rate, ventricular rate
I. INTRODUCTION
According to the World Health Organization
(WHO), heart disease and stroke kill around 17 million people a year, which is almost one-third of all deaths globally. By 2020, heart disease and stroke will have become the leading cause of both death and disability worldwide. So it is very clear that proper diagnosis of heart disease is important for patients' survival. The electrocardiogram (ECG) is an important tool for the diagnosis of heart diseases, but it has some drawbacks:
1) Special skill is required to administer and interpret the results of an ECG.
2) The cost of ECG equipment is high.
3) Limited availability of ECG equipment.
Due to these drawbacks, telemedicine contacts were in the past mostly used for consultations between special telemedicine centres in hospitals and clinics. More recently, however, providers have begun to experiment with telemedicine contacts between health care providers and patients at home, to monitor conditions such as chronic diseases [1].
LabVIEW (Laboratory Virtual Instrument Engineering Workbench) is a graphical programming environment suited to high-level or system-level design. A LabVIEW-based telemedicine system has been shown to have the following features:
1) It replaces multiple stand-alone devices at the cost of a single instrument using virtual instrumentation, and its functionality is expandable [2].
2) It facilitates the extraction of valuable diagnostic information using embedded advanced biomedical signal processing algorithms [2].
3) It can be connected to the internet to create an internet-based telemedicine infrastructure, which provides a comfortable way for physicians to communicate with friends, family and colleagues [3].
Several systems have been developed for the acquisition and analysis of ECG using LabVIEW [4]-[8]. Some systems [5], [7], [8] also deal with identifying the cardiac disorder, but they lack identification of the patient's risk level and an online publishing facility.
In this paper, we developed a program not only to access the patient's data but also to diagnose heart abnormalities, which can serve as a reference for the doctor or physician in deciding further procedures. This can be done from anywhere an internet connection is available. In addition, a fuzzy system is developed to identify the risk level of the patient. A fuzzy system is more accurate than a conventional crisp controller because, instead of a condition being only true or false, a partially true case can also be expressed. The risk scores can thus be calculated accurately and exactly for specific records of a person.
II. PROPOSED SYSTEM
Figure 1 shows the proposed fuzzy analyser with the online system.
Fig 1. Proposed system
The ECG waveforms are obtained from the MIT-BIH database. LabVIEW Signal Express is used to read and analyse the ECG and pass the information to the fuzzy module, in which various if-then rules identify the risk level of the patient. The extracted information is then published from the server to the client using online publishing tools. After the information extracted from the ECG signal (atrial rate, ventricular rate, QRS width and PR interval) is passed from the patient's system to the doctor's system, the doctor can return medical advice to the server, i.e. the system where the patient's ECG is extracted and analysed.
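The parameter-extraction step can be sketched in plain Python. The thresholding R-peak detector below is a simplified stand-in for the Signal Express analysis described above; the threshold ratio and refractory period are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def ventricular_rate(ecg, fs, threshold_ratio=0.6, refractory_s=0.25):
    """Estimate ventricular (heart) rate in beats/min from a single-lead ECG.

    Very simplified R-peak detector: local maxima above a fraction of the
    signal maximum, with a refractory period between detected beats.
    """
    threshold = threshold_ratio * np.max(ecg)
    refractory = int(refractory_s * fs)
    peaks, last = [], -refractory
    for i in range(1, len(ecg) - 1):
        if ecg[i] > threshold and ecg[i] >= ecg[i - 1] and ecg[i] > ecg[i + 1]:
            if i - last >= refractory:
                peaks.append(i)
                last = i
    if len(peaks) < 2:
        return 0.0
    rr = np.diff(peaks) / fs        # RR intervals in seconds
    return 60.0 / np.mean(rr)       # beats per minute

# Synthetic check: impulses every 0.8 s at fs = 250 Hz -> 75 bpm
fs = 250
ecg = np.zeros(10 * fs)
ecg[::int(0.8 * fs)] = 1.0          # impulse "R peaks" every 0.8 s
print(round(ventricular_rate(ecg, fs)))   # 75
```

A real detector (e.g. Pan-Tompkins style band-pass and derivative stages) would be needed for noisy clinical signals; this sketch only shows where the ventricular-rate input to the fuzzy module comes from.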
A. Internet based System:
The internet is used as a two-way vehicle to deliver the virtual medical instruments, the medical data and the prescription from the doctor in real time. An internet-based telemedicine system is shown in Fig. 2. This work involves an internet-based telemonitoring system developed as an instance of the general client-server architecture presented in Fig. 2.
The client-server architecture is defined as follows: the client application provides visualization, archiving, transmission, and contact facilities to the remote user (i.e., the patient). The server, which is
located at the physician's end, takes care of the incoming data and organizes patient sessions.
Fig.2 Internet based system
B. LABVIEW
LabVIEW is a graphical programming language developed by National Instruments. Programming with LabVIEW gives a vivid picture of data flow through its graphical block representation. LabVIEW is used here for acquiring the ECG waveform and for analysing parameters such as the PR interval, QRS width and heart rates, which are later passed to the fuzzy system. LabVIEW offers a modular approach and parallel computing, which make it easier to develop complex systems. Debugging tools such as probes and highlight execution are handy for locating where an error actually occurred.
C. Fuzzy system
Fuzzy controllers are widely employed because they work efficiently with vague values. A fuzzy controller has a rule base in if-then fashion, which is used to identify the risk level of a disease from the rule weights. A fuzzy system in its general form is shown in Fig. 3.
Fig 3. Fuzzy system
A. Fuzzification
In this system, the atrial and ventricular heart rates, the QRS complex width and the PR interval are taken as the input linguistic variables, which are passed to the inference engine.
Based on the rule base and linguistic variables, the fuzzy system output is obtained.
B. Defuzzification
The defuzzified values are the risk levels (high, medium and low risk), obtained according to the weights of the fuzzy variables.
C. Relation between input and output variables
The relationship between the input and output variables is shown in the three-dimensional plot of Fig. 4 below.
Fig 4. Relation between input and output
D. Fuzzy Rules
In this fuzzy system, the centre-of-area method is used for defuzzification. The rule base consists of rules in if-then form. The risk levels depend on how many of the conditions for the respective cardiac disorder are met by the input variables. Since there is no single rule for identifying an arrhythmia from heart rate alone (it can differ from patient to patient), this system is more accurate in determining the arrhythmia because it is not based only on heart rate.
The fuzzy rule base acts as a database of rules for selecting the output based on the input quantities. Some of the rules are:
1. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '60,75' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'Medium Risk'
2. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '75,90' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'Medium Risk'
3. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '90,100' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'High Risk'
4. IF 'vHR' IS '150,180' AND 'QRS Width' IS 'Narrow QRS' THEN 'Ventricular Tachycardia at' IS 'Low Risk' ALSO 'Junctional Tachycardia at' IS 'Low Risk' ALSO 'Supra Ventricular Tachy at' IS 'High Risk'
5. IF 'vHR' IS '180,210' AND 'QRS Width' IS 'Normal QRS' THEN 'Ventricular Tachycardia at' IS 'Low Risk' ALSO 'Junctional Tachycardia at' IS 'High Risk' ALSO 'Supra Ventricular Tachy at' IS 'Low Risk'
In this manner, based upon the PR interval, QRS width, and atrial and ventricular heart rates, a fuzzy system is developed to identify the cardiac disorder as well as its level of risk.
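As an illustration of how such if-then rules with min for AND and centre-of-area defuzzification can be evaluated, here is a minimal Python sketch. The membership-function shapes and numeric ranges are invented for illustration; they are not the clinical values of the paper's LabVIEW fuzzy module.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Hypothetical membership functions (illustrative ranges only).
def mu_vhr_fast(v):
    return tri(v, 150.0, 180.0, 210.0)      # ventricular rate "fast", bpm

def mu_qrs_narrow(w):
    return tri(w, 0.04, 0.08, 0.12)         # QRS width "narrow", seconds

# Output universe: a risk score from 0 to 100 with a 'high risk' set.
risk = np.linspace(0.0, 100.0, 1001)
high = tri(risk, 50.0, 100.0, 101.0)

def infer(vhr, qrs_width):
    """One rule (cf. rule 4 above): IF vHR is fast AND QRS is narrow
    THEN SVT risk is High. AND is min; the implication clips the output
    set; centre-of-area gives the crisp risk score."""
    w = min(mu_vhr_fast(vhr), mu_qrs_narrow(qrs_width))
    clipped = np.minimum(high, w)
    if clipped.sum() == 0.0:
        return 0.0
    return float((risk * clipped).sum() / clipped.sum())  # centroid

print(infer(180.0, 0.08))   # rule fully fired: roughly 83 (centroid of 'high')
print(infer(60.0, 0.08))    # rule not fired: 0.0
```

A full system would aggregate several such rules (max over clipped sets) before the single centroid step; the sketch keeps one rule to show the mechanics.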
III. ONLINE PUBLISHING
One of the unique features of this system is its ability to publish the extracted information to the client, usually a doctor's computer, thereby implementing a telediagnosis system. The doctor sees the diagnosis result along with the risk levels and can then send advice back to the patient's system. Since the internet is used for passing the values to the doctor, immediate action can be taken. This caters to the needs of public health-care centres in rural areas where it is difficult to have cardiologists. The system can also be used to assist the doctor in monitoring the patient's heart during surgery.
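The information passed to the doctor's client is a small structured record. A minimal sketch of such a payload is below; the field names are illustrative assumptions, not the format actually produced by the LabVIEW web publishing tool.

```python
import json

def build_report(patient, atrial_rate, ventricular_rate, qrs_width,
                 pr_interval, diagnosis, risk):
    """Bundle the extracted ECG parameters and the fuzzy-system output
    into a JSON document that the patient-side server could publish to
    the doctor's client. Field names are hypothetical."""
    return json.dumps({
        "patient": patient,
        "atrial_rate_bpm": atrial_rate,
        "ventricular_rate_bpm": ventricular_rate,
        "qrs_width_s": qrs_width,
        "pr_interval_s": pr_interval,
        "diagnosis": diagnosis,
        "risk_level": risk,
    })

report = build_report({"name": "A. Patient", "age": 54, "sex": "F"},
                      80, 185, 0.07, 0.16,
                      "Supra-ventricular tachycardia", "High Risk")
print(json.loads(report)["risk_level"])   # High Risk
```

Serialising to a text format like this is what lets the same record be stored in the patient database and rendered on the doctor's front panel.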
IV. RESULTS
This system is able to detect the arrhythmias accurately and also publish the results online.
Fig 5. Block Diagram for extracting
ECG waveform
The block diagram in Fig. 5 performs the function of passing the HR value obtained from Signal Express to the fuzzy system.
Fig 6. Block diagram for calling the fuzzy system in LabVIEW
Figure 6 shows the block diagram of risk-level detection and how the fuzzy system is called into the main panel for diagnosis and risk-level indication.
Fig. 7 shows the front panel developed from the fuzzy system, which is sent to the doctor using the web publishing tool for a second opinion. The system also has a database to save patient details such as name, age, sex and symptoms, which can be used the next time.
Fig 7. Front panel
V. CONCLUSION
In this way, we have developed a fuzzy system, using LabVIEW, that determines cardiac disorders and their risk levels with good accuracy compared to a conventional system, taking the atrial and ventricular heart rates, QRS complex width and PR interval values as the input linguistic variables. The report is successfully sent to the doctor's system using the web publishing tool for a second opinion.
REFERENCES
[1] N. Noury and P. Pilichowski, "A telematic system tool for home health care," in Proc. 14th Annu. Int. Conf. IEEE EMBS, Paris, Oct. 1992, pp. 1175-1177.
[2] Zhenyu Guo and John C. Moulder, "An internet based telemedicine system," IEEE Transactions, 2000.
[3] Volodymyr Hrusha, Olexandr Osolinskiy, Pasquale Daponte and Domenico Grimaldi, "Distributed web-based measurement system," IEEE Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, 2005.
[4] Lina Zhang and Xinhua Jiang, "Acquisition and analysis system of the ECG signal based on LabVIEW."
[5] Kevin P. Cohen, Willis J. Tompkins, Adrianus Djohan, John G. Webster and Yu H. Hu, "QRS detection using a fuzzy neural network."
[6] "Classification of ECG arrhythmias using type-2 fuzzy clustering neural network."
[7] "Robust techniques for remote real-time arrhythmias classification system."
[8] S. Zarei Mahmoodabadi, A. Ahmadian, M. D. Abolhassani, J. Alireazie and P. Babyn, "ECG arrhythmia detection using fuzzy classifiers."
[9] E. Chowdhury and L. C. Ludeman, "Discrimination of cardiac arrhythmias using a fuzzy rule-based method."
[10] W. Zong and D. Jiang, "Automated ECG rhythm analysis using fuzzy reasoning."
[11] Jodie Usher, Duncan Campbell, Jitu Vohra and Jim Cameron, "Fuzzy classification of intra-cardiac arrhythmias."
Projected View & Novel Application of Context Based Image Retrieval Techniques
Shivam Agrawal#, Rajeev Singh Chauhan*, Vivek Vyas**
#B.Tech Student, CS Department, *B.Tech Student, CS Department, **M.Tech Student, ECE Department,
Arya College of Engineering and I.T., Kukas, Jaipur,
Rajasthan Technical University, Kota #[email protected], *[email protected] and **[email protected]
Abstract— Image searching has been one of the fascinating topics of advanced research since the 1990s. Rapid advancement in computer and network technologies, coupled with relatively cheap high-volume data storage, has brought tremendous growth in the amount of digital images, and the development of pattern recognition has grown with it. Pattern recognition is the act of taking in raw data and classifying it into predefined categories using statistical and empirical methods. Content-based image retrieval (CBIR) is one of the widely used applications of pattern recognition for finding images in vast, un-annotated image databases. In CBIR, images are indexed on the basis of low-level features, such as color, texture, and shape, which can be derived automatically from the visual content of the images. This paper discusses the techniques and algorithms used to extract these image features and the advances that can be built on CBIR. Various similarity measures are used to identify closely associated patterns: these methods compute the distance between the features generated for different patterns, and the closest patterns are returned as the result. The paper unfolds a novel application using context-based image retrieval to search for the detailed description of an image without knowing a single word about it, and proposes algorithms to create such a utility.
Keywords: Context Based Image Retrieval, Image Searching.
INTRODUCTION
The initial techniques used were based on textual annotation of the images. Using text descriptions, images can be organized in topical or semantic hierarchies to facilitate easy navigation and browsing based on standard Boolean queries. Content-based image retrieval is one of the major approaches to image retrieval and has drawn significant attention in the past decade; it uses visual content to search images in large-scale image databases according to users' interests. Low-level image features such as color, texture, shape and structure are extracted from the images, and relevant images are retrieved based on the similarity of these features. Examples of prominent systems are QBIC, Photobook, and NETRA. In this paper we discuss the different algorithms used to extract the different features of an image, the future advancement of context-based image retrieval techniques and how they can be beneficial in different fields, and futuristic approaches to attain this technique in a more advanced way.
1. Image Retrieval
A recent study of the literature on image indexing and retrieval was conducted based on 100 papers from Web of Science. Two major research approaches, text-based (description-based) and content-based, were identified: researchers in the information science community focus on the text-based approach, while researchers in computer science focus on the content-based approach. Text-based image retrieval (TBIR) makes use of text descriptors to retrieve relevant images. Some recent studies found that text descriptors such as time, location, events, objects,
formats, aboutness of image content, and topical terms are most helpful to users. The advantage of this approach was that it enabled widely approved text information retrieval systems to be used for visual retrieval.

1.1. Content-based image retrieval
In CBIR, images are indexed by features derived directly from the images themselves. These features are always consistent with the image, and they are extracted and analyzed automatically by computer processing instead of manual annotation. Due to the difficulty of automatic object recognition, the information extracted from images in CBIR is rather low level: colors, textures, shapes, structure, and combinations of the above. A number of representative generic CBIR systems have been developed over the last ten years, implemented in different environments: some are Web based while others are GUI-based applications. QBIC, Photobook, and NETRA are the most prominent examples. QBIC, developed at the IBM Almaden Research Centre [1, 2, 3], was the first commercial CBIR application and plays an important role in the evolution of CBIR systems. QBIC supports the low-level image features of average color, color histogram, color layout, texture and shape. Additionally, users can provide pictures or draw sketches as example images in a query, and visual queries can be combined with textual keyword predicates. Photobook [4], developed at the MIT Media Lab, is a set of interactive tools for searching and querying image databases based on image content. It works by comparing features associated with images, not the images themselves; these features are the parameter values of particular models fitted to each image, commonly color, texture, and shape, though Photobook will work with features from any model. It is divided into three specialized systems, namely Appearance Photobook (face images), Texture Photobook, and Shape Photobook, which can also be used in combination. Features are compared using one of a library of matching algorithms that Photobook provides.
These include Euclidean, Mahalanobis, divergence, vector-space angle, histogram, Fourier peak, and wavelet tree distances, as well as any linear combination of these. NETRA is a prototype image retrieval system developed at the University of California, Santa
Barbara (UCSB) [5, 6]. NETRA supports features of color, texture, shape, and spatial information of segmented image regions for region-based search. Images are segmented into homogeneous regions and, using the region as the basic unit, users can submit queries based on features that combine regions of multiple images. For example, a user may compose a query such as: retrieve all images that contain regions having the color of a region of image A, the texture of a region of image B, and the shape of a region of image C.

1.1.1 Image features

One of the main foci of CBIR is the means of extracting features from images and evaluating the similarity between the features. Image features refer to the characteristics which describe the contents of an image; in this paper they are confined to visual features derived directly from the image. There have been extensive studies of various sorts of visual feature. The simplest form is based directly on the pixel values of the image; however, such features are very sensitive to noise and to brightness, hue and saturation changes, and are not invariant to spatial transformations such as translation and rotation. As a result, CBIR systems based on raw pixel values do not generally give satisfactory results. Much of the research in this area has therefore emphasised computing useful characteristics from images using image processing and computer vision techniques. General-purpose features in CBIR have usually included text, color, texture, shape and layout.

Color representations

The color histogram is the standard representation of the color feature in CBIR systems, initially investigated by Swain and Ballard. Histograms of intensity values are used to represent the color distribution. This captures the global chromatic information of an image and is invariant under translation and rotation about the view axis.
Despite changes in view, changes in scale, and occlusion, the histogram changes only slightly. A color histogram H(M) of image M is a 1-D discrete function representing the probabilities of occurrence of colors in the image, typically defined as:

H(M) = [h_1, h_2, ..., h_n],  h_k = n_k / N,  k = 1, 2, 3, ..., n  [Equation 1]

where N is the number of pixels in image M and n_k is the number of pixels with color value k. The division normalizes the histogram such that:

Σ_{k=1}^{n} h_k = 1.0  [Equation 2]
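A normalized histogram as in Equations 1-2 is a few lines of NumPy; the sketch below assumes the image pixels are already quantized to integer color indices.

```python
import numpy as np

def color_histogram(img, n_colors):
    """Normalised colour histogram per Equation 1: h_k = n_k / N, where
    n_k counts the pixels whose colour index is k and N is the total
    number of pixels, so the entries sum to 1 (Equation 2)."""
    counts = np.bincount(img.ravel(), minlength=n_colors)
    return counts / img.size

# Toy 2x2 "image" whose pixels are colour indices in {0, 1, 2, 3}
img = np.array([[0, 1], [1, 3]])
h = color_histogram(img, n_colors=4)
print(h.tolist())   # [0.25, 0.5, 0.0, 0.25]
print(h.sum())      # 1.0
```

Two such histograms can then be compared with any of the distance measures mentioned earlier (Euclidean, histogram intersection, etc.).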
Texture representations
Many texture features have been investigated in the past, including the conventional pyramid-structured wavelet transform (PWT) features, tree-structured wavelet transform (TWT) features, the multi-resolution simultaneous autoregressive model (MR-SAR) features and the Gabor wavelet features. Experiments have found that the Gabor features [7, 8] produce the best performance. The computation of the Gabor features is as follows. A two-dimensional Gabor function can be formulated as:

G(x, y) = (1 / (2π σ_x σ_y)) exp[ -(1/2)(x²/σ_x² + y²/σ_y²) + 2πjWx ]  [Equation 3]

A self-similar filter dictionary can be obtained from the mother Gabor wavelet G(x, y) of Equation 3 by appropriate dilations and rotations:

G_mn(x, y) = a^(-m) G(x', y'),  a > 1; m, n integers

x' = a^(-m) [ (x - hside) cos(nπ/K) + (y - wside) sin(nπ/K) ]
y' = a^(-m) [ -(x - hside) sin(nπ/K) + (y - wside) cos(nπ/K) ]

where h is the height of the image, w its width, hside = (h-1)/2 and wside = (w-1)/2. Given an image with luminance I(x, y), the Gabor decomposition is obtained by convolving the luminance with the Gabor wavelet and taking the magnitude:

W_mn(x, y) = | I(x, y) * G_mn(x, y) |  [Equation 4]

The mean and standard deviation of the magnitude of the transform coefficients are used to represent the texture feature for classification and retrieval purposes:

μ_mn = mean of |W_mn|  [Equation 5]
σ_mn = standard deviation of |W_mn|  [Equation 6]

The Gabor feature vector is constructed using μ_mn and σ_mn as feature components:

f = [ μ_00, σ_00, μ_01, σ_01, ..., μ_(S-1)(K-1), σ_(S-1)(K-1) ]

where S is the number of scales and K is the number of orientations.

Shape representations
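A compact NumPy sketch of the Gabor feature pipeline (Equations 3-6) is below. It is a simplification of the scheme above: the rotation is written directly in kernel coordinates rather than the (hside, wside)-centred image coordinates, scale is handled by widening sigma rather than by the a^(-m) dilation, filtering is done in the frequency domain for brevity, and all numeric parameters (kernel size, sigma, W) are illustrative assumptions.

```python
import numpy as np

def gabor_kernel(size, sigma_x, sigma_y, W, theta):
    """Complex Gabor kernel per Equation 3, rotated by theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (1.0 / (2.0 * np.pi * sigma_x * sigma_y)) * np.exp(
        -0.5 * (xr**2 / sigma_x**2 + yr**2 / sigma_y**2)
        + 2j * np.pi * W * xr)

def gabor_features(img, scales, orientations):
    """mu_mn / sigma_mn of the filtered magnitude (Equations 5-6),
    concatenated into the feature vector f."""
    feats = []
    for m in range(scales):
        for n in range(orientations):
            k = gabor_kernel(15, 2.0 * (m + 1), 2.0 * (m + 1),
                             0.2 / (m + 1), n * np.pi / orientations)
            K = np.fft.fft2(k, s=img.shape)           # zero-padded kernel
            mag = np.abs(np.fft.ifft2(np.fft.fft2(img) * K))  # Equation 4
            feats += [mag.mean(), mag.std()]          # Equations 5, 6
    return np.array(feats)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
f = gabor_features(img, scales=2, orientations=4)
print(f.shape)      # (16,): 2 scales x 4 orientations x (mu, sigma)
```

The resulting vector f is what a CBIR system stores per image and compares with a distance measure at query time.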
APPLICATION BASED ON CONTEXT BASED IMAGE RETRIEVAL AND WORKING PROCEDURE
One future advancement of CBIR is to develop a platform on which a user uploads an image; the query processor computes the distance between this image and the images of the database and, according to their closeness, shows the related results. Suppose I am a newcomer to Egypt walking the streets of Cairo. I see a monument and am eager to know about it, so I capture an image of it and upload it through an application on my mobile. The application processes the query image and shows as output detailed information about that monument. A desktop or mobile application can be created for this purpose. There are many GPL and closed-licence projects on image retrieval; TinEye and GazoPa are among the most famous and effective image-search websites, each using different feature-extraction algorithms for content-based image retrieval. However, the search results provided by these websites are limited to other images: if we upload an image of some celebrity, we get other similar images of that celebrity but nothing about the person. Here we give the concept of an application which works as a combination of TinEye and Wikipedia. To achieve this, we design our web crawlers so that whenever they index an image into the database they also index the data related to that image, using the meta tags and keywords obtained from algorithms applied to the page. A page may contain many words alongside a single image, so how can we identify which words describe that image? We follow the procedure below:
(A) First, filter out all non-informative words (prepositions, articles, etc.) from the whole text, then assign priorities to the remaining words:
(I) words in the metadata receive higher priority than other words on the page;
(II) words in the top 3 or 4 lines (after filtering) receive higher priority;
(III) words frequently repeated on the page receive higher priority;
(IV) words in bold letters receive higher priority.
(B) We now have an image and the top-priority words from each page.
(C) A user uploads an image to search for related images and their description.
(D) Content-based image searching is performed to find the related images.
(E) After searching, the stored words are collected along with the related images of the query image.
(F) One more filtering step finds the exact keyword for the image: the frequency of each word is calculated across the different results.
(G) The word with the highest frequency is assigned the top priority.
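The keyword-priority step (A)(I)-(IV) can be sketched as a simple scoring function. The stopword list and the boost weights below are illustrative assumptions, not values proposed by the paper.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "on", "at", "to", "and", "is",
             "for", "with", "by", "this", "that"}   # illustrative subset

def keyword_priority(page_text, meta_words=(), bold_words=(), top_lines=3,
                     w_meta=4.0, w_top=2.0, w_bold=1.5):
    """Score candidate keywords for an image's page following steps
    (A)(I)-(IV): filter stopwords, then boost words appearing in the
    metadata, in the first few lines, or in bold; repetition counts
    via raw frequency (III)."""
    words = [w for w in re.findall(r"[a-z]+", page_text.lower())
             if w not in STOPWORDS]
    scores = Counter(words)                           # (III) frequency
    lines = page_text.lower().splitlines()
    top = set(re.findall(r"[a-z]+", " ".join(lines[:top_lines])))
    for w in scores:
        if w in (x.lower() for x in meta_words):      # (I) metadata
            scores[w] *= w_meta
        if w in top:                                  # (II) top lines
            scores[w] *= w_top
        if w in (x.lower() for x in bold_words):      # (IV) bold text
            scores[w] *= w_bold
    return scores.most_common()

page = ("The Sphinx of Giza\nA monument in Egypt.\n"
        "The Sphinx is carved from limestone.")
print(keyword_priority(page, meta_words=["sphinx"],
                       bold_words=["giza"])[0][0])    # sphinx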
(H) That word is looked up on Wikipedia, and the resulting description is shown along with the query image.
CONCLUSION & FUTURE WORK
There are many methods for feature extraction in context-based image retrieval, and additional comparison algorithms can be applied for more exact search results. Here we have discussed color, texture and shape representations in context-based image retrieval, and outlined an application of CBIR that can play a vital role for the current generation. There is much room for future advancement: many applications can be built that play a vital role in different fields, and some visual abilities are still absent from current CBIR, leaving scope for work on perceptual organization, similarity between semantic concepts, and the like.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the ARYA Development and Research Center, ACEIT, Jaipur.
REFERENCES
[1] M. Flickner, H. Sawhney and W. Niblack, "Query by image and video content: the QBIC system," IEEE Computer, September 1995.
[2] J. Hafner, H.S. Sawhney, W. Equitz, M. Flickner and W. Niblack, "Efficient color histogram indexing for quadratic form distance functions," IEEE Transactions on Pattern Analysis and Machine Intelligence 17(7) (1995) 729-736.
[3] W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, C. Faloutsos and G. Taubin, "The QBIC project: querying images by content using colors, texture and shape," in W. Niblack (ed.), SPIE Proceedings Vol. 1908, Storage and Retrieval for Image and Video Databases, 2-3 February 1993, San Jose, California (SPIE, San Jose, 1993) 3173-87.
[4] A. Pentland, R. Picard and S. Sclaroff, "Photobook: content-based manipulation of image databases," Storage and Retrieval for Image and Video Databases II, No. 2185, San Jose, CA, February 1994.
[5] W.Y. Ma and B.S. Manjunath, "NeTra: a toolbox for navigating large image databases," Multimedia Systems 7(3) (1999) 184-198.
[6] B.S. Manjunath and W.Y. Ma, "Texture features for browsing and retrieval of image data," IEEE Transactions on Pattern Analysis and Machine Intelligence 18(8) (1996) 837-842.
[7] C.C. Chen and C.C. Chen, "Filtering methods for texture discrimination," Pattern Recognition Letters 20(8) (1999) 783-790.
[8] B.S. Manjunath and W.Y. Ma, "Texture features for browsing and retrieval of image data," IEEE Transactions on Pattern Analysis and Machine Intelligence 18(8) (1996) 837-842.
Recursive Algorithm and Systolic Architecture for the Discrete Sine Transform

M.N. Murty, Department of Physics, NIST, Berhampur-761008, Orissa, India
S.S. Nayak, Department of Physics, JITM, Paralakhemundi, Orissa, India
Satyabrata Das, Department of Electronics & Communication, NIST, Berhampur-761008, Orissa, India
B. Padhy, Department of Physics, Khallikote (Auto) College, Berhampur-760001, Orissa, India
S.N. Panda, Department of Physics, Gunupur College, Gunupur, Orissa, India
Abstract - In this paper, a novel recursive algorithm and a systolic architecture for realising the discrete sine transform (DST) are presented. By using some mathematical techniques, a DST of any general length can be converted into a recursive equation. The recursive algorithm applies to sequences of arbitrary length and is appropriate for VLSI implementation.
Keywords - discrete sine transform;
discrete cosine transform; recursive;
systolic architecture
I. INTRODUCTION
The discrete sine transform (DST) was first introduced to signal processing by Jain [1], and several versions of this original DST were later developed by Kekre et al. [2], Jain [3] and Wang et al. [4]. There exist four even DSTs and four odd DSTs, according to whether they are an even or an odd transform [5]. Ever since the introduction of its first version, the different DSTs have found wide application in several areas of digital signal processing (DSP), such as image processing [1,6,7], adaptive digital filtering [8] and interpolation [9]. The performance of the DST is comparable to that of the discrete cosine transform (DCT), and it may therefore be considered a viable alternative to the DCT. Yip and Rao [10] have proven that for large sequence lengths (N ≥ 32) and a low correlation coefficient (ρ < 0.6), the DST performs even better than the DCT.
In this paper, a novel algorithm to
convert DST into a recursive form and a
systolic architecture for parallel computation
of DST are presented. The advantage of this
algorithm is its regular structure and
parallelism, which makes it suitable for
implementation using VLSI techniques.
The rest of the paper is organised as
follows. The recursive algorithm for DST is
presented in Section-II. The comparison of
our results with other research works is
presented in Section-III. The systolic
architecture for computation of DST is
presented in Section-IV. Finally, we
conclude our paper in Section-V.
II. THE PROPOSED RECURSIVE ALGORITHM
FOR DST
The DST of a sequence {x(n), n = 1, 2, 3, ..., N} can be written as

X(k) = Σ_{n=1}^{N} x(n) sin[(2n - 1)kπ / (2N)],  for k = 1, 2, 3, ..., N.   (1)

Let z = kπ/N. Then

X(k) = Σ_{n=1}^{N} x(n) [ sin(nz) cos(z/2) - cos(nz) sin(z/2) ].   (2)

A time-recursive kernel V_m for the DST is introduced as given below:

V_m sin z = Σ_{n=m}^{N} x(n) sin[(n - m + 1)z].   (3)

Splitting off the n = m term and using the identity sin[(n - m + 1)z] = 2 sin[(n - m)z] cos z - sin[(n - m - 1)z],

V_m sin z = x(m) sin z + Σ_{n=m+1}^{N} x(n) sin[(n - m + 1)z]
          = x(m) sin z + 2 cos z Σ_{n=m+1}^{N} x(n) sin[(n - m)z] - Σ_{n=m+1}^{N} x(n) sin[(n - m - 1)z]
          = x(m) sin z + 2 cos z · V_{m+1} sin z - V_{m+2} sin z.

Hence,

V_m = x(m) + 2 cos z · V_{m+1} - V_{m+2}   (4)

for m = 1, 2, ..., N (evaluated in decreasing order of m), with V_m = 0 for m > N.

The time-recursive transfer function of X(k) is obtained by multiplying (2) by sin z. Since sin(nz - z/2) sin z = sin(z/2) [ sin(nz) + sin((n - 1)z) ],

X(k) sin z = sin(z/2) Σ_{n=1}^{N} x(n) sin(nz) + sin(z/2) Σ_{n=1}^{N} x(n) sin[(n - 1)z].

Using (3), the two sums are V_1 sin z and V_2 sin z respectively (the n = 1 term of the second sum vanishes), so

X(k) = sin(kπ / (2N)) · (V_1 + V_2).   (5)
Equations (4) and (5) show that no
complex multiplication is required during
the recursive computation. Equation (5) is a
discrete time recursive transfer function of
finite duration input sequence, x(n), n = N,
N-1, …,2,1. As a consequence, X(k) is
obtained as the output of a finite impulse
response system. Fig. 1 shows the recursive
structure with the input sequence in reverse
order for the realisation of X(k).
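The recurrence (4) and the closing relation (5) can be checked numerically against direct evaluation of (1). The sketch below is a plain Python/NumPy model, not the paper's hardware realisation; it agrees with the direct sum to machine precision.

```python
import numpy as np

def dst_recursive(x):
    """DST of Equation (1) via the recurrence (4) and relation (5):
    V_m = x(m) + 2 cos z * V_{m+1} - V_{m+2}, evaluated for m = N..1,
    then X(k) = sin(z/2) * (V_1 + V_2), with z = k*pi/N."""
    N = len(x)
    X = np.empty(N)
    for k in range(1, N + 1):
        z = k * np.pi / N
        v1, v2 = 0.0, 0.0                  # V_{m+1}, V_{m+2}, zero for m > N
        for m in range(N, 0, -1):          # m = N, N-1, ..., 1
            v1, v2 = x[m - 1] + 2.0 * np.cos(z) * v1 - v2, v1
        X[k - 1] = np.sin(z / 2.0) * (v1 + v2)   # v1 = V_1, v2 = V_2
    return X

def dst_direct(x):
    """Direct evaluation of Equation (1), for reference."""
    N = len(x)
    n = np.arange(1, N + 1)
    return np.array([np.sum(x * np.sin((2 * n - 1) * k * np.pi / (2 * N)))
                     for k in range(1, N + 1)])

x = np.random.default_rng(1).random(8)
print(np.allclose(dst_recursive(x), dst_direct(x)))   # True
```

Note that each recursion step uses only one multiplication by the real constant 2 cos z, matching the claim that no complex multiplication is required.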
Figure 1. Recursive structure for computing the DST
III. COMPARISONS WITH RELATED WORKS
The proposed approach requires N
multiplications per point, and (2N-2)
additions per point for the realisation of N
length DST.
In Tables I and II, the number of
multipliers and the number of adders in the
proposed algorithm are compared with the
corresponding parameters based on the other
methods.
Table III gives the comparison of the
computation complexities of the proposed
algorithm with other algorithms found in the
related research works.
TABLE I
COMPARISON OF THE NUMBER OF MULTIPLIERS REQUIRED BY DIFFERENT ALGORITHMS
N [11] [13] [17] [19,20,23] [21] [12] [26] [22] Proposed
4 6 5 5 4 11 2 5 4 4
8 16 13 13 12 19 8 13 8 8
16 44 35 33 32 36 30 29 16 16
32 116 91 81 80 68 54 61 32 32
64 292 227 193 192 132 130 125 64 64
TABLE II
COMPARISON OF THE NUMBER OF ADDERS REQUIRED BY DIFFERENT ALGORITHMS
N [17] [13] [19,20,23] [11] [12] [21] [26] [22] Proposed
4 9 9 9 8 4 11 14 7 6
8 35 29 29 26 22 26 26 15 14
16 95 83 81 74 62 58 50 31 30
32 251 219 209 194 166 122 98 63 62
64 615 547 513 482 422 250 194 127 126
[Artwork for Figure 1: input sequence x(1), ..., x(N) applied in reverse order; delay elements Z^-1; feedback multiplier 2 cos(kπ/N); output multiplier sin(kπ/(2N)); output X(k).]
TABLE III
COMPUTATION COMPLEXITIES
Algorithm    No. of multiplications    No. of additions
Proposed algorithm N 2N-2
[13] (3/4) N log2N - N + 3 (7/4) N log2N - 2N + 3
[14,16,20,23] (1/2) N log2N (3/2) N log2N - N + 1
[15,24,25] N log2N /2 + 1 3 N log2 N / 2 -N +1
[18] (1/2) N log2N + (1/4) N-1 (3/2) N log2N + (1/2) N-2
[21] 2(N+3)(N-1) / N 2(2N-1)(N-1) / N
[22] (N+1)(N-1) / N (2N+1)(N-1) / N
[26] if N is even 2N-3 3N+2
[26] if N is odd 2(N-1) 3N+4
IV. SYSTOLIC ARCHITECTURE
The structure of the proposed linear systolic array for computation of an N-point DST is shown in Fig. 2. It consists of (N+1) locally connected processing elements (PEs), of which the first N PEs are identical. The recurrence relation given by (3) is implemented in the first N PEs, while the last PE computes the DST components. The function of each of the first N PEs is shown in Fig. 3, and that of the last PE in Fig. 4. One sample of the input data is fed to each PE, staggered by one time-step with respect to the input of the previous PE, in reverse order; i.e., the i-th input sample is fed to the (N+1-i)-th PE in the (N+1-i)-th time-step. The first output is obtained after (N+1) time-steps, and the remaining (N-1) outputs are obtained in the subsequent (N-1) time-steps. Successive sets of N-point DSTs are thus obtained every N time-steps. Each of the first N PEs comprises one multiplier and two adders, while the last PE contains one adder and one multiplier. The duration of the cycle period is T = TM + 2TA, where TM and TA are, respectively, the times taken to perform one multiplication and one addition in a PE. This architecture requires N multiplications per point and (2N-2) additions per point for the realisation of an N-point DST. The hardware- and time-complexities of the proposed systolic realisation, along with those of the existing structures [27]-[31], are listed in Table IV.
[Figure 2: The linear systolic array for the N-point DST. The input samples x(n), x(n-1), … and the constant 2cosz enter the 1st through N-th PEs; their outputs V1 and V2 feed the (N+1)-th PE [S], which delivers the output. Here 2cosz = 2cos(2πk/N), with k -> k+1 in each time-step.]
Figure 3. Function of each of the first N PEs of the linear array. Each PE receives x_in, a_in, b_in and c_in and produces

    a_out = a_in
    b_out = x_in + a_in * b_in - c_in
    c_out = b_in

where x_in is the input sample.
Figure 4. Function of the (N+1)-th PE of the linear array:

    V_out = (V1_in + V2_in) * S,   where S = sin(2πk/N),

with k = 1 for the first (N+1) time-steps, then k -> k+1 in each time-step.
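The PE update of Fig. 3 is a Goertzel-style second-order recurrence: with a_in held at 2cos(θ), iterating b_t = x_t + 2cos(θ)·b_{t-1} - b_{t-2} accumulates sine-kernel inner products. The sketch below illustrates only this recurrence with an assumed kernel Σ x(n) sin(nθ); the paper's exact kernel and output stage follow its equation (3) and Fig. 4, which are not fully reproduced in this excerpt:

```python
import math

def sine_component(x, theta):
    """Accumulate sum_{n=1..N} x[n-1]*sin(n*theta) with the PE recurrence
    b_t = x_t + 2*cos(theta)*b_{t-1} - b_{t-2} (Fig. 3), feeding the
    samples in reverse order, as the array description requires."""
    a = 2.0 * math.cos(theta)     # the constant a_in broadcast to every PE
    b_prev, b_prev2 = 0.0, 0.0    # b_{t-1} and b_{t-2}
    for xt in reversed(x):        # reverse-order feeding
        b = xt + a * b_prev - b_prev2
        b_prev2, b_prev = b_prev, b
    return math.sin(theta) * b_prev  # final sine scaling, as in the last PE

# Check against the direct sine sum.
x = [0.3, -1.2, 0.7, 2.0, -0.5, 0.9, 1.1, -0.4]
theta = 2 * math.pi * 3 / len(x)
direct = sum(xn * math.sin((n + 1) * theta) for n, xn in enumerate(x))
assert abs(sine_component(x, theta) - direct) < 1e-9
```

The last PE of the array combines two accumulator taps as (V1 + V2)·S; the sketch scales only the final tap, so it demonstrates the recurrence rather than the exact output stage.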
TABLE IV
HARDWARE- AND TIME-COMPLEXITIES OF THE PROPOSED STRUCTURE AND THE EXISTING SYSTOLIC STRUCTURES FOR THE DST / DCT

Structure            Multipliers   Adders    Cycle-Time (T)   Average Computation Time
Pan and Park [27]    N             2N        TM + TA          NT/2
Fang and Wu [28]     N/2 + 3       N + 3     TM + 2TA         NT
Chiper et al. [29]   N - 1         N + 1     TM + TA          (N-1)T/2
Meher [30]           N/2 - 1       N/2 + 9   2(TM + TA)       (N/4 - 1)T
Meher [31]           N/2 + 3       N/2 + 5   TM + TA          (N/2 - 1)T
Proposed             N             2N - 2    TM + 2TA         (N+1)T
V. CONCLUSION
In this paper, we have proposed a recursive algorithm that is well suited for parallel computation of the DST. It involves significantly fewer multipliers and adders than the existing structures. The proposed systolic architecture is parallel, simple and regular, which makes it suitable for VLSI implementation.
REFERENCES
[1] A.K. Jain, “A fast Karhunen-Loeve
transform for a class of random
processes,” IEEE Trans. Commun., vol.
COM-24, pp 1023-1029, September
1976.
[2] H.B. Kekre and J.K. Solanka,
“Comparative performance of various
trigonometric unitary transforms for
transform image coding,” Int. J.
Electron., vol. 44, pp 305-315, 1978.
[3] A.K. Jain, “A sinusoidal family of
unitary transforms,” IEEE Trans. Patt.
Anal. Machine Intell., vol. PAMI-I, pp
356-365, September 1979.
[4] Z. Wang and B. Hunt, “The discrete W
transform,” Applied Math Computat.,
vol. 16, pp 19-48, January 1985.
[5] S. Poornachandra, V. Ravichandran and
N.Kumarvel, “Mapping of discrete
cosine transform (DCT) and discrete
sine transform (DST) based on
symmetries” IETE Journal of Research,
Vol. 49, no. 1, pp 35-42, January-
February 2003.
[6] S. Cheng, “Application of the sine
transform method in time of flight
positron emission image reconstruction
algorithms,” IEEE Trans. BIOMED.
Eng., vol. BME-32, pp 185-192, March
1985.
[7] K. Rose, A. Heiman and I. Dinstein,
“DCT/DST alternate transform image
coding," Proc. GLOBECOM '87, vol. I,
pp. 426-430, November 1987.
[8] J.L. Wang and Z.Q. Ding, “Discrete
sine transform domain LMS adaptive
filtering,” Proc. Int. Conf. Acoust.,
Speech, Signal Processing, pp 260-263,
1985.
[9] Z. Wang and L. Wang, “Interpolation
using the fast discrete sine transform,”
Signal Processing, vol. 26, pp 131-137,
1992.
[10] P. Yip and K.R. Rao, “On the
computation and the effectiveness of
discrete sine transform”, Comput.
Electron., vol. 7, pp. 45-55, 1980.
[11] W.H. Chen, C.H. Smith and S.C.
Fralick, “A fast computational
algorithm for the discrete cosine
transform”, IEEE Trans.
Communicat., vol. COM-25, no. 9,
pp. 1004-1009, Sep. 1977.
[12] P. Yip and K.R. Rao, “A fast
computational algorithm for the
discrete sine transform”, IEEE Trans.
Commun., vol. COM-28, pp. 304-
307, Feb. 1980.
[13] Z. Wang, “Fast algorithms for the
discrete W transform and for the
discrete Fourier transform”, IEEE
Trans. Acoust., Speech, Signal
Processing, vol. ASSP-32, pp. 803-
816, Aug. 1984.
[14] P. Yip and K.R. Rao, “Fast
decimation-in-time algorithms for a
family of discrete sine and cosine
transforms”, Circuits, Syst., Signal
Processing, vol. 3, pp. 387-408,
1984.
[15] H.S. Hou, “A fast recursive
algorithm for computing the discrete
cosine transform”, IEEE Trans.
Acoust., Speech, Signal Processing,
vol. ASSP-35, no. 10, pp. 1455-
1461, Oct. 1987.
[16] O. Ersoy and N.C. Hu, “A unified
approach to the fast computation of
all discrete trigonometric
transforms,” in Proc. IEEE Int. Conf.
Acoust., Speech, Signal Processing,
pp. 1843-1846, 1987.
[17] H.S. Malvar, “Corrections to fast
computation of the discrete cosine
transform and the discrete hartley
transform,” IEEE Trans. Acoust.,
Speech, Signal Processing, vol. 36,
no. 4, pp. 610-612, Apr. 1988.
[18] P. Yip and K.R. Rao, “The
decimation-in-frequency algorithms
for a family of discrete sine and
cosine transforms”, Circuits, Syst.,
Signal Processing, vol. 7, no. 1, pp.
3-19, 1988.
[19] A. Gupta and K.R. Rao, “A fast
recursive algorithm for the discrete
sine transform” IEEE Transactions
on Acoustics, Speech and Signal
Processing, vol. 38, no. 3, pp. 553-
557, March, 1990.
[20] Z. Cvetković and M.V. Popović,
“New fast recursive algorithms for
the computation of discrete cosine
and sine transforms”, IEEE Trans.
Signal Processing, vol. 40, no. 8, pp.
2083-2086, Aug. 1992.
[21] J. Canaris, "A VLSI architecture for
the real time computation of discrete
trigonometric transform”, J. VLSI
Signal Process., no. 5, pp. 95-104,
1993.
[22] L.P. Chau and W.C. Siu, “Recursive
algorithm for the discrete cosine
transform with general lengths”,
Electronics Letters, vol. 30, no. 3,
Feb. 1994.
[23] Peizong Lee and Fang-Yu Huang,
“Restructured recursive DCT and
DST algorithms”, IEEE Transactions
on Signal Processing,” vol. 42, no.
7, pp. 1600-1609, July 1994.
[24] V. Britanak, “On the discrete cosine
computation”, Signal Process., vol.
40, no. 2-3, pp. 183-194, 1994.
[25] C.W. Kok, “Fast algorithm for
computing discrete cosine
transform”, IEEE Trans. Signal
Process., vol. 45, pp. 757-760, Mar.
1997.
[26] V. Kober, “Fast recursive algorithm
for sliding discrete sine transform”,
Electronics Letters, vol. 38, no. 25,
pp. 1747-1748, Dec. 2002.
[27] S.B. Pan and R.H. Park, “Unified
systolic array for computation of
DCT / DST / DHT”, IEEE Trans.
Circuits Syst. Video Technol., vol. 7,
no. 2, pp.413-419, April 1997.
[28] W.H. Fang and M.L. Wu, “Unified
fully-pipelined implementations of
one- and two-dimensional real
discrete trigonometric transforms",
IEICE Trans. Fund. Electron.
Commun. Comput. Sci., vol. E82-A,
no. 10, pp. 2219-2230, Oct. 1999.
[29] D.F. Chiper, M.N.S. Swamy, M.O.
Ahmad, and T. Stouraitis, “A systolic
array architecture for the discrete
sine transform”, IEEE trans. Signal
Process., vol. 50, no. 9, pp. 2347 -
2354, Sept. 2002.
[30] P.K. Meher, “A new convolutional
formulation of the DFT and efficient
systolic implementation”, in Proc.
IEEE Int. Region 10 Conf.
(TENCON’05), pp. 1462-1466, Nov.
2005.
[31] P.K. Meher, “Systolic designs for
DCT using a low-complexity
concurrent convolutional
formulation”, IEEE Trans. Circuits &
Systems for Video Technology, vol
16, no. 9, pp. 1041-1050, Sept. 2006.
Multiscale Edge Detection Based on Wavelet Transform
Divesh Kumar, Dr. Yaduvir Singh
Department of Electrical and Instrumentation Engineering
Thapar University, Patiala, Punjab
[email protected], [email protected]
Abstract: This paper presents a new approach
to edge detection using wavelet transforms.
First, we briefly introduce the development of
wavelet analysis. Then, some major classical
edge detectors are reviewed and interpreted
with continuous wavelet transforms. The
classical edge detectors work fine with high-
quality pictures, but often are not good enough
for noisy pictures because they cannot
distinguish edges of different significance. The
proposed wavelet based edge detection
algorithm combines the coefficients of wavelet
transforms on a series of scales and
significantly improves the results. Finally, a
cascade algorithm is developed to implement
the wavelet based edge detector.
Keywords: wavelet transform, canny edge
detector, sobel edge detector, noise.
INTRODUCTION
An edge in an image is a contour
across which the brightness of the image
changes abruptly. In image processing, an
edge is often interpreted as one class of
singularities. In a function, singularities can be
characterized easily as discontinuities where
the gradient approaches infinity. However,
image data is discrete, so edges in an image
often are defined as the local maxima of the
gradient. This is the definition we will use here.
Edge detection is an important task in image
processing. It is a main tool in pattern
recognition, image segmentation, and scene
analysis. An edge detector is basically a high
pass filter that can be applied to extract the
edge points in an image. This topic has
attracted many researchers and many
achievements have been made [14]-[20]. In
this paper, we will explain the mechanism of
edge detectors from the point of view of
wavelets and develop a way to construct edge
detection filters using wavelet transforms.
Many classical edge detectors have
been developed over time. They are based on
the principle of matching local image segments
with specific edge patterns. The edge
detection is realized by the convolution with a
set of directional derivative masks [21]. The
popular edge detection operators are Roberts,
Sobel, Prewitt, Frei-Chen, and Laplacian
operators ( [17], [18], [21], [22] ). They are all
defined on a 3 by 3 pattern grid, so they are
efficient and easy to apply. In certain situations
where the edges are highly directional, some
edge detectors work especially well because
their patterns fit the edges better.
Noise and its influence on edge detection
However, classical edge detectors
usually fail to handle images with strong noise,
as shown in Fig. 1.1. Noise is unpredictable
contamination on the original image. It is
usually introduced by the transmission or
compression of the image.
(a) Lena image (b) Edges using Canny
(c) Image with noise (d) Edges from the image with
noise
Fig. 1.1: Impact of noise on edge detection
There are various kinds of noise, but the two most widely studied are white noise and "salt and pepper" noise. Fig. 1.1 shows the dramatic difference between the results of edge detection on two similar images, the latter of which is affected by some white noise.
Review of Classical Edge Detectors
Classical edge detectors use a pre-
defined group of edge patterns to match each
image segments of a fixed size. 2-D discrete convolutions are used here to find the correlations between the pre-defined edge patterns and the sampled image segment:

    (f * m)(x, y) = Σ_{i,j} f(i, j) m(x - i, y - j)          (1.1)

where f is the image and m is the edge pattern, defined so that m(i, j) = 0 if (i, j) is not in the grid.
These patterns are represented as filters,
which are vectors (1-D) or matrices (2-D). For
fast performance, the dimensions of these filters are usually 1×3 (1-D) or 3×3 (2-D). From the point of view of functions, filters are discrete operators of directional derivatives. Instead of finding the local maxima of the gradient, we set a threshold and consider those points with gradient magnitude above the threshold as edge points. Given the source image f(x,y), the edge image E(x,y) is given by

    E(x, y) = 1 if sqrt[ ((f * s)(x, y))^2 + ((f * t)(x, y))^2 ] > threshold, and 0 otherwise          (1.2)

where s and t are two filters of different directions.
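The convolution-and-threshold scheme of (1.1)-(1.2) can be sketched directly. The following toy example (an illustration, not code from the paper; the threshold value 10 is an arbitrary choice) applies the Sobel pair s, t to a small image with a vertical step edge:

```python
import math

def conv2_valid(f, m):
    """2-D discrete convolution (eq. 1.1), computed on the 'valid' region."""
    H, W = len(f), len(f[0])
    h, w = len(m), len(m[0])
    out = [[0.0] * (W - w + 1) for _ in range(H - h + 1)]
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            # convolution flips the kernel in both directions
            out[y][x] = sum(f[y + i][x + j] * m[h - 1 - i][w - 1 - j]
                            for i in range(h) for j in range(w))
    return out

# Sobel filters: s differentiates horizontally, t vertically.
s = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
t = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

# A 6x6 image with a vertical step edge between columns 2 and 3.
img = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

gx, gy = conv2_valid(img, s), conv2_valid(img, t)
edges = [[1 if math.hypot(a, b) > 10 else 0 for a, b in zip(ra, rb)]
         for ra, rb in zip(gx, gy)]
# The detected edge hugs the step: the middle two columns fire.
assert all(row == [0, 1, 1, 0] for row in edges)
```

With a larger image the same loop structure applies per 3×3 window, which is why these detectors are efficient and easy to apply.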
Roberts edge detector
The edge patterns are shown in Fig.1.2
(a) (b)
Fig. 1.2: Edge patterns for Roberts edge detector:(a) s; (b)
t
These filters have the shortest
support, thus the position of the edges is more
accurate. On the other hand, the short support
of the filters makes it very vulnerable to noise.
The edge pattern of this edge detector makes
it especially sensitive to edges with a slope
around π/4. Some computer vision programs
use the Roberts edge detector to recognize
edges of roads.
Prewitt edge detector
The edge patterns are shown in Fig. 1.3
(a) (b)
Fig. 1.3: Edge patterns for Prewitt and Sobel edge
detectors: (a)s; (b)t
These filters have longer support.
They differentiate in one direction and average
in the other direction. So the edge detector is
less vulnerable to noise. However, the position
of the edges might be altered due to the
average operation.
Sobel edge detector
The edge patterns are similar to those
of the Prewitt edge detector as shown in Fig.
1.3. These filters are similar to the Prewitt
edge detector, but the average operator is
more like a Gaussian, which makes it better for
removing some white noise.
Frei-Chen edge detector
A 3×3 sub image b of an image f may
be thought of as a vector in R^9. Let V denote the vector space of 3×3 sub images. Bv, an orthogonal basis for V, is used for the Frei-Chen method. The subspace E of V that is spanned by the sub images v1, v2, v3, and v4 is called the edge subspace of V. The Frei-Chen edge detection method bases its determination of edge points on the size of the angle θE between the sub image b and its projection on the edge subspace:

    θE = arccos( sqrt( Σ_{k=1}^{4} (b, v_k)^2 / Σ_{k=1}^{9} (b, v_k)^2 ) )          (1.3)
The edge patterns are shown in Fig. 1.4.
Fig. 1.4: Edge Patterns for the Frei-Chen edge
detector: (a) v1; (b) v2; (c) v3; (d) v4; (e) v5;
(f) v6; (g) v7; (h) v8; (i) v9.
As shown in above patterns, the sub
images in the edge space are typical edge
patterns with different directions; the other sub
images resemble lines and blank space.
Therefore, the angle θE is small when the sub
image contains edge-like elements, and θE is
large otherwise.
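Since the paper's Fig. 1.4 is not reproduced in this excerpt, the sketch below uses the classical Frei-Chen masks (v1-v4 span the edge subspace, v5-v8 the line subspace, v9 the average) to compute the angle θE of (1.3); this is an illustration with assumed mask definitions, not the authors' code:

```python
import numpy as np

r2 = np.sqrt(2.0)
# The nine classical Frei-Chen masks, each divided by its Euclidean norm.
V = [np.array(m, dtype=float) / n for m, n in [
    ([[1, r2, 1], [0, 0, 0], [-1, -r2, -1]], 2 * r2),   # v1 edge
    ([[1, 0, -1], [r2, 0, -r2], [1, 0, -1]], 2 * r2),   # v2 edge
    ([[0, -1, r2], [1, 0, -1], [-r2, 1, 0]], 2 * r2),   # v3 edge
    ([[r2, -1, 0], [-1, 0, 1], [0, 1, -r2]], 2 * r2),   # v4 edge
    ([[0, 1, 0], [-1, 0, -1], [0, 1, 0]], 2),           # v5 line
    ([[-1, 0, 1], [0, 0, 0], [1, 0, -1]], 2),           # v6 line
    ([[1, -2, 1], [-2, 4, -2], [1, -2, 1]], 6),         # v7 line
    ([[-2, 1, -2], [1, 4, 1], [-2, 1, -2]], 6),         # v8 line
    ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 3)]]            # v9 average

def edge_angle(b):
    """Angle between sub image b and its projection on the edge subspace."""
    p = np.array([np.sum(b * v) for v in V])      # coordinates in the basis
    cos2 = np.sum(p[:4] ** 2) / np.sum(p ** 2)    # edge energy / total energy
    return np.arccos(np.sqrt(np.clip(cos2, 0.0, 1.0)))

# The masks form an orthonormal basis of R^9 ...
W = np.stack([v.ravel() for v in V])
assert np.allclose(W @ W.T, np.eye(9), atol=1e-12)
# ... a pure antisymmetric step lies entirely in the edge subspace,
# while a flat patch has all its energy in the average mask v9.
assert edge_angle(np.array([[-1, 0, 1]] * 3, float)) < 1e-6
assert abs(edge_angle(np.full((3, 3), 5.0)) - np.pi / 2) < 1e-6
```

A point is declared an edge when θE falls below a chosen angular threshold.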
Canny edge detection
Canny edge detection [4] is an
important step towards mathematically solving
edge detection problems. This edge detection
method is optimal for step edges corrupted by
white noise. Canny used three criteria to
design his edge detector. The first requirement
is reliable detection of edges with low
probability of missing true edges, and a low
probability of detecting false edges. Second,
the detected edges should be close to the true
location of the edge. Lastly, there should be
only one response to a single edge. To
quantify these criteria, two functions are defined: SNR(f), the signal-to-noise ratio, and Loc(f), the localization of the filter f(x), where A is the amplitude of the signal and n0^2 is the variance of the noise. Scaling f to the dilated filter fs gives an "uncertainty principle": increasing the filter size increases the signal-to-noise ratio but also decreases the localization by the same factor. This suggests maximizing the product of the two, so the objective function (1.4) is taken to be the product SNR(f) * Loc(f), where f(x) is the filter for edge detection. The optimal filter derived from these requirements can be approximated by the first derivative of the Gaussian filter.
The choice of the standard deviation for the
Gaussian filter, σ, depends on the size, or
scale, of the objects contained in the image.
For images containing objects of multiple or unknown sizes, one approach is to use Canny detectors with different σ values. The outputs
of the different Canny filters are combined to
form the final edge image.
Development of wavelet analysis
The concept of wavelet analysis has been
developed since the late 1980’s. However, its idea
can be traced back to the Littlewood-Paley
technique and Calderón-Zygmund theory [25] in
harmonic analysis. Wavelet analysis is a powerful
tool for time-frequency analysis. Fourier analysis is
also a good tool for frequency analysis, but it can
only provide global frequency information, which
is independent of time. Hence, with Fourier
analysis, it is impossible to describe the local
properties of functions in terms of their spectral
properties, which can be viewed as an expression
of the Heisenberg uncertainty principle [13].
In many applied areas like digital signal processing,
time-frequency analysis is critical. That is, we want
to know the frequency properties of a function in a
local time interval. Engineers and mathematicians
developed analytic methods that were adapted to
these problems, therefore avoiding the inherent
difficulties in classical Fourier analysis. For this
purpose, Dennis Gabor introduced a “sliding-
window” technique. He used a Gaussian function g
as a “window” function, and then calculated the
Fourier transform of a function in the “sliding
window". The analyzing function is a translated and modulated copy of the Gaussian window.
The Gabor transform is useful for time-frequency
analysis. The Gabor transform was later
generalized to the windowed Fourier transform in
which g is replaced by a “time local” function
called the “window” function. However, this
analyzing function has the disadvantage that the
spatial resolution is limited by the fixed size of the
Gaussian envelope [13]. In 1985, Yves Meyer
([23], [24]) discovered that one could obtain
orthonormal bases for L2(R) of the type ψ_{j,k}(x) = 2^{j/2} ψ(2^j x - k), j, k ∈ Z, and that the expression for decomposing a function into these orthonormal wavelets converges in many function spaces. The most preeminent books on wavelets are those of Meyer ([23], [24]) and Daubechies. Meyer
focuses on mathematical applications of wavelet
theory in harmonic analysis; Daubechies gives a
thorough presentation of techniques for
constructing wavelet bases with desired properties,
along with a variety of methods for mathematical
signal analysis [14]. A particular example of an
orthonormal wavelet system was introduced by
Alfred Haar. However, the Haar wavelets are
discontinuous and therefore poorly localized in
frequency. Stéphane Mallat made a decisive step in
the theory of wavelets in 1987 when he proposed a
fast algorithm for the computation of wavelet
coefficients. He proposed the pyramidal schemes
that decompose signals into subbands. These
techniques can be traced back to the 1970s when
they were developed to reduce quantization noise.
The framework that unifies these algorithms and
the theory of wavelets is the concept of a multi-
resolution analysis (MRA). An MRA is an increasing sequence of closed, nested subspaces {Vj}, j ∈ Z, that tends to L2(R) as j increases. Vj is obtained from Vj+1 by a dilation of factor 2. V0 is spanned by a function φ that satisfies

    φ(x) = Σ_k h_k φ(2x - k)          (1.6)
Equation (1.6) is called the “two-scale equation”,
and it plays an essential role in the theory of
wavelet bases.
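For the Haar scaling function φ = 1_[0,1), the two-scale equation holds with coefficients h_0 = h_1 = 1 (in the unnormalized convention assumed for this illustration). A direct numerical check:

```python
def haar_phi(x):
    """Haar scaling function: the indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

# Verify the two-scale equation phi(x) = phi(2x) + phi(2x - 1)
# on a dense grid covering [-1, 2).
for i in range(-100, 200):
    x = i / 100.0
    assert haar_phi(x) == haar_phi(2 * x) + haar_phi(2 * x - 1)
```

The equation expresses that V0 ⊂ V1: the coarse-scale basis function is a combination of its half-width translates.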
Edge detector using wavelets
Now that we have talked briefly about the
development of edge detection techniques and
wavelet theories, we next discuss how they are
related. Edges in images can be mathematically
defined as local singularities. Until recently, the
Fourier transform was the main mathematical tool
for analyzing singularities. However, the Fourier
transform is global and not well adapted to local
singularities. It is hard to find the location and
spatial distribution of singularities with Fourier
transforms. Wavelet analysis is a local analysis; it
is especially suitable for time-frequency analysis
[1], which is essential for singularity detection.
This was a major motivation for the study of the
wavelet transform in mathematics and in applied
domains. With the growth of wavelet theory, the
wavelet transforms have been found to be
remarkable mathematical tools to analyze the
singularities including the edges, and further, to
detect them effectively. This idea is similar to that
of John Canny [4]. The Canny approach selects a
Gaussian function as a smoothing function θ; while
the wavelet-based approach chooses a wavelet
function to be θ0. Mallat, Hwang, and Zhong ( [5],
[6] ) proved that the maxima of the wavelet
transform modulus can detect the location of the
irregular structures. Further, a numerical procedure
to calculate their Lipschitz exponents has been
provided. One and two-dimensional signals can be
reconstructed, with a good approximation, from the
local maxima of their wavelet transform modulus.
The wavelet transform characterizes the local
regularity of signals by decomposing signals into
elementary building blocks that are well localized
both in space and frequency. This not only explains
the underlying mechanism of classical edge
detectors, but also indicates a way of constructing
optimal edge detectors under specific working
conditions.
Results:
Multiscale edge detection
Wavelet filters of large scales are
more effective for removing noise, but at the
same time increase the uncertainty of the
location of edges. Wavelet filters of small
scales preserve the exact location of edges,
but cannot distinguish between noise and real
edges. We can use the coefficients of the
wavelet transform across scales to measure
the local Lipschitz regularity. That is, when the
scale increases, the coefficients of the wavelet
transform are likely to increase where the
Lipschitz regularity is positive, but they are
likely to decrease where the Lipschitz
regularity is negative. We know that locations
with lower Lipschitz regularities are more likely
to be details and noise. As scale increases,
the coefficients of the wavelet transform
increase for step edges, but decrease for Dirac
and fractal edges. So we can use a larger-
scale wavelet at positions where the wavelet
transform decreases rapidly across scales to
remove the effect of noise, while using a
smaller-scale wavelet at positions where the
wavelet transform decreases slowly across
scale to preserve the precise position of the
edges. Using the cascade algorithm, we can observe the change of the wavelet transform coefficients between adjacent scales and distinguish different kinds of edges. Then we
can keep the scales small for locations with
positive Lipschitz regularity and increase the
scales for locations with negative Lipschitz
regularity. Fig. 1.5 shows that for an image
without noise, the result of our method is
similar to that of Canny’s edge detection. For
images with white noise in Fig. 1.6 – 1.10, our
method gives more continuous and precise
edges. Table 1 shows that the SNR of the
edges obtained by the multiscale wavelet
transform is significantly higher than others.
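The scale behaviour described above can be seen with plain Haar differences; the following toy illustration (with an assumed L2-normalized Haar wavelet, not the paper's filter bank) shows step-edge coefficients growing with scale while Dirac-like spike coefficients shrink:

```python
import math

def haar_coeffs(signal, scale):
    """L2-normalized Haar wavelet coefficients at a dyadic scale:
    (sum of first half - sum of second half) / sqrt(2 * scale)."""
    w = 2 * scale
    return [(sum(signal[i:i + scale]) - sum(signal[i + scale:i + w]))
            / math.sqrt(w) for i in range(len(signal) - w + 1)]

step = [0.0] * 32 + [1.0] * 32           # step edge: positive regularity
dirac = [0.0] * 31 + [1.0] + [0.0] * 32  # spike: negative regularity

for small, large in [(1, 2), (2, 4), (4, 8)]:
    step_small = max(abs(c) for c in haar_coeffs(step, small))
    step_large = max(abs(c) for c in haar_coeffs(step, large))
    spike_small = max(abs(c) for c in haar_coeffs(dirac, small))
    spike_large = max(abs(c) for c in haar_coeffs(dirac, large))
    assert step_large > step_small    # step response grows across scales
    assert spike_large < spike_small  # spike response decays across scales
```

This growth/decay contrast is what lets the cascade algorithm keep small scales near true edges and large scales near noise.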
(a) (b) (c)
Fig. 1.5: Edge detection for Lena image: (a) The
Lena image; (b) Edges by the Canny edge detector;
(c) Edges by the multiscale edge detection using
wavelet transform
(a)
(b) (c)
(d) (e)
Fig. 1.6: Edge detection for a block image with
noise: (a) A block image (SNR = 10 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with default variance; (d) edges by Canny
edge detection with adjusted variance; (e) Edges by
the multiscale edge detection using wavelet
transform
(a) (b)
(c) (d)
Fig. 1.7: Edge detection for a Lena image with
noise: (a) Lena image (SNR = 30 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with adjusted variance; (d) Edges by
multi-level edge detection using wavelets
Table 1: False rate of the detected edges
(a) (b)
(c) (d)
Fig. 1.8: Edge detection for a bridge image with
noise: (a) Bridge image (SNR = 30 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with adjusted variance; (d) Edges by
multi-level edge detection using wavelet
(a) (b)
(c) (d)
Fig. 1.9: Edge detection for a pepper image with
noise: (a) Pepper image (SNR = 10 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with adjusted variance; (d) Edges by
multi-level edge detection using wavelet
(a) (b)
(c) (d)
Fig. 1.10: Edge detection for a wheel image with
noise: (a) Wheel image (SNR = 10 dB); (b) Edges by
the Sobel edge detector; (c) Edges by Canny edge
detection with adjusted variance; (d) Edges by
multi-level edge detection using wavelet
Conclusion & Future scope
In this work we have described an approach to edge detection using the wavelet transform. The wavelet edge detector produces better edges than classical edge detectors, which are very sensitive to noise. Since wavelet decomposition involves a low-pass filter, the amount of noise in the image can be decreased, which in turn leads to more robust edge detection. In future work, the wavelet transform can be used to produce initial images, the watershed algorithm can then segment the initial image, and, using the inverse wavelet transform, the segmented image can be projected up to a higher resolution.
REFERENCES
[1] J. C. Goswami, A. K. Chan, 1999,
“Fundamentals of wavelets: theory, algorithms, and
applications,” John Wiley & Sons, Inc.
[2] Y. Y. Tang, L. Yang, J. Liu, 2000, “Characterization
of Dirac-Structure Edges with Wavelet Transform,”
IEEE Trans. Sys. Man Cybernetics-Part B: Cybernetics,
vol.30, no.1, pp. 93-109.
[3] Mallat, S. 1987. “A compact multiresolution
representation: the wavelet model.” Proc. IEEE
Computer Society Workshop on Computer Vision, IEEE
Computer Society Press, Washington, D.C., p.2-7.
[4] J. Canny, 1986, “A computational approach to
edge detection,” IEEE Trans. Pattern Anal. Machine
Intell., vol. PAMI-8, pp. 679-698.
[5] S. Mallat, S. Zhong, 1992, “Characterization of
signals from multiscale edges,” IEEE Trans. Pattern
Anal. Machine Intell., vol.14, no.7, pp. 710-732.
[6]Acharyya, M., Kundu, M.K., 2001. Wavelet-based
texture segmentation of remotely sensed images. IEEE
11th Internat. Conf. Image Anal. Process, 69–74.
[7] Xiao, D., Ohya, J., "Contrast enhancement of color images based on wavelet transform and human visual system", Proc. International Conference on Graphics and Visualization in Engineering, Florida, USA, 2007.
[8] Scharcanski, J., Jung, C., R., Clarke, R. T., “Adaptive
Image Denoising Using Scale and Space Consistency”,
IEEE TRANSACTIONS ON IMAGE PROCESSING,
VOL. 11, NO. 9, SEPTEMBER 2002.
[9] Mallat S. A.: Theory for Multiresolution Signal
Decomposition: The Wavelet Representation. IEEE
Transactions on Pattern Analysis and Machine
Intelligence, 11(7), 674–693.
[10] Prieto M. S., Allen A. R.: A Similarity Metric for
Edge Images. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 25(10), 1265–1273.
[11] Drori I., Lischinski D.: Fast Multiresolution Image
Operations in the Wavelet Domain. IEEE Transactions
on Visualization and Computer Graphics, 9(3), 395–411.
[12] I. Drori D. Lischinski. Fast multiresolution image
operations in the wavelet domain. IEEE Transactions on
Visualization and Computer Graphics., 9(3):395–412,
2003.
[13] A. Cohen, R. D. Ryan, 1995, "Wavelets and Multiscale Signal Processing," Chapman & Hall.
[14] J. J. Benedetto, M. W. Frazier, 1994, "Wavelets - Mathematics and Applications," CRC Press, Inc.
[15] R. J. Beattie, 1984, “Edge detection for semantically
based early visual processing,” dissertation, Univ.
Edinburgh, Edinburgh, U.K..
[16] B. K. P. Horn, 1971, “The Binford-Horn line-
finder,” Artificial Intell. Lab., Mass. Inst. Technol.,
Cambridge, AI Memo 285.
[17] L. Mero, 1975, “A simplified and fast version of the
Hueckel operator for finding optimal edges in pictures,”
Pric. IJCAI, pp. 650-655.
[18] R. Nevatia, 1977, “Evaluation of simplified Hueckel
edge-line detector,” Comput., Graph., Image Process.,
vol. 6, no. 6, pp. 582-588.
[19] Y. Y. Tang, L.H. Yang, L. Feng, 1998,
“Characterization and detection of edges by Lipschitz
exponent and MASW wavelet transform,” Proc. 14th Int.
Conf. Pattern Recognit., Brisbane, Australia, pp. 1572-
1574.
[20] K. A. Stevens, 1980, “Surface perception from local
analysis of texture and contour,” Artificial Intell. Lab.,
Mass. Instr. Technol., Cambridge, Tech. Rep. AI-TR-
512.
[21] K. R. Castleman, 1996, “Digital Image Processing,”
Englewood Cliffs, NJ: Prentice- Hall.
[22] M. Hueckel, 1971, “An operator which locates
edges in digital pictures,” J. ACM, vol. 18, no. 1, pp.
113-125.
[23] Acharyya, M., Kundu, M.K., 2001. Wavelet-based
texture segmentation of remotely sensed images. IEEE
11th Internat. Conf. Image Anal. Process., 69–74.
[24] Jahromi O. S., Francis B. A., Kwong R. H.:
Algebraic theory of optimal filterbanks. Proceedings of
IEEE International Conference on Acoustics, Speech and
Signal Processing, 1, 113–116.
[25] A. Zygmund, 1968, “Trigonometric Series,” 2nd
ed., Cambridge: Cambridge Univ. Press.
[26] Mallat S.: Multifrequency channel decompositions
of images and wavelet models. IEEE Transaction in
Acoustic Speech and Signal Processing, 37, 2091–2110.
[27] R. M. Haralick, 1984, “Digital step edges from zero
crossing of second directional derivatives,” IEEE Trans.
Pattern Anal. Machine Intell., vol. PAMI-6, no. 1, pp.
58-68.
[28] E. C. Hildreth, 1980, “Implementation of a theory of
edge detection,” M.I.T. Artificial Intell. Lab., Cambridge,
MA, Tech. Rep. AI-TR-579.
Color Image Enhancement by Scaling Luminance and Chromatic
Components
Satyabrata Das, Sukanti Pal and A K Panda
National Institute of Science and Technology, Palur Hills, Berhampur, Odisha, 761008
Email: [email protected]
ABSTRACT
In this paper, a new technique for color image
enhancement using luminance and chromatic
component is presented. In the proposed
technique luminance and chromatic components
of color image are extracted separately and
converted to frequency domain. Then DC and
AC coefficients are scaled to preserve contrast,
brightness and color. While enhancing the
image care is taken to reduce the mathematical
computations. Processing the color image in the DCT domain invites unwanted side effects such as blocking artifacts, which are minimized by using a smaller sub-block matrix, keeping in view the complexity of the mathematical computation.
Keywords: Blocking artifacts, Chromatic, DCT, Luminance
1. INTRODUCTION
The display of a color image depends mainly on
brightness, contrast and colors. Enhancement of the image is necessary to improve its visibility subjectively, to remove unwanted flickering, to improve contrast and to bring out more details. In general there are two major approaches [1]: the spatial domain approach, where statistics of the grey values of the image are manipulated, and the frequency domain approach, where the spatial frequency contents of the image are manipulated [1]. In the spatial domain, histogram equalization, principal component analysis, rank order filtering, homomorphic filtering, etc. are generally used to enhance the image. Although these techniques were developed for gray-valued images, a few of them are also applied to color images for enhancement
purpose. Images are mostly represented in compressed format to save memory space and bandwidth, so it is better if enhancement can be achieved in the compressed domain rather than transforming to the spatial domain, applying the enhancement technique and transforming back to the compressed domain, thereby increasing the computational overhead.
Therefore, color images mostly uses JPEG
compression format for saving bandwidth and
memory space which uses popular discrete
cosine transform (DCT). Extracting the
luminance and chromatic components in DCT
domain and processing them to improve the
brightness, contrast and color invites unwanted
side effects such as blocking artifacts. However,
these side effects can be minimized by using
special mathematical computation techniques.
In our work we have represented the color
image using Y-Cb-Cr color space so that we can
preserve both the luminance and color components.
Previous works [2-4] have used the DCT domain
and implemented non-uniform scaling of the DC
and AC coefficients, which requires more
mathematical computation. In our
requires more mathematical computation. In our
approach we have adopted a uniform scale value
for both DC and AC components of Y, Cb and
Cr, which substantially lowers the computational
burden while still enhancing the
image. DCT-II is presented in section 2. The
proposed algorithm is presented in section 3.
The results obtained are presented in section 4
and the paper is concluded in section 5.
2. MATHEMATICAL
PRELIMINARIES
There are eight different ways to perform the
even extension of a sequence for the DFT, and
there are correspondingly eight definitions of
the DCT [5,6]. We have used the type
II DCT, which is widely used in practice for
speech and image compression applications as
part of various standards [7]. Equation (1)
gives the two-dimensional DCT, where C(k,l)
represents the transformed DCT coefficients for
the input image x(m,n), assuming a square image
of size (N×N).
$$C(k,l)=\alpha(k)\,\alpha(l)\sum_{m=0}^{N-1}\sum_{n=0}^{N-1}x(m,n)\cos\left[\frac{(2m+1)k\pi}{2N}\right]\cos\left[\frac{(2n+1)l\pi}{2N}\right],\quad 0\le k,l\le N-1 \qquad (1)$$

where $\alpha(k)=\sqrt{1/N}$ for $k=0$ and $\alpha(k)=\sqrt{2/N}$ for $1\le k\le N-1$, and likewise for $\alpha(l)$.
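Since equation (1) is separable, the 2-D DCT-II of an N×N block can be computed as two matrix products. The following NumPy sketch (illustrative code, not the paper's MATLAB implementation) evaluates equation (1) directly:

```python
import numpy as np

def dct2(x):
    """2-D type-II DCT of an N x N block, evaluating equation (1)."""
    N = x.shape[0]
    m = np.arange(N)
    # 1-D DCT-II basis: B[k, m] = alpha(k) * cos((2m + 1) * k * pi / (2N))
    B = np.cos(np.outer(np.arange(N), 2 * m + 1) * np.pi / (2 * N))
    alpha = np.full(N, np.sqrt(2.0 / N))
    alpha[0] = np.sqrt(1.0 / N)
    B = alpha[:, None] * B
    # Separability: transform the rows, then the columns
    return B @ x @ B.T

# For a constant 8x8 block all energy falls in the DC coefficient C(0,0),
# which equals alpha(0)^2 * sum(x) = N * mean = 8 * 100 = 800 here.
block = np.full((8, 8), 100.0)
C = dct2(block)
```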
The contrast of an image is defined as the
change in luminance of an object relative to
the luminance of its surround. Hence contrast
can be thought of as the ratio of the standard
deviation (σ) to the mean (µ) of the image:
the greater the standard deviation, the higher
the contrast.
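As an illustrative sketch (my code, not the paper's), the σ/µ contrast measure can be computed as:

```python
import numpy as np

def contrast(img):
    """sigma/mu contrast measure described above."""
    img = np.asarray(img, dtype=float)
    return img.std() / img.mean()

flat = np.full((8, 8), 128.0)                    # no variation -> contrast 0
board = (np.indices((8, 8)).sum(0) % 2) * 255.0  # 0/255 checkerboard
```

A flat block has zero contrast; the 0/255 checkerboard has σ = µ = 127.5, giving a contrast of 1.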
3. THE PROPOSED ALGORITHM
The image in RGB color space is converted into
the Y-Cb-Cr color space to obtain the luminance
and chromatic components individually. The Y,
Cb and Cr components are each split into (8×8)
sub blocks. For each sub block the DCT-II is
computed separately to obtain Y(u,v), Cb(u,v)
and Cr(u,v) respectively, where Y(u,v), Cb(u,v)
and Cr(u,v) represents the block transformed
DCT coefficients and the first element of each
DCT transformed coefficient Y(0,0), Cb(0,0)
and Cr(0,0) represents the DC component while
the rest are AC components. After computing its
DCT coefficients, each sub block is normalized
by a factor of 8. The proposed algorithm is
implemented in four steps. In the first step,
adjustment of local brightness is achieved by
mapping the DC coefficient of each sub block of
Y(u,v) using a monotonically increasing
function ψ(x) [8], shown in fig.1. While
mapping the coefficients, the DC coefficient is
treated separately from the AC coefficients.
The mapping function for the DC coefficient is

$$y_{\text{mapped}}^{DC}=Y_{\max}\,\psi\!\left(\frac{Y(0,0)}{8\,Y_{\max}}\right) \qquad (2)$$

where

$$\psi(x)=\begin{cases} n\left(\dfrac{x}{m}\right)^{p_1}, & 0\le x\le m\\[4pt] 1-(1-n)\left(\dfrac{1-x}{1-m}\right)^{p_2}, & m< x\le 1\end{cases}$$

with $0\le m,n\le 1$ and $p_1,p_2>0$.
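A sketch of the mapping function, assuming the programmable S-function form of [8], with the parameter values the paper reports (m = n = 0.5, p1 = 1.8, p2 = 0.8):

```python
import numpy as np

def psi(x, m=0.5, n=0.5, p1=1.8, p2=0.8):
    """Monotonically increasing S-shaped map: psi(0)=0, psi(m)=n, psi(1)=1."""
    x = np.asarray(x, dtype=float)
    low = n * (x / m) ** p1                                 # 0 <= x <= m
    high = 1.0 - (1.0 - n) * ((1.0 - x) / (1.0 - m)) ** p2  # m < x <= 1
    return np.where(x <= m, low, high)
```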
Y_max is the maximum brightness value of the
image before the DCT is applied. Various
monotonically increasing functions are
available in the literature [4], [7]; no single
function is best suited to all images for
enhancement purposes. We chose ψ(x) because its
value can be modified using the four parameters
m, n, p1 and p2. We varied these values and
chose m = n = 0.5, p1 = 1.8 and
p2 = 0.8 for best performance. As the Y component
represents the luminance component, only this
component is mapped to alter the brightness,
leaving the Cb and Cr components unaltered. In
the second step, adjustment of local contrast
is achieved by scaling the DC and AC
coefficients of the normalized Y(u,v), Cb(u,v)
and Cr(u,v). The scale factor 's' is defined as the
ratio between mapped DC coefficient for each
normalized sub block (8×8) of Y(u,v) to the
original DC coefficient. Since the DC component
carries the mean of the brightness distribution
of each sub block, it is used to compute the
scale factor 's'. Assuming an 8-bit
representation, scaling may cause gray values
to overflow beyond 255; this is handled by
limiting the scale factor depending upon the
image. In the third step, preservation of color
is achieved through scaling of normalized
Cb(u,v) and Cr(u,v) component through the
same scale factor 's' corresponding to each
normalized sub block of Y(u,v). Since the
mapping from RGB to Y-Cb-Cr is non-linear and
Cb, Cr depend on Y, the DC coefficients have to
be treated separately while scaling the color
components.
The normalized Cr(u,v) is scaled similarly
using the above-mentioned procedure. Finally,
blocking artifacts are suppressed. Since this
algorithm is developed around the type-II DCT,
blocking artifacts are visible in the processed
image because of discontinuities in gray
values. Several methods are available to
minimize blocking artifacts, but they are
computationally expensive. We propose a simple
method that minimizes blocking artifacts while
requiring less computation. For this purpose,
the standard deviation (σ) is computed for each
normalized
sub block of Y(u,v). When σ is large, it is
concluded that the corresponding sub block
contains a large variation of gray values,
which results in blocking artifacts. If
σ > σ_threshold, where σ_threshold represents a
threshold value of the standard deviation which
is image dependent and is decided based on the
amount of blocking artifact removal, each
normalized 8×8 sub block of Y(u,v), Cb(u,v)
and Cr(u,v) is subdivided into four 4×4 sub
blocks and the scale factor 's' is recomputed
through the earlier mentioned steps of this
algorithm. Only those sub blocks for which the
σ > σ_threshold condition is met are scaled,
leaving the remaining sub blocks unaltered.
The corresponding sub blocks of Y, Cb and Cr
are then scaled with the new scale factor in order to
remove the artifacts. Finally, the image is
reconstructed in the spatial domain by
combining the Y, Cb and Cr components.
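The contrast-scaling and artifact-suppression steps above can be sketched as follows; the function names, the scale-factor cap, and the threshold value are illustrative choices of mine, not values from the paper:

```python
import numpy as np

def scale_factor(dc_original, dc_mapped, s_max=2.0):
    """s = mapped DC / original DC, limited so that scaled gray values
    cannot overflow far beyond the 8-bit range."""
    return min(dc_mapped / dc_original, s_max)

def split_if_blocky(block, sigma_threshold):
    """Return [block], or its four 4x4 quadrants when the gray-value
    standard deviation exceeds the threshold (blocking-artifact risk)."""
    if block.std() <= sigma_threshold:
        return [block]
    h, w = block.shape[0] // 2, block.shape[1] // 2
    return [block[:h, :w], block[:h, w:], block[h:, :w], block[h:, w:]]

smooth = np.full((8, 8), 100.0)               # low sigma: kept as one block
edgy = np.zeros((8, 8)); edgy[:, 4:] = 255.0  # high sigma: subdivided
```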
4. QUALITY ASSESSMENT
Simulation is performed on various images using
MATLAB. Since the proposed algorithm is based
on the DCT, PSNR and SNR are not suitable
quality measures, as prior information
regarding the type of distortion is not
available. We have instead used the
no-reference perceptual quality assessment for
JPEG compressed images [9], a quality metric
that incorporates human visual system
characteristics and does not require the
reference image for computing the quality.
Based upon this, a quality score is obtained
which reflects the amount of blocking artifact
removal and of distortion removal due to the
non-linear mapping. A quality score nearer to
10 reflects the best quality image, while 1
represents the worst. Wang et al. [9] suggested
this no-reference quality metric for JPEG
images; its computation is described in [9],
which cites a website containing MATLAB code
for computing the quality score. We have used
the same MATLAB code for evaluation, and refer
to the result as the quality score. The quality
scores obtained for different images are
tabulated in table 1.
Table 1. Quality Score
Image    | Before artifact removal | After artifact removal
Image_1  | 7.7274                  | 8.3612
Image_2  | 6.177                   | 8.36
Image_3  | 8.3903                  | 9.3895
Image_4  | 8.6128                  | 8.9288
Figure 2(a) shows the original Image_1. For
Image_1, four stages of output are obtained:
(i) after scaling the DC coefficient of Y
(fig 2.b), (ii) after scaling both DC and AC
coefficients of Y (fig 2.c), (iii) after
scaling the Y, Cb and Cr components, before
blocking artifact removal (fig 2.d), and (iv)
after blocking artifact removal (fig 2.e). For
Image_2, Image_3 and Image_4, the outputs
before and after blocking artifact removal are
shown in the same way in figures 3, 4 and 5.
The quality score computed for the different
images is shown in table 1. From the table, it
is observed that the quality score improves
after removing the blocking artifacts and is
close to ten, showing good enhancement of the
color image.
Fig 2 (a) Image_1 (b) enhanced image by
scaling DC coefficient only (c) enhanced image
by scaling both DC and AC coefficient (d)
enhanced image by scaling all components
including Cb and Cr (e) enhanced image with
blocking artifacts removal.
Fig 1: Plot of mapping function ψ(x)
Fig 3.(a) Image_2 (b) enhanced image by
scaling all components including Cb and Cr (c)
enhanced image with blocking artifacts removal.
Fig 4.(a) Image_3 (b) enhanced image by
scaling all components including Cb and Cr (c)
enhanced image with blocking artifacts removal
Fig 5.(a)Image_4 (b) enhanced image by scaling
all components including Cb and Cr (c)
enhanced image with blocking artifacts removal
CONCLUSION
In this paper, we have presented a simple
method for enhancing the color image in
compressed format by scaling luminance and
chromatic components using less computational
overhead. The quality score computed
demonstrates the performance of the proposed
method. The proposed algorithm can be
implemented on any image processing hardware.
ACKNOWLEDGMENT
The authors acknowledge the DST-TIFAC
CORE on “3G/4G Communication
Technologies” received by National Institute of
Science and Technology from Department of
Science & Technology (DST), Government of
India.
REFERENCES
[1] Gonzalez, Rafael C. and Woods, Richard E.
Digital Image Processing, Pearson, Prentice
Hall, Third edition, 2008.
[2] Aghagolzadeh, S. and Ersoy, O. K.
“Transform image enhancement,” Opt. Eng.,
vol.31, pp.614-626, Mar.1992.
[3] Tang, J., Peli, E., and Acton, S. “Image
enhancement using a contrast measure in the
compressed domain,” IEEE Signal Process.
Lett. Vol.10, pp.289-292, Oct. 2003.
[4] Lee, S. “An efficient content – based image
enhancement in the compressed domain
using retinex theory,” IEEE Trans. Circuits
Syst. Video Technol., vol. 17, no. 2, pp. 199-
213, Feb. 2007.
[5] Wang, Z. “Fast algorithms for the discrete W
transform and for the discrete Fourier
transform,” IEEE Trans. ASSP, vol. 32, no. 4,
pp. 803-816, Aug. 1984.
[6] Martucci, S.A. “Symmetric convolution and
the discrete sine and cosine transforms.”
IEEE Trans. On signal Processing, vol.42,
no. 5, pp. 1038-1051, May 1994.
[7] Rao, K. and Huang, J. “Techniques and
standards for image, video, and audio
coding,” Prentice Hall, Upper Saddle River,
NJ. 1996.
[8] De, T.K. “A simple programmable S-function
for digital image processing,” in Proc. 4th
IEEE Region 10 Int. Conf., Bombay, India,
pp. 573-576, Nov. 1989.
[9] Wang, Z., Sheikh, H.R. and Bovik, A.C.
“No-reference perceptual quality assessment
of JPEG compressed images,” in Proc. Int.
Conf. Image Processing, Rochester, NY,
vol. 1. pp. 477-480, Sep. 2002.
A Tutorial on Image Compression Techniques
1Vedvrat,
2Krishna Raj
1Department of Electronics & Communication Engineering A.I.T, Kanpur, U.P., India
2Department of Electronics Engineering H.B.T.I., Kanpur, U.P., India
(Email: [email protected], [email protected])
Abstract—Processing of multimedia data acquires large
transmission bandwidth and storage capacity. Reduction in
these parameters introduces the concept of data compression.
For achieving the better compression without degrading the
image quality, data compression techniques become the
challenge for the researchers. Numerous image coding
techniques i.e. subband coding, EZW, SPIHT, EBCOT,
wavelet transform coding have been presented. In this paper
performance comparison of these coding techniques is
presented.
Keywords—Wavelet transform, EBCOT, SPIHT, EZW,
subband coding, JPEG
I. INTRODUCTION
Uncompressed multimedia (audio and video) data
requires considerable storage capacity and transmission
bandwidth. Despite rapid progress in mass-storage density,
processor speeds, and digital communication system
performance, demand for data storage capacity and data-
transmission bandwidth continues to outstrip the
capabilities of available technologies. The recent growth of
data-intensive multimedia-based web applications has not
only sustained the need for more efficient ways to encode
signals and images but have made compression of such
signals central to storage and communication technology.
For still image compression, the `Joint Photographic
Experts Group' or JPEG standard has been established by
ISO (International Standards Organization) and IEC
(International Electro-Technical Commission). The
performance of these coders generally degrades at low bit-
rates mainly because of the underlying block-based
Discrete Cosine Transform (DCT) scheme. More recently,
the wavelet transform has emerged as a cutting edge
technology, within the field of image compression.
Wavelet-based coding provides substantial improvements
in picture quality at higher compression ratios. The large
storage space, large transmission bandwidth, and long
transmission time is required for image, audio, and video
data. At the present state of technology, the only solution
is to compress multimedia data before its storage and
transmission, and decompress it at the receiver for play
back.
II. COMPRESSION PRINCIPLE
Correlation between neighboring pixels gives rise to
redundant information in images, so a less correlated
representation of the image is required. Two fundamental
components of compression are redundancy and
irrelevancy reduction. Redundancy reduction aims at
removing duplication from the signal source
(image/video). Irrelevancy reduction omits parts of the
signal that will not be noticed by the signal receiver,
namely the Human Visual System (HVS). In general, three
types of redundancy can be identified. Image compression
research aims at reducing the number of bits needed to
represent an image by removing the spatial and spectral
redundancies as much as possible.
a. Spatial Redundancy; correlation between
neighboring pixel values.
b. Spectral Redundancy; correlation between
different color planes or spectral bands.
c. Temporal Redundancy; correlation between
adjacent frames in a sequence of images (in video
applications).
In lossless compression schemes, the reconstructed
image, after compression, is numerically identical to the
original image. An image reconstructed following lossy
compression contains degradation relative to the original.
Often this is because the compression scheme completely
discards redundant information. However, lossy schemes
are capable of achieving much higher compression. Under
normal viewing conditions, no visible loss is perceived. In
predictive coding, information already sent or available is
used to predict future values, and the difference is coded.
Since this is done in the image or spatial domain, it is
relatively simple to implement and is readily adapted to
local image characteristics. Transform coding, on the other
hand, first transforms the image from its spatial domain
representation to a different type of representation using
some well-known transform and then codes the
transformed values. This method provides greater data
compression compared to predictive methods, although at
the expense of greater computation.
III. COMPRESSION TECHNIQUES
a. Subband Coding
In subband coding [4], an image is decomposed into
a set of band-limited components, called subbands, which
can be reassembled to reconstruct the original image without
error. Each subband is generated by bandpass filtering the
input. Since the bandwidth of the resulting subbands is
smaller than that of the original image, the subbands can
be downsampled without loss of information.
Reconstruction of the original image is accomplished by
upsampling, filtering, and summing the individual
subbands. Fig.1 shows the principal components of a two-
band subband coding and decoding system. The input of
the system is a 1-D, band-limited discrete-time signal x(n)
for n= 0,1,2....; the output sequence x‟(n) is formed
through the decomposition of x(n) into y0(n) and y1(n) via
analysis filters g0(n) and g1(n). Filter h0(n) is a low pass
filter whose output is an approximation of x(n); filter h1(n)
is a high pass filter whose output is high frequency or
detail part of x(n). All the filters are selected in such a
way that the input can be reconstructed perfectly,
i.e., x'(n) = x(n).
Fig.1 Components of Subband coding
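A minimal numerical sketch of the two-band system of Fig.1, using orthonormal Haar filters as a concrete choice (my assumption; the paper's filters are generic). Analysis filters, then downsampling by 2; synthesis upsamples, filters with the time-reversed filters, and sums, reconstructing x(n) exactly:

```python
import numpy as np

h0 = np.array([1.0, 1.0]) / np.sqrt(2)   # low pass analysis filter
h1 = np.array([1.0, -1.0]) / np.sqrt(2)  # high pass analysis filter

def analyze(x):
    y0 = np.convolve(x, h0)[1::2]   # approximation subband, downsampled by 2
    y1 = np.convolve(x, h1)[1::2]   # detail subband, downsampled by 2
    return y0, y1

def synthesize(y0, y1, n):
    u0 = np.zeros(2 * len(y0)); u0[0::2] = y0   # upsample by 2
    u1 = np.zeros(2 * len(y1)); u1[0::2] = y1
    g0, g1 = h0[::-1], h1[::-1]                 # synthesis filters
    return (np.convolve(u0, g0) + np.convolve(u1, g1))[:n]

x = np.arange(8, dtype=float)
xr = synthesize(*analyze(x), len(x))   # perfect reconstruction: xr == x
```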
Woods and O'Neil used a separable combination of
one-dimensional Quadrature Mirror Filter banks (QMF) to
perform 4-band decomposition by the row-column
approach as shown in fig.2. The process can be iterated to
obtain higher band decomposition filter trees. At the
decoder, the subband signals are decoded, upsampled and
passed through a bank of synthesis filters and properly
summed up to yield the reconstructed image.
Fig.2 4-band decomposition by row-column approach
b. Short Time Fourier Transform
The Fourier Transform separates the waveform into a
sum of sinusoids of different frequencies and identifies
their respective amplitudes. Thus it gives us a frequency-
amplitude representation of the signal. In the STFT [6], a
non-stationary signal is divided into small portions, which are
assumed to be stationary. This is done using a window
function of chosen width, which is shifted and multiplied
with the signal to obtain the small stationary signals. The
Fourier Transform is then applied to each of these portions
to obtain the STFT of the signal. The problem with the STFT
goes back to the Heisenberg uncertainty principle: one
cannot determine which frequencies exist at a particular
time instant, only which frequency bands exist over a time
interval. This
gives rise to the resolution issue where there is a trade-off
between the time resolution and frequency resolution. To
assume stationarity, the window is supposed to be narrow,
which results in a poor frequency resolution, i.e., it is
difficult to know the exact frequency components that
exist in the signal; only the band of frequencies that exist is
obtained. If the width of the window is increased,
frequency resolution improves but time resolution
becomes poor, i.e., it is difficult to know what frequencies
occur at which time intervals. Once the window function
is decided, the frequency and time resolutions are fixed for
all frequencies and all times.
c. Wavelet Transform
In contrast to STFT, which uses a single analysis
window, the Wavelet Transform [5] uses short windows at
high frequencies and long windows at low frequencies.
This results in multi-resolution analysis by which the
signal is analyzed with different resolutions at different
frequencies, i.e., both frequency resolution and time
resolution vary in the time-frequency plane without
violating the Heisenberg inequality. In Wavelet Transform,
as frequency increases, the time resolution increases;
likewise, as frequency decreases, the frequency resolution
increases. Thus, a certain high frequency component can
be located more accurately in time than a low frequency
component and a low frequency component can be located
more accurately in frequency compared to a high
frequency component.
The wavelet transform is thus well suited to analyzing
non-stationary signals, where both frequency and time
information are needed.
Wavelets are functions defined over a finite interval
and having an average value of zero. The basic idea of the
wavelet transform is to represent any arbitrary function
x(t) as a superposition of a set of such wavelets or basis
functions. These basis functions are obtained from a single
prototype wavelet called the mother wavelet, by dilations
or contractions (scaling) and translations (shifts). The
Discrete Wavelet Transform of a finite length signal x(n)
having N components, for example, is expressed by an N × N
matrix. The generic form of the 1-D wavelet transform is
shown in Fig.3. Here a signal is passed through a low pass
filter h and a high pass filter g, then downsampled by a
factor of 2, constituting one level of the transform.
Multiple levels or scales of the wavelet transform are
obtained by repeating the filtering and decimation on the
low pass branch outputs only. The process is typically
carried out for a finite number of levels K, and the
resulting coefficients d_i^1(n), i ∈ {1,...,K}, together
with d_K^0(n), are called the wavelet coefficients.
Fig.3. Generic form of 1-D wavelet transforms
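The K-level structure of Fig.3 can be sketched as follows, again assuming Haar filters as a simple concrete choice; the low pass branch is iterated, and the high pass outputs are kept as the detail coefficients:

```python
import numpy as np

def dwt_1d(x, levels):
    """K-level 1-D DWT: filter, decimate by 2, recurse on the low pass branch."""
    a = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx = (a[0::2] + a[1::2]) / np.sqrt(2)   # low pass + decimation
        detail = (a[1::2] - a[0::2]) / np.sqrt(2)   # high pass + decimation
        details.append(detail)                      # level-i detail coefficients
        a = approx                                  # iterate on low pass only
    return a, details                               # final approximation + details

x = np.ones(8)                  # a constant signal has no detail at any scale
approx, details = dwt_1d(x, 3)
```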
The 1-D wavelet transform can be extended to a 2-D
wavelet transform using separable wavelet filters. With
separable filters the 2-D transform can be computed by
applying a 1-D transform to all the rows of input, and then
repeating on all of the columns. Fig.4 shows an example of
three-level (k=3) 2-D wavelet expansion, where k
represents the highest level of the decomposition of the
wavelet transform.
Fig.4 Three-level 2-D wavelet expansion
d. Embedded Zero tree Wavelet (EZW) Compression
In octave-band wavelet decomposition each
coefficient in the high-pass bands of the wavelet transform
has four coefficients corresponding to its spatial position in
the octave band above in frequency. This structure of the
decomposition can be exploited when encoding the
coefficients to achieve better compression results. Lewis and
Knowles [5] in 1992 were the first to introduce a tree-like
data structure to represent the coefficients of the octave
decomposition. Later, in 1993 Shapiro [2] called this
structure zero tree of wavelet coefficients, and presented
his elegant algorithm for entropy encoding called
Embedded Zero tree Wavelet (EZW) algorithm. The EZW
algorithm has the following features:
1. A discrete wavelet transform which provides a
compact multiresolution representation of the
image.
2. Zero tree coding which provides a compact
multiresolution representation of significance
maps, which indicates the position of significant
coefficients. Zero trees allow the successful
prediction of insignificant coefficients across
scales to be efficiently represented as a part of
growing trees.
3. Successive Approximation which provides a
compact multiprecision representation of the
significant coefficients and facilitates the
embedding algorithm.
4. Adaptive multilevel arithmetic coding which
provides a fast and efficient method for entropy
coding string of symbols, and requires no pre-
stored tables.
5. The algorithm runs sequentially and stops
whenever a target bit rate is met.
A significance map is defined as an indication of whether
a particular coefficient is zero or nonzero (i.e.,
significant) relative to a given quantization level. The
EZW algorithm [2] determined a very efficient way to
code significance maps not by coding the location of the
significant coefficients, but rather by coding the location of
the zeros. It was found experimentally that zeros could be
predicted very accurately across different scales in the
wavelet transform. Defining a wavelet coefficient as
insignificant with respect to a threshold T if |x | < T, the
EZW algorithm hypothesized that “if a wavelet coefficient
at a coarse scale is insignificant with respect to a given
threshold T, then all wavelet coefficients of the same
orientation in the same spatial location at finer scales are
likely to be insignificant with respect to T.” Recognizing
that coefficients of the same spatial location and frequency
orientation in the wavelet decomposition can be compactly
described using tree structures, the EZW called the set of
insignificant coefficients, or coefficients that are quantized
to zero using threshold T, zero-trees.
Fig.5 Tree structure of wavelet transform
Consider the tree structures on the wavelet transform
shown in Fig.5. In the wavelet decomposition, coefficients
that are spatially related across scale can be compactly
described using these tree structures. With the exception of
the low resolution approximation (LL1) and the highest
frequency bands (HL1, LH1, and HH1) each parent
coefficient at level i of the decomposition spatially
correlates to 4 (child) coefficients at level i−1 of the
decomposition which are at the same frequency
orientation. For the LLk band, each parent coefficient
spatially correlates with 3 child coefficients, one each in
the HLk, LHk, and HHk bands. The standard definitions of
ancestors and descendants in the tree follow directly from
these parent- child relationships. A coefficient is part of a
zero-tree if it is zero and if all of its descendants are zero
with respect to the threshold T. It is a zero-tree root if
it is not part of another zero-tree starting at a coarser scale.
Zero-trees are very efficient for coding since by declaring
only one coefficient a zero-tree root, a large number of
descendant coefficients are automatically known to be
zero. The compact representation, coupled with the fact
that zero-trees occur frequently, especially at low bit rates,
make zero-trees efficient for coding position information.
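The parent-child relationship can be sketched as a simple index mapping (the indexing convention is mine, assuming dyadic subband sizes): the coefficient at (r, c) in a level-i detail band has four children at the same orientation in level i−1.

```python
def children(r, c):
    """Positions of the 4 child coefficients, indexed within the child subband."""
    return [(2 * r, 2 * c), (2 * r, 2 * c + 1),
            (2 * r + 1, 2 * c), (2 * r + 1, 2 * c + 1)]

def descendants(r, c, levels):
    """All descendants of (r, c) across the given number of finer levels."""
    frontier, found = [(r, c)], []
    for _ in range(levels):
        frontier = [pos for p in frontier for pos in children(*p)]
        found.extend(frontier)
    return found
```

Declaring (r, c) a zero-tree root therefore marks 4 + 16 + ... descendant coefficients as zero in one symbol.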
EZW implements successive approximation
quantization through a multipass scanning of the wavelet
coefficients using successively decreasing thresholds
T0, T1, T2, .... The initial threshold is set to
T0 = 2^⌊log2 x_max⌋, where x_max is the largest wavelet
coefficient.
Each scan of wavelet coefficients is divided into two
passes: dominant and subordinate. The dominant pass
establishes a significance map of the coefficients relative
to the current threshold Ti. Thus, coefficients which are
significant on the first dominant pass are known to lie in
the interval [T0 ,2T0 ) , and can be represented with the
reconstruction value of (3T0/2). The dominant pass
essentially establishes the most significant bit of binary
representation of the wavelet coefficient, with the binary
weights being relative to the thresholds Ti.
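The threshold choice and the first dominant pass can be sketched as follows (the coefficient values are an illustrative example of mine):

```python
import numpy as np

def initial_threshold(coeffs):
    """T0 = 2^floor(log2 x_max), so every magnitude lies below 2*T0."""
    x_max = np.max(np.abs(coeffs))
    return 2.0 ** np.floor(np.log2(x_max))

def dominant_pass(coeffs, T):
    """Significance map for threshold T; significant coefficients are
    reconstructed at the interval midpoint 3T/2."""
    significant = np.abs(coeffs) >= T
    recon = np.where(significant, np.sign(coeffs) * 1.5 * T, 0.0)
    return significant, recon

coeffs = np.array([63.0, -34.0, 10.0, 7.0])
T0 = initial_threshold(coeffs)         # x_max = 63 -> T0 = 32
sig, recon = dominant_pass(coeffs, T0)
```

Coefficients found significant on this first pass are known to lie in [T0, 2T0) and get the reconstruction value 3T0/2, as described in the text.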
e. Set Partitioning in Hierarchical Trees (SPIHT)
Said and Pearlman [3] offered an alternative
explanation of the principles of operation of the EZW
algorithm to better understand the reasons for its excellent
performance. According to them, partial ordering by
magnitude of the transformed coefficients with a set
partitioning sorting algorithm, ordered bit plane
transmission of refinement bits, and exploitation of self-
similarity of the image wavelet transform across different
scales of an image are the three key concepts in EZW. In
addition, they offer a new and more effective
implementation of the modified EZW algorithm based on
set partitioning in hierarchical trees, and call it the SPIHT
algorithm. They also present a scheme for progressive
transmission of the coefficient values that incorporates the
concepts of ordering the coefficients by magnitude and
transmitting the most significant bits first. SPIHT uses a
uniform scalar quantizer, and the authors claim that the
ordering information makes this simple quantization method
more efficient than expected. An efficient way to code the
ordering information is also proposed. Results from the
SPIHT coding algorithm in most cases surpass those
obtained from the EZW algorithm.
f. Scalable Image Compression with EBCOT
This algorithm is based on independent Embedded
Block Coding with Optimized Truncation of the embedded
bit-streams (EBCOT). EBCOT algorithm [1] uses a
wavelet transform to generate the subband coefficients
which are then quantized and coded. Although the dyadic
wavelet decomposition is typical, other "packet"
decompositions are also supported and occasionally
preferable. Scalable compression refers to the generation
of a bit-stream which contains embedded subsets, each of
which represents an efficient compression of the original
image at a reduced resolution or increased distortion. A
key advantage of scalable compression is that the target
bit-rate or reconstruction resolution need not be known at
the time of compression. Another advantage of practical
significance is that the image need not be compressed
multiple times in order to achieve a target bit-rate, as is
common with the existing JPEG compression standard.
Rather than focusing on generating a single scalable bit-
stream to represent the entire image, EBCOT partitions
each subband into relatively small blocks of samples and
generates a separate highly scalable bit-stream to represent
each so-called code-block. The algorithm exhibits state-of-
the-art compression performance while producing a bit-
stream with an unprecedented feature set, including
resolution and SNR scalability together with a random
access property. The algorithm has modest complexity and
is extremely well suited to applications involving remote
browsing of large compressed images.
IV. PERFORMANCE COMPARISON
Fig.6 (a) PSNR results for LENA
Fig.6 (b) PSNR results for BARBARA
V. CONCLUSION
A number of coding techniques have been proposed
since the introduction of the EZW algorithm. A common
characteristic of these techniques is that they use the basic
ideas found in the EZW algorithm. The wavelet coders are
much closer to the EZW algorithm than to the subband
coding. SPIHT became very popular since it was able to
achieve equal or better performance than EZW without
having to use an arithmetic encoder. The reduction in
complexity from eliminating the arithmetic encoder is
significant. Another technique, EBCOT algorithm, has
been chosen as the basis of the JPEG 2000 standard. The
performance comparison of these techniques has been
discussed in the previous section. Comparison of EZW,
subband coding and the other techniques shows that, owing
to its multiresolution property and its performance, lossy
wavelet image coding has matured significantly and
provides a very strong basis for the new JPEG 2000 coding
standard.
VI. REFERENCES
[1] Taubman, D. „High Performance Scalable Image
Compression with EBCOT', IEEE Trans. IP, Mar. 1999.
[2] Shapiro, J. M. „Embedded Image Coding Using Zerotrees of
Wavelet Coefficients‟, IEEE Trans. SP, vol. 41, no. 12, Dec.
1993, pp. 3445-3462.
[3] Said, A. and Pearlman, W. A. „A New, Fast and Efficient
Image Codec Based on Set Partitioning in Hierarchical
Trees‟, IEEE Trans. CSVT, vol. 6, no. 3, June 1996, pp. 243-
250,
[4] Woods, J. W. and O'Neil, S. D. „Subband Coding of Images‟
IEEE Trans. ASSP, vol. 34, no. 5, October 1986, pp. 1278-
1288.
[5] Lewis, A. S. and Knowles, G. „Image Compression Using
the 2-D Wavelet Transform‟, IEEE Trans. IP, vol. 1, no. 2,
April 1992, pp. 244-250.
[6] Gonzalez, R.C. and Woods, R.E., Digital Image Processing,
2nd edition, Pearson Education, 2004, pp. 409 – 510.
[Fig.6(a) plots PSNR (dB, 0-45) versus bit rate (0.0625, 0.125, 0.25, 0.5 and 1 bpp) for Lena, with curves for SC, WT, EZW, SPIHT and EBCOT; Fig.6(b) plots PSNR (dB, 0-40) versus the same bit rates for Barbara, with curves for EZW, SPIHT and EBCOT.]
Comparative Study of Lifting-Based
Discrete Wavelet Transform Architectures
Vidyadhar Gupta, Krishna Raj
Department of Electronics Engineering
Harcourt Butler Technological Institute, Kanpur
Abstract. In this paper, we provide a comparative
study of different existing architectures for efficient
implementation of the lifting-based Discrete Wavelet
Transform (DWT). The basic principle behind the
lifting-based scheme is to decompose the finite
impulse response filters of the wavelet transform into
a finite sequence of simple filtering steps.
Keywords: Architecture, Discrete wavelet
transform, lifting
1. Introduction The Discrete Wavelet Transform
(DWT) has become a very versatile signal
processing tool over the last decade. It has been
used effectively in signal and image processing
applications ever since Mallat [4] proposed the
multiresolution representation of signals based on
wavelet decomposition. In fact, the lifting-based DWT
is the basis of the new JPEG2000 image
compression standard, which has been shown to
have superior performance compared to the current
JPEG standard [5]. The main feature of the lifting-
based DWT scheme is to break up the high-pass
and low-pass wavelet filters into a sequence of
upper and lower triangular matrices, and to convert
the filter implementation into banded matrix
multiplications [6]. The popularity of the lifting-based
DWT has triggered the development of several
architectures in recent years, ranging from highly
parallel architectures to programmable DSP-based
architectures to folded architectures. In this paper we
present a comparative study of these architectures:
we provide a systematic derivation of each and
compare them on the basis of hardware utilization
and critical path latency. The rest of the paper is
organized as follows. In Section 2, we briefly explain
the mathematical formulation and principles behind the
lifting scheme. In Section 3, we present a number of
one-dimensional lifting-based DWT architectures;
specifically, we describe the direct mapping of the
data dependency diagram of the lifting scheme into a
pipelined architecture, the folded architecture, the MAC-
based programmable architecture, and the flipping
architecture. In Section 4, we compare the
hardware and critical path latency of all the
architectures. We conclude the paper in Section 5.
2. DWT and Lifting Implementation
In the traditional convolution (filtering) based
approach to computing the forward DWT, the
input signal s is filtered separately by a low-pass
filter h̃ and a high-pass filter g̃. The two output
streams are then sub-sampled by simply dropping
the alternate output samples in each stream to
produce the low-pass (y_L) and high-pass (y_H) sub-
band outputs, as shown in Fig. 1.
The two filters (h̃, g̃) form the analysis filter bank.
The original signal can be reconstructed by a
synthesis filter bank (h, g) starting from y_L and y_H,
as shown in Fig. 1. Given a discrete signal s(n), the
output signals y_L(n) and y_H(n) in Fig. 1 can be
computed as:

y_L(n) = Σ_{i=0}^{τ_h − 1} h̃(i) s(2n − i),
y_H(n) = Σ_{i=0}^{τ_g − 1} g̃(i) s(2n − i)    (1)

where τ_h and τ_g are the lengths of the low-pass filter
h̃ and the high-pass filter g̃, respectively. During
the inverse transform computation, both y_L and y_H
are first up-sampled by inserting zeros in between
samples and then filtered by the low-pass (h) and
high-pass (g) filters, respectively. The results are then
added together to obtain the reconstructed signal
s′, as shown in Fig. 1.
Fig. 1. Signal analysis and reconstruction in 1D DWT — the input s is filtered and downsampled (↓2) into two sub-band streams, which are upsampled (↑2), filtered by the synthesis pair (h, g) and summed to give the reconstructed signal s′.
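As a concrete illustration of Eq. (1), the following sketch (hypothetical Python, with unnormalized Haar filters standing in for (h̃, g̃), simple border clamping, and the filter index offset by one so the first output aligns with the first pair of samples) filters the input and keeps every other output sample:

```python
# Convolution-based 1D DWT analysis: filter, then downsample by 2.
# Haar filters are used purely as a stand-in example analysis pair.
h_t = [0.5, 0.5]   # low-pass analysis filter (Haar-like, unnormalized)
g_t = [0.5, -0.5]  # high-pass analysis filter

def analyze(s):
    """Return (lowpass, highpass) sub-bands of signal s."""
    def filt_down(f):
        out = []
        for n in range(len(s) // 2):
            acc = 0.0
            for i, c in enumerate(f):
                idx = 2 * n + 1 - i            # phase-aligned index; clamp at borders
                idx = min(max(idx, 0), len(s) - 1)
                acc += c * s[idx]
            out.append(acc)
        return out
    return filt_down(h_t), filt_down(g_t)

lo, hi = analyze([4, 6, 10, 12])
# lo -> [5.0, 11.0] (pairwise averages), hi -> [1.0, 1.0] (half-differences)
```

The downsampling is done implicitly by advancing the filter window two input samples per output, rather than filtering everything and discarding alternate outputs.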
For multiresolution wavelet decomposition, the low-
pass sub-band y_L is further decomposed in a
similar fashion in order to get the second level of
decomposition, and the process is repeated. The
inverse process follows similar multi-level
synthesis filtering in order to reconstruct the signal.
Since two-dimensional wavelet filters are separable
functions, the 2D DWT can be obtained by first
applying the 1D DWT row-wise (to produce L and
H sub-bands in each row) and then column-wise.
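The row-then-column procedure can be sketched with a toy pairwise average/difference step standing in for the 1D DWT (hypothetical Python; a real codec would use the actual wavelet filters and handle odd dimensions):

```python
def haar_step(v):
    """One 1D analysis step: pairwise averages (L) followed by differences (H)."""
    a = [(v[i] + v[i + 1]) / 2 for i in range(0, len(v), 2)]
    d = [(v[i] - v[i + 1]) / 2 for i in range(0, len(v), 2)]
    return a + d

def dwt2d(img):
    """Separable 2D DWT: 1D pass along each row, then along each column."""
    rows = [haar_step(r) for r in img]        # row-wise pass: each row becomes L | H
    cols = list(zip(*rows))                   # transpose to reach the columns
    out = [haar_step(list(c)) for c in cols]  # column-wise pass
    return [list(r) for r in zip(*out)]       # transpose back: LL HL / LH HH layout
```

On a 4×4 image of constant 2×2 blocks, all detail sub-bands come out zero and the LL quadrant holds the block values, which is an easy sanity check of the separability argument.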
For the filter bank in Fig. 1, the conditions for perfect
reconstruction of a signal [3] are given by

h(z) h̃(z⁻¹) + g(z) g̃(z⁻¹) = 2
                                        (2)
h(z) h̃(−z⁻¹) + g(z) g̃(−z⁻¹) = 0

where h(z) is the Z-transform of the FIR filter h. It
can be expressed as a Laurent polynomial of degree
p as

h(z) = Σ_{i=0}^{p} h_i z⁻ⁱ

This can also be expressed using a polyphase
representation as

h(z) = h_e(z²) + z⁻¹ h_o(z²)    (3)

where h_e contains the even coefficients and h_o
contains the odd coefficients of the FIR filter h.
Similarly,

g(z) = g_e(z²) + z⁻¹ g_o(z²),
h̃(z) = h̃_e(z²) + z⁻¹ h̃_o(z²),    (4)
g̃(z) = g̃_e(z²) + z⁻¹ g̃_o(z²)

Based on the above formulation, we can define the
polyphase matrices (rows separated by semicolons) as

P(z) = [ h_e(z)  g_e(z) ;  h_o(z)  g_o(z) ],
                                        (5)
P̃(z) = [ h̃_e(z)  g̃_e(z) ;  h̃_o(z)  g̃_o(z) ]

Often P̃(z) is called the dual of P(z), and for
perfect reconstruction they are related as
P(z) P̃(z⁻¹)ᵀ = I, where I is the 2×2 identity matrix.
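The polyphase split in Eq. (3) is easy to sanity-check numerically; the snippet below (hypothetical Python, with an arbitrary example filter) verifies h(z) = h_e(z²) + z⁻¹ h_o(z²) at an arbitrary evaluation point:

```python
# Numeric check of the polyphase decomposition for an example filter
# (the coefficients here are arbitrary and purely illustrative).
h = [0.25, 0.5, 0.75, 0.5]      # h(z) = 0.25 + 0.5 z^-1 + 0.75 z^-2 + 0.5 z^-3
h_e, h_o = h[0::2], h[1::2]     # even / odd coefficient subsequences

def evaluate(coeffs, z):
    """Evaluate a causal Laurent polynomial sum_i c_i z^-i at a point z."""
    return sum(c * z ** (-i) for i, c in enumerate(coeffs))

z = 1.7   # any nonzero test point
lhs = evaluate(h, z)
rhs = evaluate(h_e, z ** 2) + z ** -1 * evaluate(h_o, z ** 2)
assert abs(lhs - rhs) < 1e-9    # both sides agree up to rounding
```

The even/odd slicing `h[0::2]` / `h[1::2]` is exactly the "lazy wavelet" split that the lifting scheme applies to the sample stream itself.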
Now the wavelet transform in terms of the
polyphase matrices can be expressed as

[ y_L(z) ;  y_H(z) ] = P̃(z⁻¹)ᵀ [ s_e(z) ;  z⁻¹ s_o(z) ]
[ s_e(z) ;  z⁻¹ s_o(z) ] = P(z) [ y_L(z) ;  y_H(z) ]

for the forward DWT and inverse DWT,
respectively. If the determinant of P(z) is unity, it
can be shown by applying Cramer's rule that

h̃_e(z⁻¹) = g_o(z),   h̃_o(z⁻¹) = −g_e(z),
g̃_e(z⁻¹) = −h_o(z),  g̃_o(z⁻¹) = h_e(z),

and hence

h̃(z) = −z⁻¹ g(−z⁻¹),   g̃(z) = z⁻¹ h(−z⁻¹).

When the determinant of P(z) is unity, the
synthesis filter pair (h, g) and the analysis filter
pair (h̃, g̃) are both complementary. When (h̃, g̃)
= (h, g), the wavelet transform is called orthogonal;
otherwise it is biorthogonal. We can apply the
Euclidean algorithm to factorize P̃(z) into a finite
sequence of alternating upper and lower triangular
matrices as follows:

P̃(z) = { ∏_{i=1}^{m} [ 1  s̃_i(z) ;  0  1 ] [ 1  0 ;  t̃_i(z)  1 ] } [ K  0 ;  0  1/K ]

where K is a constant that acts as a scaling factor (as
does 1/K), and s̃_i(z) and t̃_i(z) (for 1 ≤ i ≤ m) are Laurent
polynomials of lower order. Multiplication by the
upper triangular matrix is known as primal lifting,
and this is referred to in the literature as lifting the
low-pass sub-band with the help of the high-pass
sub-band. Similarly, multiplication by the lower
triangular matrix is called dual lifting, which is
lifting the high-pass sub-band with the help of the
low-pass sub-band. Often these two basic lifting
steps are called update and predict, respectively. The
dual polyphase factorization, which also consists of
predict and update steps, can be represented in the
same form:

P(z) = { ∏_{i=1}^{m} [ 1  s_i(z) ;  0  1 ] [ 1  0 ;  t_i(z)  1 ] } [ K  0 ;  0  1/K ]
Hence the lifting-based forward wavelet transform
essentially first applies the lazy wavelet on the
input stream (splitting it into even and odd samples), then
alternately executes primal and dual lifting steps,
and finally scales the two output streams by 1/K and K,
respectively, to produce the high-pass and low-pass sub-
bands, as shown in Fig. 2(a).
Fig. 2. Lifting-based DWT and IDWT: (a) forward transform — split, predict/update lifting steps, and scaling by K and 1/K; (b) inverse transform — inverse scaling, inverse lifting steps, and merge.
The inverse DWT can be derived by traversing the
above steps in the reverse direction: first scaling the
low-pass and high-pass sub-band inputs by 1/K and
K respectively, then applying the dual and
primal lifting steps after reversing the signs of the
coefficients in s_i(z) and t_i(z), and finally applying the
inverse lazy transform by up-sampling the outputs before
merging them into a single reconstructed stream, as
shown in Fig. 2(b).
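The forward/inverse procedure just described can be made concrete for the two-step (5, 3) filter. The sketch below is a hypothetical Python illustration using the well-known (5, 3) predict/update coefficients (−1/2 and 1/4), with simple border clamping in place of true symmetric extension and an even-length input assumed:

```python
def dwt53_forward(x):
    """One level of the (5, 3) lifting DWT: split, predict (dual), update (primal)."""
    even, odd = x[0::2], x[1::2]          # lazy wavelet: split the stream
    n = len(odd)
    # Predict: high-pass = odd sample minus the average of its even neighbours
    d = [odd[i] - 0.5 * (even[i] + even[min(i + 1, len(even) - 1)])
         for i in range(n)]
    # Update: low-pass = even sample plus a quarter of the neighbouring details
    a = [even[i] + 0.25 * (d[max(i - 1, 0)] + d[min(i, n - 1)])
         for i in range(len(even))]
    return a, d

def dwt53_inverse(a, d):
    """Inverse transform: undo update, undo predict, merge (even-length input)."""
    n = len(d)
    even = [a[i] - 0.25 * (d[max(i - 1, 0)] + d[min(i, n - 1)])
            for i in range(len(a))]
    odd = [d[i] + 0.5 * (even[i] + even[min(i + 1, len(even) - 1)])
           for i in range(n)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
a, d = dwt53_forward(x)
assert dwt53_inverse(a, d) == x   # perfect reconstruction
```

Because each lifting step only adds a function of the other channel, the inverse simply subtracts the same quantity, so reconstruction is exact even with this crude border rule.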
3. Lifting Architectures for 1D DWT
The data dependencies in the lifting scheme can be
explained with the help of an example of DWT
filtering with four factors (four lifting steps).
The four lifting steps correspond to the four stages
shown in Fig. 3. The intermediate results generated
in the first two stages for the first two lifting steps
are subsequently processed to produce the high-
pass (HP) outputs in the third stage, followed by
the low-pass (LP) outputs in the fourth stage. The (9, 7)
filter is an example of a filter that requires four
lifting steps. For DWT filters requiring only
two factors, such as the (5, 3) filter, the
intermediate two stages can simply be bypassed.
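To make the four stages concrete, here is a hypothetical Python sketch of the four lifting steps using the widely published CDF 9/7 coefficients (α, β, γ, δ and the scaling constant K). Border handling is simple clamping rather than the symmetric extension a real codec would use, and the K / 1/K scaling convention is one common choice:

```python
ALPHA, BETA = -1.586134342, -0.052980119   # stage-1 and stage-2 coefficients
GAMMA, DELTA = 0.882911076, 0.443506852    # stage-3 and stage-4 coefficients
K = 1.149604398                            # scaling constant

def _predict(ev, od, c):
    # odd[i] += c * (even[i] + even[i+1]); border clamped
    m = len(ev)
    return [od[i] + c * (ev[i] + ev[min(i + 1, m - 1)]) for i in range(len(od))]

def _update(ev, od, c):
    # even[i] += c * (odd[i-1] + odd[i]); border clamped
    n = len(od)
    return [ev[i] + c * (od[max(i - 1, 0)] + od[min(i, n - 1)]) for i in range(len(ev))]

def dwt97_forward(x):
    ev, od = x[0::2], x[1::2]
    od = _predict(ev, od, ALPHA)   # stage 1: first dual lift
    ev = _update(ev, od, BETA)     # stage 2: first primal lift
    od = _predict(ev, od, GAMMA)   # stage 3: HP samples ready
    ev = _update(ev, od, DELTA)    # stage 4: LP samples ready
    return [e * K for e in ev], [o / K for o in od]

def dwt97_inverse(lp, hp):
    ev, od = [e / K for e in lp], [o * K for o in hp]
    ev = _update(ev, od, -DELTA)   # undo the stages in reverse order
    od = _predict(ev, od, -GAMMA)
    ev = _update(ev, od, -BETA)
    od = _predict(ev, od, -ALPHA)
    out = []
    for e, o in zip(ev, od):
        out.extend([e, o])
    return out
```

Since every stage adds a function of the opposite channel, the inverse recovers the input exactly (up to floating-point rounding) even with this simplified border rule; only the four stage computations matter for the data dependency diagram of Fig. 3.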
3.1 Direct Mapped Architecture
A direct mapping of the data dependency diagram
into a pipelined architecture was proposed by Liu
et al. in [7] and is depicted in Fig. 4. The architecture
is designed with 8 adders (A1-A8), 4 multipliers
(M1-M4), 6 delay elements (D) and 8 pipeline
registers (R). There are two input lines to the
architecture: one that inputs even samples (s2i) and
one that inputs odd samples (s2i+1). There
are four pipeline stages in the architecture. In the
first pipeline stage, adder A1 computes s2i + s2i-2 and
adder A2 computes α(s2i + s2i-2) + s2i-1. The output of
A2 corresponds to the intermediate results
generated in the first stage of Fig. 3. The output of
adder A4 in the second pipeline stage corresponds
to the intermediate results generated in the second
stage of Fig. 3. Continuing in this fashion, adder A6
in the third pipeline stage produces the high-pass
output samples, and adder A8 in the fourth pipeline
stage produces the low-pass output samples. For
lifting schemes that require only two lifting steps,
such as the (5, 3) filter, the last two pipeline stages
need to be bypassed, causing the hardware
utilization to drop to 50% or less. Also, with a
single read-port memory, the odd and even samples
are read serially in alternate clock cycles and
buffered, which slows down the overall pipelined
architecture by 50% as well.
3.2 Folded Architecture
The pipelined architecture in Fig. 4 can be further
improved by carefully folding the last two pipeline
stages into the first two, as shown in Fig. 5.
The architecture proposed by Lian et al. in [2]
consists of two pipeline stages, with three pipeline
registers, R1, R2 and R3. In the (9, 7) type filtering
operation, intermediate data (R3) generated after
the first two lifting steps (Phase 1) are folded back
to R1 (as shown in Fig. 5) for computation of the
last two lifting steps (Phase 2). The architecture can
be reconfigured so that the computation of the two phases
can be interleaved by selection of appropriate data
by the multiplexors. As a result, two delay registers
(D) are needed in each lifting step in order to
properly schedule the data in each phase. Depending on
the phase of the interleaved computation, the
coefficient for multiplier M1 is either α or γ, and
similarly the coefficient for multiplier M2 is β or δ.
The hardware utilization of this architecture is
always 100%. Note that for the (5, 3) type filter
operation, folding is not required.
3.3 MAC-Based Programmable Architecture [3]
A programmable architecture that implements the
data dependencies represented in Fig. 3 using four
MACs (Multiply and ACcumulate units) and nine
registers has been proposed by Chang et al. in [3].
The algorithm is executed in two phases, as shown
in Fig. 6. The data-flow of the proposed architecture
can be explained in terms of the register allocation
of the nodes. The computation and allocation of the
registers in Phase 1 are done in the following order:

R0 ← s2i-1;  R2 ← s2i;
R3 ← R0 + α(R1 + R2);
R4 ← R1 + β(R5 + R3);
R8 ← R5 + γ(R6 + R4);
Output LP ← R6 + δ(R7 + R8);
Output HP ← R8

Similarly, the computation and register allocation
in Phase 2 are done in the following order:

R0 ← s2i+1;  R1 ← s2i+2;
R5 ← R0 + α(R2 + R1);
R6 ← R2 + β(R3 + R5);
Output LP ← R4 + γ(R8 + R7);
Output HP ← R7

As a result, two samples are input per phase and
two samples (LP and HP) are output at the end of
every phase. For a 2D DWT implementation, the
output samples are also stored in a temporary
buffer for filtering in the vertical dimension.
3.4 Flipping Architecture [1]
While conventional lifting-based architectures
require fewer arithmetic operations, they
sometimes have long critical paths. For instance,
the critical path of the lifting-based architecture for
the (9, 7) filter is 4Tm + 8Ta while that of the
convolution implementation is Tm + 4Ta.
[Fig. 3: Data dependency diagram for lifting of filters with four factors — samples s0…s8 pass through four stages weighted by α, β, γ and δ, producing the HP and LP outputs with scaling by 1/K and K.]
[Fig. 4: The direct mapped architecture [7] — adders A1-A8, multipliers M1-M4, delay elements D and pipeline registers R1-R4 arranged in four pipeline stages, with even (s2i) and odd (s2i+1) input lines and HP/LP outputs.]
[Fig. 5: The folded architecture in [2] — two pipeline stages with registers R1-R3; the multiplier coefficients α/γ and β/δ are selected according to the phase, and the outputs are scaled by K and 1/K.]
One way of improving this is pipelining, which
however results in a significant increase in the number of
registers. For instance, to pipeline the lifting-based
(9, 7) filter such that the critical path is Tm + 2Ta, 6
additional registers are required. Huang et al. [1]
proposed a very efficient way of solving this timing
accumulation problem. The basic idea is to remove
the multiplications along the critical path by scaling
the remaining paths by the inverse of the multiplier
coefficients. Fig. 7(a)-(b) describes how scaling at
each level can reduce the multiplications in the
critical path. The critical path is then Tm + 5Ta.
The minimum critical path of Tm can be achieved
with 5 pipelining stages using 11 pipelining registers
(not shown in the figure). Detailed hardware
analysis of the lossy (9, 7), integer (9, 7) and (6, 10)
filters has been included in [1]. Furthermore,
since the flipping transformation changes the
round-off noise considerably, techniques to address
precision and noise problems have also been
addressed in [1].
[Fig. 6: Data-flow and register allocation of the MAC-based architecture — inputs cycle through R0, R1 and R2; first- and second-stage intermediate results occupy R3-R6; the HP and LP outputs are produced from R7 and R8 with scaling by 1/K and K, alternating between Phase 1 and Phase 2.]
[Fig. 7: A flipping architecture [1]: (a) the original lifting architecture with coefficients α, β, γ, δ and output scaling 1/K, K; (b) the coefficients flipped to 1/α, 1/β, 1/γ, 1/δ to reduce the number of multiplications on the critical path.]
3.5 Efficient Folded Architecture [8]
The conventional lifting scheme processes the
intermediate data serially; thus, the critical path
latency is very long. The way the intermediate
data are processed determines the hardware scale
and critical path latency of the implementing
architecture. Since some intermediate data lie on
different paths, they can be calculated in parallel.
With this parallel operation, the critical path latency
is reduced and the number of registers is decreased;
the architecture is therefore called the efficient
folded architecture. The critical path latency is
reduced to Tm + Ta.
4. Comparison of Performances
We can compare the performances of the different
architectures on the basis of hardware requirements
and critical path latency. The hardware complexity
is described in terms of data path components. The
comparison of the different architectures is shown
in Table I.
Table I [8] (for 9/7 lifting-based DWT)

Architecture                 | Multipliers | Adders | Registers | Critical path latency | Control complexity | Throughput rate (per cycle)
Direct                       | 4           | 8      | 6         | 4Tm + 8Ta             | Simple             | 2 inputs/outputs
Direct + full pipeline       | 4           | 8      | 32        | Tm                    | Simple             | 2 inputs/outputs
Folded                       | 2           | 4      | 12        | Tm + 2Ta              | Medium             | 1 input/output
Flipping                     | 4           | 8      | 4         | Tm + 5Ta              | Complex            | 2 inputs/outputs
Flipping + 5-stage pipeline  | 4           | 8      | 11        | Tm                    | Complex            | 2 inputs/outputs
Efficient folded             | 2           | 4      | 10        | Tm + Ta               | Medium             | 1 input/output

(Tm denotes the latency of a multiplier; Ta denotes the latency of an adder.)
5. Conclusion
In this paper, we presented a comparison of the
existing lifting-based implementations of the one-
dimensional Discrete Wavelet Transform. We
briefly described the principles behind the lifting
scheme in order to better understand the different
implementation styles and structures. We provided a
systematic derivation of each architecture and
evaluated them with respect to their hardware and
timing requirements.
References
[1] C.T. Huang, P.C. Tseng, and L.G. Chen,
"Flipping Structure: An Efficient VLSI
Architecture for Lifting-Based Discrete Wavelet
Transform," IEEE Transactions on Signal
Processing, 2004, pp. 1080-1089.
[2] C.J. Lian, K.F. Chen, H.H. Chen, and L.G.
Chen, "Lifting Based Discrete Wavelet Transform
Architecture for JPEG2000," in IEEE International
Symposium on Circuits and Systems, Sydney,
Australia, 2001, pp. 445-448.
[3] W.H. Chang, Y.S. Lee, W.S. Peng, and C.Y.
Lee, "A Line-Based, Memory Efficient and
Programmable Architecture for 2D DWT Using
Lifting Scheme," in IEEE International Symposium
on Circuits and Systems, Sydney, Australia, 2001,
pp. 330-333.
[4] S. Mallat, "A Theory for Multiresolution Signal
Decomposition: The Wavelet Representation,"
IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 11, no. 7, 1989, pp. 674-693.
[5] T. Acharya and P.S. Tsai, JPEG2000 Standard
for Image Compression: Concepts, Algorithms and
VLSI Architectures. John Wiley & Sons, Hoboken,
New Jersey, 2004.
[6] I. Daubechies and W. Sweldens, "Factoring
Wavelet Transforms into Lifting Schemes," The J.
of Fourier Analysis and Applications, vol. 4, 1998,
pp. 247-269.
[7] C.C. Liu, Y.H. Shiau, and J.M. Jou, "Design and
Implementation of a Progressive Image Coding
Chip Based on the Lifted Wavelet Transform," in
Proc. of the 11th VLSI Design/CAD Symposium,
Taiwan, 2000.
[8] Weifeng Liu, Li Zhang, and Fu Li, "An Efficient
Folded Architecture for Lifting-Based Discrete
Wavelet Transform," IEEE Transactions on
Circuits and Systems—II: Express Briefs, vol. 56,
no. 4, April 2009.
[10] Xiaonan Fan, Zhiyong Pang, Dihu Chen, and H.Z.
Tan, "A Pipeline Architecture for 2-D Lifting-
based Discrete Wavelet Transform of JPEG2000,"
IEEE, 2010.
A Novel Approach in Image De-noising for Salt
& Pepper Noise
J S Bhat¹, B N Jagadale² and Lakshminarayan H K²
¹ Dept. of Physics, Karnatak University, Dharwad, India. Email: [email protected]
² Dept. of Electronics, Kuvempu University, Shankaragatta, India. Email: [email protected]; [email protected]
Abstract-The de-noising of an image corrupted by
salt and pepper noise has been a classical problem in
image processing. In the last decade, various
modified median filtering schemes have been
developed, under various signal/noise models, to
deliver improved performance over traditional
methods. In this paper a simple method called the
Interpolate Median Filter (IMF) is proposed to
restore images corrupted by salt and pepper
noise. The proposed method is better at preserving
image details while suppressing noise. The
experimental results show that the proposed
algorithm outperforms the conventional median
filter and other algorithms such as the minimum-
maximum exclusive mean filter (MMEM) and
adaptive median filtering (AMF) in terms of signal-
to-noise ratio.
Key words- Image de-noising, interpolate median
filter, nonlinear filter, salt & pepper noise
I. INTRODUCTION
An image is often corrupted by noise
during its acquisition and transmission.
Image de-noising is used to reduce the noise
while retaining the important features of the
image. There is always a tradeoff between
the noise removed and the blurring
introduced in the image. The intensity of
impulse noise tends to be either relatively
high or relatively low, which degrades
the image quality. Image de-noising is
therefore used as a preprocessing step for
edge detection, image segmentation, object
recognition, etc.
A variety of filtering techniques have been
proposed for enhancing images degraded by
noise. Classical linear digital image
filters, such as averaging lowpass filters,
tend to blur edges and other fine image
details. Nonlinear filters [1, 2] are
therefore preferred over linear filters due to their
improved filtering performance in terms of
noise suppression and edge preservation.
The standard median (SM) filter [3] is
one of the most robust nonlinear filters;
it exploits the rank-order information of
pixel intensities within the filtering window.
This filter is very popular due to its edge-
preserving characteristics and its simplicity
of implementation. Various modifications of
the SM filter have been introduced, such as
the weighted median (WM) filter [4]. By
incorporating a noise detection mechanism
into the conventional median filtering
approach, filters like the switching median
filters [5, 6] have shown significant
performance improvement. The median
filter, as well as its modifications and
generalizations [7], is typically applied
uniformly across an image. Examples
include the minimum-maximum exclusive
mean filter (MMEM) [8], Florencio's filter [9],
and the adaptive median filter (AMF) [10].
These filters have demonstrated excellent
performance, but their main drawbacks are
that they are prone to edge jitter when the
noise density is high, that a large window
size results in blurred images, and that their
computational complexity is significant. To
solve these problems, a modified median
filter algorithm called the Interpolate Median
filter, which employs interpolated search in
determining the desired central pixel value,
is proposed.
The paper is organized as follows: Section
II gives a brief review of mean and median
filtering. The new approach, the Interpolate
Median filter technique, is explained in
Section III. Experimental results are
presented in Section IV. Finally, Section V
gives the conclusion.
II. MEAN & MEDIAN FILTERING
MEAN FILTER
Mean filtering is a simple, easy to
implement method of smoothing images, i.e.
of reducing the amount of intensity variation
between one pixel and the next. It is often
used to reduce noise in images.
The idea of mean filtering is simply to
replace each pixel value in an image with
the mean ('average') value of its neighbors,
including itself, thereby eliminating pixel
values which are unrepresentative of
their surroundings. The drawback of this
algorithm appears with salt and pepper
noise: when the image is smoothed with a 3×3
mean filter, the shot-noise pixel values,
which are often very different from the
surrounding values, significantly distort the
pixel average calculated by the mean filter.
MEDIAN FILTER
The median filter is normally used to
reduce noise in an image, like the mean
filter; however, it does better at preserving
useful details in the image. Like the mean
filter, the median filter considers each pixel
in the image, but instead of simply replacing
the pixel value with the mean of the neighboring
pixel values, it replaces it with their median.
The median is calculated by
first sorting all the pixel values from the
surrounding neighborhood into numerical
order and then replacing the pixel being
considered with the middle pixel value. The
median filter, especially with a larger window
size, destroys fine image details due to its
rank-ordering process. Figure 1 illustrates an
example calculation.

Neighborhood values: 115, 119, 120, 123,
124, 125, 126, 127, 150
Median value: 124

Fig. 1. Calculating the median value of a 3×3 pixel
neighborhood. The central pixel value of 150 is rather
unrepresentative of the surrounding pixels and is
replaced with the median value, 124.
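The mean/median contrast above can be checked directly; this hypothetical Python snippet applies both operations to the 3×3 neighborhood of Fig. 1:

```python
# The 3x3 neighborhood from Fig. 1, centred on the impulse value 150
window = [115, 119, 120, 123, 124, 125, 126, 127, 150]

mean_val = sum(window) / len(window)           # the impulse drags the mean upward
median_val = sorted(window)[len(window) // 2]  # rank ordering ignores the impulse

print(round(mean_val, 2))   # about 125.44
print(median_val)           # 124
```

The single outlier shifts the mean by more than a full gray level, while the median output is unaffected, which is exactly the edge- and detail-preserving behavior the section describes.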
III. INTERPOLATE MEDIAN FILTER
The Interpolate Median filter method
considers each pixel in the image in turn and
looks at its neighbors to decide whether or
not it is representative of its surroundings.
Instead of replacing the pixel value with the
median of the neighboring pixel values, it
replaces it with an interpolation of those
values.
The interpolation is calculated by first
sorting all pixel values from the surrounding
neighborhood into numerical order and then
replacing the pixel being considered with
the interpolated pixel value. The calculation
of the interpolated value is derived from the
[Fig. 1 source data — the 5×5 image patch whose central 3×3 window is used in the example:
110 125 125 130 140
123 124 126 127 136
114 120 150 125 134
118 115 119 123 134
111 116 111 120 131]
interpolation search technique used for
searching for elements in a sorted array. We
can also call it a nonlinear filter or order-
statistic filter, because its response is based
on the ordering (ranking) of the pixels
contained within the mask. The advantage of
this filter over the mean and median filters
is that it gives a more robust average than
both methods: for some pixels in the
neighborhood it creates new pixel values,
like the mean filter, and for others it does
not, like the median filter; it has the
characteristics of both filters.
The algorithm uses the following formula:

Key = (a[l] + a[h]) / 2        (1)

where Key is an intelligent guess at the mid
value of the sorted window array a, and a[l]
and a[h] are the values of its bottom and top
elements.

Mid = l + (h − l) × (Key − a[l]) / (a[h] − a[l])        (2)

Here Mid gives the optimal mid-point of the
array, and a[Mid] gives the interpolated
value. This interpolated value becomes the
new value of the pixel.
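Equations (1) and (2) can be combined into a small routine. The sketch below (hypothetical Python; the guard against a constant window, where a[h] = a[l], is an added assumption to avoid division by zero) returns the interpolated replacement value for one filter window:

```python
def interpolate_median(window):
    """Interpolated replacement value for a filter window, per Eqs. (1)-(2)."""
    a = sorted(window)
    l, h = 0, len(a) - 1
    if a[h] == a[l]:
        return a[l]                       # constant window: nothing to interpolate
    key = (a[l] + a[h]) / 2.0             # Eq. (1): guessed mid value
    mid = int(l + (h - l) * (key - a[l]) / (a[h] - a[l]))   # Eq. (2)
    return a[mid]

# On the Fig. 1 neighborhood the interpolated value coincides with the median:
print(interpolate_median([115, 119, 120, 123, 124, 125, 126, 127, 150]))  # 124
```

Because the probe position depends on the actual spread of values rather than only on their ranks, the returned pixel can differ from the plain median when the window is skewed by impulses.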
IV. EXPERIMENTAL RESULTS
To validate the proposed method,
experiments are conducted on natural
grayscale test images such as Lena, Barbara
and Goldhill, of size 512×512, at different
noise levels. Table 1 lists the PSNRs of the
six de-noising methods. The peak signal-to-
noise ratio (PSNR), in decibels (dB), is
defined as

PSNR = 10 log₁₀ (255² / MSE) dB        (3)

with

MSE = (1/mn) Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} [I(i, j) − K(i, j)]²        (4)

where I and K are the original image and the
de-noised image, respectively. Figure 2
shows the original test images used for the
experiments, and Figure 3 shows the Lena
image corrupted by 20% salt and pepper
noise.
Fig. 2. The original test images, 512×512 pixels: (a) Lena; (b) Barbara; (c) Goldhill.
Fig. 3. Lena image corrupted by salt & pepper noise (20%).
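Equations (3)-(4) translate directly into code. This hypothetical Python helper computes the PSNR between an original image I and a de-noised image K, both given as equally-sized 2-D lists of 8-bit values:

```python
import math

def psnr(I, K):
    """PSNR in dB between two equally-sized 8-bit grayscale images, Eqs. (3)-(4)."""
    m, n = len(I), len(I[0])
    mse = sum((I[i][j] - K[i][j]) ** 2
              for i in range(m) for j in range(n)) / (m * n)
    return float('inf') if mse == 0 else 10.0 * math.log10(255.0 ** 2 / mse)

# A single off-by-one pixel in a 2x2 image gives MSE = 0.25:
print(round(psnr([[10, 10], [10, 10]], [[10, 10], [10, 11]]), 2))  # about 54.15
```

Identical images give an MSE of zero, which is returned here as infinite PSNR rather than raising a division error.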
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0407-4
Table 1. PSNR (dB) performance of different algorithms for the Lena image corrupted with salt and pepper noise

Algorithm        | 10%   | 20%   | 30%
MF (3×3)         | 31.19 | 28.48 | 25.45
MF (5×5)         | 29.45 | 28.91 | 28.43
MMEM [8]         | 30.28 | 29.63 | 29.05
Florencio's [9]  | 33.69 | 32.20 | 30.95
AMF (5×5) [10]   | 30.11 | 28.72 | 27.84
IMF (proposed)   | 33.86 | 30.59 | 25.75
V. CONCLUSION
In this paper, the proposed algorithm,
called the Interpolate Median filter, employs
interpolated search in determining the
desired central pixel value. Interpolate
median filtering is simple and easy to
implement for image de-noising. The
simulation results show that the proposed
method performs significantly better than
many other existing methods.
REFERENCES
[1] R. Boyle and R. Thomas, Computer Vision: A
First Course, Blackwell Scientific Publications,
1988, pp. 32-34.
[2] E. Davies, Machine Vision: Theory, Algorithms
and Practicalities, Academic Press, 1990, Chap. 3.
[3] I. Pitas and A. N. Venetsanopoulos, "Order
statistics in digital image processing," Proc.
IEEE, vol. 80, no. 12, pp. 1893-1921, Dec. 1992.
[4] D. R. K. Brownrigg, "The weighted median
filter," Commun. ACM, vol. 27, no. 8, pp. 807-
818, Aug. 1984.
[5] H. Hwang and R. A. Haddad, "Adaptive
median filters: New algorithms and results," IEEE
Trans. Image Process., vol. 4, no. 4, pp. 499-502,
Apr. 1995.
[6] A. Bovik, Handbook of Image & Video
Processing, 1st Ed. New York: Academic, 2000.
[7] http://homepages.inf.ed.ac.uk
[8] W. Y. Han and J. C. Lin, "Minimum-maximum
exclusive mean (MMEM) filter to remove impulse
noise from highly corrupted images," Electron.
Lett., vol. 33, no. 2, pp. 124-125, 1997.
[9] T. Sun and Y. Neuvo, "Detail-preserving
median based filters in image processing," Pattern
Recognit. Lett., vol. 15, no. 4, pp. 341-347, Apr.
1994.
[10] A. Sawant, H. Zeman, D. Muratore, S. Samant,
and F. DiBianka, "An adaptive median filter
algorithm to remove impulse noise in X-ray and
CT images and speckle in ultrasound images,"
Proc. SPIE, vol. 3661, pp. 1263-1274, Feb. 1999.
Content Based Image Retrieval System for Medical
Images
Prof. K. Narayanan¹, Shaista Khan²
¹ Asst. Professor, Fr. Agnel College of Engg., University of Mumbai, India, [email protected]
² Fr. Agnel College of Engg., University of Mumbai, India, [email protected]
Abstract: The rapid development of technologies and the
steadily growing amounts of digital information highlight the
need for systems to access that information. Content-based
image indexing and retrieval has been an emerging research
area over the last few decades. This project approaches content
based image retrieval using low level features such as color,
shape and texture to investigate samples of blood cells through
their images, to aid in diagnosing disease by identifying similar
cases in a medical database. Medical images are classified in
terms of diseases, and for a given query image the relevant
images are retrieved along with the classification of the disease.
The histograms of the red, green, and blue color components
are analyzed. Wavelet decomposition is used to analyze
texture. In addition, morphological operations such as opening
and closing are applied to analyze object shape. Lastly, color,
texture, and shape are integrated in the retrieval in order to
increase the retrieval accuracy.
Keywords: Text Based Image Retrieval (TBIR), Content Based
Image Retrieval (CBIR)
I. INTRODUCTION
In today's world, knowledge is increasingly equated with
information, and information with data. In addition, the
rapid development of digital technologies and computing
hardware has made the digital acquisition of information
ever more popular and in demand.
Consequently, many digital images are being captured and
stored, such as medical images, architectural and engineering
images, advertising, design and fashion images, etc., and as a
result large image databases are being created and used in
many applications. The focus of our study in this work,
however, is on medical images. A large number of medical
images in digital format are generated by hospitals and
medical institutions every day, so how to make effective use
of this huge number of images becomes a challenging
problem.
The most common approach previously used for image
retrieval from a database was Text Based Image Retrieval
(TBIR). Later, retrieval based on image content, known as
Content Based Image Retrieval (CBIR), was introduced. In
TBIR, all medical images are labeled with text; the labels are
man-made and may differ between individuals for similar
images. Another drawback of TBIR is that images,
and medical images in particular, are difficult to describe
with text. These drawbacks of TBIR can be overcome by
CBIR.
In CBIR, features are extracted from the images using
different methods. The features include color, texture and
shape. The color histogram is the main method of representing
the color information of an image. A method called the
pyramid-structured wavelet transform is used for texture
classification. The number of oval objects in the query image
is calculated using a simple metric, and the images are
compared with one another based on the extracted features.
These three features are integrated into one method to improve
the retrieval efficiency; images which have similar features
should have similar content as well. The focus of this project
is on medical diagnosis, in which CBIR can be used to detect
disease by identifying similar cases in a medical database.
II. PROPOSED METHOD
Content-based Image Retrieval (CBIR) consists of
retrieving the most visually similar images to a given query
image from a database of images. CBIR from medical image
databases does not aim to replace the physician by predicting
the disease of a particular case but to assist him/her in
diagnosis. The visual characteristics of a disease carry
diagnostic information and oftentimes visually similar images
correspond to the same disease category. By consulting the
output of a CBIR system, the physician can gain more
confidence in his/her decision or even consider other
possibilities.
However, due to the existence of a large number of
medical image acquisition devices, medical images are
distinct and require a specific design of CBIR systems. The
goals of medical information systems have been defined as delivering the needed information at the right time and place to the right person, in order to improve the quality and efficiency of care processes. In the medical domain, images
from the same disease class as the query image must be
retrieved in order to help the doctor in diagnosis. The images
in the medical database are labeled by a specialist to ensure
that they are less subjective than those of the generic CBIR.
Figure 1 represents the framework of the CBIR system. This level of retrieval is based on the following primitive features:
Color
Texture
Shape or the spatial location of image elements.
A. COLOR ANALYSIS
Color is one of the most important features that make image recognition possible for humans. It is a property that depends on the reflection of light to the eye and the processing of that information in the brain. Color is used every day to differentiate objects, places, etc. Colors are defined in three-dimensional color spaces such as RGB (Red, Green, and Blue), HSV (Hue, Saturation, and Value) or HSB (Hue, Saturation, and Brightness). Most image formats, such as JPEG, BMP and GIF, use the RGB color space to store information.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0427-2
Figure 1: Proposed CBIR System
The RGB color space is defined as a unit cube with red, green, and blue axes. Thus, a vector with three coordinates represents a color in this space: black when all three coordinates are set to 0, and white when all three are set to 1.
1) Algorithm for Color Analysis:
i. Calculate the color histograms of the query image and of each image in the database, and store them as two vectors.
ii. Use these vectors to calculate the Bhattacharyya coefficient of the query image with each image in the database.
iii. The Bhattacharyya coefficient ranges from 0 to 1: it is 1 for completely similar images, and 0 indicates that there is no similarity between the two images.
In CBIR, the color histogram is the main method of representing the color information of an image. A color histogram is a type of bar graph where each bar represents a particular color of the color space being used. A histogram approximates a probability density function: it represents a discrete frequency distribution for a grouped dataset, whose distinct discrete values are grouped into a number of intervals [12]. An image histogram refers to the probability density function of the image intensities; for color images this is extended to capture the intensities of the three color channels.
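As an illustrative sketch (not code from the paper), a per-channel histogram of an 8-bit RGB image can be built with NumPy as follows; the bin count of 16 is an arbitrary choice:

```python
import numpy as np

def color_histogram(image, bins=16):
    # image: H x W x 3 uint8 RGB array; returns one normalized histogram
    # per channel (R, G, B), concatenated into a single feature vector
    hists = []
    for ch in range(3):
        h, _ = np.histogram(image[:, :, ch], bins=bins, range=(0, 256))
        hists.append(h / h.sum())   # each channel's bins sum to 1
    return np.concatenate(hists)

# a 4x4 all-black image: every pixel falls in the first bin of each channel
feature = color_histogram(np.zeros((4, 4, 3), dtype=np.uint8))
print(feature.shape)  # (48,)
```

The resulting vector is what gets compared between the query image and each database image.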
In this project, the color histograms of the query image and of the database images are calculated, stored in two vectors, and compared using the Bhattacharyya coefficient. The Bhattacharyya coefficient is an approximate measurement of the amount of overlap between two statistical samples, and can be used to determine the relative closeness of the two samples being considered:

    Bhattacharyya coefficient = Σ_{i=1}^{n} √(a_i · b_i)    (1)

where, considering the samples a and b, n is the number of partitions, and a_i, b_i are the numbers of members of samples a and b in the i-th partition. The Bhattacharyya coefficient ranges from 0 to 1, where 1 represents completely similar images and 0 indicates that there is no similarity between the two images [9].
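Equation (1) can be sketched directly; the example below assumes the two histograms have already been normalized so that each sums to 1:

```python
import numpy as np

def bhattacharyya(a, b):
    # Eq. (1): sum over partitions of sqrt(a_i * b_i); assumes both
    # histograms are normalized so that each sums to 1
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(np.sqrt(a * b)))

print(bhattacharyya([0.5, 0.5, 0.0], [0.5, 0.5, 0.0]))  # 1.0 (identical)
print(bhattacharyya([1.0, 0.0], [0.0, 1.0]))            # 0.0 (no overlap)
```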
B) TEXTURE ANALYSIS:
Texture is a measure of the variation of the intensity of a surface, quantifying properties such as smoothness, coarseness and regularity. The most popular representation of texture is the wavelet transform. A method called the pyramid-structured wavelet transform is used for texture classification. It recursively decomposes the sub-signals in the low-frequency channels, and is therefore best suited to signals whose information is concentrated in the lower frequency channels. Since, owing to the properties of natural images, most of the information lies in the lower sub-bands, the pyramid-structured wavelet transform is highly suitable. Using the pyramid-structured wavelet transform [6], the texture image is decomposed into four sub-images: the low-low, low-high, high-low and high-high sub-bands. At this point the energy level of each sub-band is calculated; this is the first-level decomposition. In this study, a fifth-level decomposition is obtained by repeatedly decomposing the low-low sub-band, based on the assumption that the energy of an image is concentrated in the low-low band. The wavelet function used is the Daubechies wavelet.
1) Algorithm for Texture Analysis:
i. Decompose the image using the pyramid-structured wavelet transform (up to the fifth-level decomposition).
ii. Build a histogram of the transformed image coefficients in each sub-band.
iii. Calculate a signature vector for each image by concatenating these histograms.
iv. Compute the L1-distance (equation 2) of the query image with all images in the database.
In order to characterize the image texture at different scales, the distribution of the wavelet coefficients in each sub-band of the decomposition is characterized by an image signature. An image signature is defined by building a histogram of the transformed image coefficients in each sub-band. As images are decomposed with a pyramidal scheme on Nl levels, they consist of 3·Nl + 1 sub-bands: there are 3 sub-bands of details at each scale l ≤ Nl (lHH, lHL and lLH) plus an approximation (NlLL); 3·Nl + 1 histograms are thus built. The signature is the vector formed by concatenating these histograms. The distance used to compare two images Im1 and Im2 is based on the L1-distance between their histograms, i.e. between their two signatures. The distance measure is given by

    d(Im1, Im2) = Σ_{t=1}^{3·Nl+1} λ_t · |H_t^1 − H_t^2|    (2)

with

    |H_t^1 − H_t^2| = Σ_{j=1}^{NB} |H_t^1(j) − H_t^2(j)|

where H_t^n(j) is the value of the j-th bin of the t-th normalized histogram of image n, NB is the number of bins, and (λ_t), t = 1 … 3·Nl + 1, is a set of tunable weights.
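The signature and distance computation can be sketched as follows. This is an illustration only: it uses a hand-rolled Haar transform (db1, the simplest Daubechies wavelet) rather than the paper's exact filter, uniform weights λ_t = 1, and per-band histogram ranges, and it assumes the image dimensions are divisible by 2^levels:

```python
import numpy as np

def haar_dwt2(x):
    # One level of the 2-D Haar transform: returns LL, LH, HL, HH sub-bands
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row-pair averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row-pair differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def signature(img, levels=5, bins=8):
    # Pyramid-structured decomposition: only the LL band is split further,
    # giving 3*levels detail bands plus one final approximation (3*Nl + 1)
    bands = []
    ll = np.asarray(img, dtype=float)
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(ll)
        bands += [lh, hl, hh]
    bands.append(ll)
    hists = []
    for b in bands:
        h, _ = np.histogram(b, bins=bins)   # per-band range; fixed edges in practice
        hists.append(h / h.sum())           # normalized histogram of the sub-band
    return np.concatenate(hists)

def l1_distance(sig1, sig2):
    # Eq. (2) with all tunable weights lambda_t set to 1
    return float(np.abs(sig1 - sig2).sum())
```

With levels = 5 and bins = 8 each signature has (3·5 + 1)·8 = 128 entries, and the distance of an image to itself is 0.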
C) SHAPE ANALYSIS
Shape may be defined as the characteristic surface
configuration of an object; an outline or contour. It permits an
object to be distinguished from its surroundings by its outline.
1) Algorithm for cell geometry analysis:
i. Convert the image to black and white and threshold it, in order to prepare for boundary tracing with bwboundaries.
ii. Remove the noise.
iii. Find the boundaries.
iv. Determine the number of oval objects in the query image and in all the images in the database.
Since the domain of this project is blood cell images, the number of round objects in each image needs to be determined. To achieve this, the image is converted to black and white in preparation for boundary tracing with the bwboundaries function in MATLAB.
Then a morphological operator such as opening is used to remove small connected objects that do not belong to the objects of interest. The area and perimeter of each object inside the image are then used to form a simple metric indicating the roundness of an object:

    Metric = 4π · area / perimeter²    (3)

This metric is equal to one only for a circle and is less than one for any other shape. The discrimination process can be controlled by setting an appropriate threshold; here the threshold is taken as 0.7.
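Equation (3) is a one-liner; in practice the area and perimeter values would come from the traced boundaries (e.g. MATLAB's regionprops), which are assumed to be given here:

```python
import math

def roundness(area, perimeter):
    # Eq. (3): equals 1 for a perfect circle and is < 1 for any other shape
    return 4 * math.pi * area / perimeter ** 2

def is_round(area, perimeter, threshold=0.7):
    return roundness(area, perimeter) >= threshold

# circle of radius 5: area = pi*r^2, perimeter = 2*pi*r -> metric close to 1.0
print(roundness(math.pi * 25.0, 10.0 * math.pi))
```

A unit square (area 1, perimeter 4) scores π/4 ≈ 0.785, which illustrates why the threshold must sit well below 1.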
Shape is an important feature because diseases are classified depending on the shape of the cell. For example, sickle-cell disease (sickle-cell anaemia) is an autosomal co-dominant genetic blood disorder characterized by red blood cells that assume an abnormal, rigid, sickle shape.
For cell geometry analysis, once the number of oval objects in the query image has been calculated, its value is compared with the number of oval objects in each image in the database, and the images closest to the query image are displayed.
The results of all three algorithms are then combined and sorted to give the best search result along with the disease.
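This combination step can be sketched as an intersection of the three result sets; the normalized-similarity dictionaries and the summed-score ranking below are illustrative assumptions, not details from the paper:

```python
def combine_results(color, texture, shape):
    # Each argument maps image id -> normalized similarity in [0, 1]
    # (a distance d can be converted with s = 1 - d / d_max).
    # Only images retrieved by all three features are kept, best first.
    common = set(color) & set(texture) & set(shape)
    return sorted(common,
                  key=lambda i: color[i] + texture[i] + shape[i],
                  reverse=True)

ranked = combine_results({"a": 0.9, "b": 0.4},
                         {"a": 0.8, "b": 0.7, "c": 0.2},
                         {"a": 0.7, "b": 0.9})
print(ranked)  # ['a', 'b'] -- 'c' is dropped: not retrieved by all three
```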
III. RESULT
In our classification system, the ground-truth database is made of 25 blood cell images with two different classifications, based on the type of disease: sickle cell disease and cancer.
Sickle cell disease is a hereditary blood disease, a form of anemia, resulting from a single amino-acid mutation of the red blood cells. People with sickle cell disease have red blood cells that contain mostly hemoglobin S, an abnormal type of hemoglobin. Sometimes these red blood cells become crescent ("sickle") shaped.
Cancer of the myeloid line of blood cells is characterized by the rapid growth of abnormal white blood cells.
In order to increase the accuracy of the retrieval result, the proposed system combines the results of the color, texture and cell-geometry analyses, so that only images common to all three feature extractions are shown as the final result. The advantages of this system are high accuracy and precision as well as the simplicity of the algorithm.
The query image is a blood cell sample image of a patient, used for diagnosis of disease. The search result shows the type of disease the patient is suffering from. If the patient is not suffering from either of these two diseases, the result states that the patient is not suffering from them.
Figure 2: Result showing whether the patient is suffering from a disease or not.
IV. CONCLUSION
The rapid growth in the sizes of image databases highlights the need to develop an effective and efficient retrieval system. This development started with retrieving images using textual annotation (TBIR); image retrieval based on content (CBIR) was introduced later and overcomes the drawbacks of TBIR.
Our focus is on medical diagnosis, in which CBIR can be used to aid diagnosis by identifying similar past cases in a database of medical images, mainly blood cell images.
These images are classified in terms of diseases, and images from the same disease class as the query image must be retrieved in order to help the doctor in diagnosis.
This work investigates approaches to CBIR based on low-level features: color, shape and texture analysis. In order to increase accuracy, the retrieval results of color, texture and shape are combined. To diagnose disease, 25 blood cell images, classified by type of disease (sickle cell disease and cancer), are considered in the database. For a given query image, the images retrieved from the database show which type of disease the patient is suffering from.
REFERENCES
[1] I. Ahmad and Taek-Sueng Jang, "Old fashion text-based image retrieval uses FCA," Proc. 2003 International Conference on Image Processing (ICIP 2003).
[2] Aliaa A. A. Youssif, A. A. Darwish and R. A. Mohamed, "Content based medical image retrieval based on pyramid structure wavelet," IJCSNS International Journal of Computer Science and Network Security, Vol. 10, No. 3, March 2010.
[3] A. Kak and C. Pavlopoulou, "Content-based image retrieval from large medical databases," Proc. First International Symposium on 3D Data Processing Visualization and Transmission, 2002.
[4] Yimo Tao and Xiang Sean Zhou, "An Adaptive, Knowledge-Driven Medical Image Search for Interactive Diffuse Parenchymal Lung Disease Quantification."
[5] Ivica Dimitrovski, Dejan Gorgevik and Suzana Loskovska, "Web-Based Medical Image Retrieval System."
[6] G. Quellec, M. Lamard, G. Cazuguel, B. Cochener and C. Roux, "Wavelet Optimization for Content-Based Image Retrieval in Medical Database."
[7] M. Sifuzzaman, M. R. Islam and M. Z. Ali, "Application of Wavelet Transform and its Advantage Compared to Fourier Transform," Journal of Physical Sciences, Vol. 13, 2009, pp. 121-134.
[8] S. H. Rezatofighi, A. Roodaki, R. A. Zoroofi, R. Sharifian and H. Soltanian-Zadeh, "Automatic Detection of Red Blood Cells in Hematological Images Using Polar Transformation and Run-length Matrix," Proc. ICSP 2008.
[9] Mohammad Reza Zare, Raja Noor Ainon and Woo Chaw Seng, "Content-based Image Retrieval for Blood Cells," Proc. 2009 Third Asia International Conference on Modelling & Simulation.
[10] H. B. Kekre and Dhirendra Mishra, "Digital Image Search & Retrieval using FFT Sectors of Color Images," International Journal on Computer Science and Engineering.
[11] Ch. Srinivasa Rao, S. Srinivas Kumar and B. N. Chatterji, "Content Based Image Retrieval using Contourlet Transform," ICGST-GVIP Journal, Vol. 7, Issue 3, November 2007.
[12] Tim Edwards, "Discrete Wavelet Transforms: Theory and Implementation."
[13] Woo Chaw Seng and Seyed Hadi Mirisaee, "A Content-Based Retrieval System for Blood Cells Images," Proc. 2009 International Conference on Future Computer and Communication.
[14] Zhang Lei, Lin Fuzong and Zhang Bo, "A CBIR Method Based on Color-Spatial Feature."
AUDIO+

Abhay Kumar, Research Scholar, Associated Electronics Research Foundation, Phase-II, Noida (U.P.)
Abstract--AUDIO+ is an electronic device that alters how a musical instrument or other audio source sounds; it can best be termed a “Digital Effect Processor”. Some effects subtly "colour" a sound, while others transform it dramatically. Effects can be used during live performances (typically with keyboard, electric guitar or bass) or in the studio, i.e. faithful reproduction of the sound signal is heard when AUDIO+ is used in the audio line.
AUDIO+ has a unique ability to modify sound signals and make them soothing to the human ear. The device is provided with a control panel of “Volume”, “Bass”, “Treble” and “Balance” to make it suitable for ears sensitive to high- and low-frequency sound. AUDIO+ is an easy-to-use portable device with a single signal input/output port and an internal battery power supply.
Keywords: Digital audio players, Digital signal processors, Mixed analog digital integrated circuits, Digital filters, Equalizers, Digital controls.
I. INTRODUCTION
AUDIO+ is all about the musical sound box, which can take raw mp3 or mpeg data and process it digitally. What is interesting is that it can sample and play many sound formats, with sampling rates from 8 kHz to 96 kHz, which is more than enough to play any sound format. It improves sound quality, with significant reduction of noise, and provides Dolby sound effects.
II. SYSTEM DESCRIPTION
AUDIO+ is built around a combination of ICs from Texas Instruments and National Semiconductor. The DRV134 and INA2134 from Texas Instruments are used to design a circuit which enhances sound performance.
This project is supported by Associated Electronics Research Foundation.
Mr. Abhay Kumar is with Associated Electronics Research Foundation, C-53, Phase-II, Noida (U.P.) as a Research Scholar(Phone No.-+919650109759, [email protected])
Very low distortion, low noise, and wide bandwidth provide superior performance in high quality audio applications.
The LM1036 from National Semiconductor is a DC-controlled tone (bass/treble), volume and balance circuit for stereo applications in car radio, TV and audio systems. An additional control input allows loudness compensation to be simply effected.
III. DRV134
The DRV134 is a differential output amplifier that converts a single-ended input to a balanced output pair. These balanced audio drivers consist of high-performance op amps with on-chip precision resistors. They are fully specified for high-performance audio applications, including low distortion (0.0005% at 1 kHz). Wide output voltage swing and high output drive capability allow use in a wide variety of demanding applications. They easily drive the large capacitive loads associated with long audio cables. Laser-trimmed matched resistors provide optimum output common-mode rejection (typically 68 dB), especially when compared to circuits implemented with op amps and discrete precision resistors. In addition, a high slew rate (15 V/μs) and fast settling time (2.5 μs to 0.01%) ensure excellent dynamic response. The DRV134 has excellent distortion characteristics; noise is below 0.003% throughout the audio frequency range under various output conditions. A gain of 6 dB is seen at the output of the differential amplifier.
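As a quick check of the 6 dB figure (a balanced driver delivers twice the single-ended input amplitude, and 20·log10(2) is approximately 6 dB):

```python
import math

def gain_db(voltage_ratio):
    # voltage gain expressed in decibels
    return 20 * math.log10(voltage_ratio)

print(round(gain_db(2.0), 2))  # 6.02
```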
Fig 1: Gain Vs Frequency graph for DRV134
IV. INA2134
The INA2134 is a dual differential line receiver consisting of high-performance op amps with on-chip precision resistors. It is fully specified for high-performance audio applications and has excellent AC specifications, including low distortion (0.0005% at 1 kHz) and a high slew rate (14 V/μs), assuring good dynamic response. In addition, wide output voltage swing and high output drive capability allow use in a wide variety of demanding applications. The dual version features completely independent circuitry for lowest crosstalk and freedom from interaction, even when overdriven or overloaded. The INA2134 on-chip resistors are laser trimmed for accurate gain and optimum common-mode rejection. It has unity gain.

Fig 2: Gain Vs Frequency graph for INA2134

V. LM1036

The LM1036 has four control inputs that provide control of the bass, treble, balance and volume functions through the application of DC voltages from a remote control system or, alternatively, from four potentiometers, which may be biased from a zener-regulated power supply. The LM1036 gives the user the ability to control the components of the sound with the help of multi-turn potentiometers; the graphs given below illustrate the different control operations. The LM1036 has the following features:

- Large volume control range, 75 dB typical
- Tone control, ±15 dB typical
- Channel separation, 75 dB typical
- Low distortion, 0.06% typical for an input level of 0.3 Vrms
- High signal-to-noise ratio, 80 dB typical for an input level of 0.3 Vrms

Fig 3: Volume control LM1036

Fig 4: Tone control LM1036

Fig 5: Balance control LM1036
VI. DRV 134 SIMULATION
Fig 8: DC analysis of DRV 134 (output, V, vs input voltage, V)
Fig 8 shows how the input to the DRV134 can be balanced and how the input line can be modulated.

Fig 6: TINA-TI simulation window for DRV134

The above result shows how a circuit around the DRV134 can be built in the TINA-TI software. The input to the circuit has to be in the range of 8 kHz to 96 kHz, and the input voltage should be 200 mVrms to 2 Vrms. The result can be judged by taking the voltages at VM1 and VM2. The output is balanced because the DRV134 acts as a balanced line driver.
Fig 7: Noise analysis of DRV 134 (output noise, V/√Hz, vs frequency, 1 Hz to 1 MHz)
The above figure shows the noise analysis of the DRV134 circuit. The noise decreases significantly as the frequency increases.
VII. SIMULATION OF DRV 134 WITH INA 137

Fig 9: TINA-TI simulation window for DRV134 and INA 137

The above diagram shows how the balanced output can be amplified and two channels can be made using the INA137 (gain = 1/2) and INA134 (gain = 1).
Fig 10: Noise analysis of DRV 134 with INA 137 (output noise, V/√Hz, vs frequency, 1 Hz to 1 MHz)

The above graph shows how the noise is significantly reduced after the introduction of the INA137 or INA134, i.e. how the input signal can be balanced and amplified to reduce the noise effect to the desired level.
Fig 11: DC analysis of DRV 134 with INA 2137 (voltage, V, vs input voltage, V)

Fig 11 shows that the output voltage ranges between 200 mVrms and 2 Vrms and the sampling frequency between 8 kHz and 96 kHz.
VIII. CONCLUSION
AUDIO+ maintains the originality of five major components of sound signals:
a. Pitch: the frequency of sound signals. Low frequencies (bass) make the sound powerful; midrange frequencies give sound its energy (human beings are most sensitive to midrange frequencies); high frequencies (treble) give sound its presence and lifelike quality, and let us feel that we are close to the sound source.
b. Timbre: the unique combination of fundamental frequency, harmonics, and overtones that gives each voice, musical instrument, and sound effect its unique colouring and character.
c. Harmonics: when an object vibrates it propagates sound waves of a certain frequency. This frequency, in turn, sets in motion frequency waves called harmonics.
d. Loudness: the loudness of a sound depends on the intensity of the sound stimulus.
e. Rhythm: a recurring sound that alternates between strong and weak elements.
In combination with all the components of sound listed above, AUDIO+ concentrates on the high frequencies with 6 dB overall gain and preserves the presence of the original reproduction of sound; it is thus most useful for high-quality audio systems and long-distance telephone calls.
IX. FUTURE WORK
AUDIO+ has great advantages for audio systems and audio communication, which opens an opportunity to use it in digital communication and VoIP phones.
X. REFERENCES
1) Software support and information about the digital speakers from Texas Instruments (www.TI.com)
2) Audio: www.ti.com/audio
3) Data converters: dataconverter.ti.com
4) DSP: dsp.ti.com
5) Digital control: www.ti.com/digitalcontrol
6) Clocks and timers: www.ti.com/clocks
7) Logic: logic.ti.com
8) Power management: power.ti.com
9) Microcontrollers: microcontroller.ti.com
10) Hardware support from Farnell India (http://in.farnell.com/)
11) Audio codec: www.ti.com/tlv320aic3101.pdf
12) Audio digital processor: www.ti.com/tas3103.pdf
13) Audio line driver: www.ti.com/drv134.pdf
14) Input amplifier: www.ti.com/ina2134.pdf
15) Voltage regulators: www.ti.com/tps62007.pdf, www.ti.com/tps74801.pdf, www.ti.com/tps74701.pdf
16) Control IC: www.national.com
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0502-1
Speaker Identification
Prerana & Aditi Choudhary
Abstract: Humans use voice recognition every day to distinguish between speakers and genders; other animals use voice recognition to differentiate among sound sources. Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This
speaker's voice to verify their identity and
control access to services such as voice
dialing, banking by telephone, telephone
shopping, database access services,
information services, voice mail, security
control for confidential information areas, and remote access to computers.
Speaker identification has been a wide and attractive area of research, and many works based on speech features have been proposed. In
a speaker recognition system there are three
important components; the feature extraction
component, the speaker models and the
matching algorithm.
The speech signal conveys information
about the identity of the speaker. The area of
speaker identification is concerned with
extracting the identity of the person
speaking the utterance. As speech interaction with computers becomes more pervasive in activities such as the telephone, financial transactions and information retrieval from speech databases, the utility of automatically identifying a speaker based solely on vocal characteristics grows.
FEATURES OF SPEECH
One might wonder what information is
needed to classify between genders or to
classify the speech of multiple speakers. In
fact, speech contains a great deal of
information that allows a listener to
determine both gender and speaker identity.
In addition, speech can reveal much about
the emotional state and age of the speaker.
For example, an Israeli engineer created a signal-processing lie detector system that outperforms the traditional polygraph test.
PITCH
Pitch is the most distinctive difference
between male and female speakers. A
person’s pitch originates in the vocal
cords/folds, and the rate at which the vocal
folds vibrate is the frequency of the pitch.
So, when the vocal folds oscillate at 300
times per second, they are said to be
producing a pitch of 300 Hz. When the air
passing through the vocal folds vibrates at
the frequency of the pitch, harmonics are
also created. The harmonics occur at integer multiples of the pitch and decrease in amplitude at a rate of 12 dB per octave (an octave being a doubling of frequency).
The reason pitch differs between sexes is the
size, mass, and tension of the laryngeal tract
which includes the vocal folds and the
glottis (the spaces between and behind the
vocal folds). Just before puberty, the
fundamental frequency, or pitch, of the
human voice is about 250 Hz, and the vocal
fold length is about 10.4 mm. After puberty
the human body grows to its full adult size,
changing the dimensions of the larynx area.
The vocal fold length in males increases to
about 15-25 mm while female’s vocal fold
length increases to about 13-15 mm. These
increases in size correlate to decreased
frequencies coming from the vocal folds. In
males, the average pitch falls between 60
and 120 Hz, and the range of a female’s
pitch can be found between 120 and 200 Hz.
Females have a higher pitch range than
males because the size of their larynx is
smaller. However, these are not the only differences between male and female speech patterns.
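These pitch ranges can be estimated directly from a waveform. The autocorrelation-based sketch below is one common method, not one taken from this paper:

```python
import numpy as np

def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0):
    # Find the autocorrelation peak within the plausible pitch-lag range
    frame = np.asarray(frame, dtype=float) - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin) + 1
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag   # fundamental frequency in Hz

fs = 8000
t = np.arange(800) / fs                    # a 100 ms frame
male_like = np.sin(2 * np.pi * 100 * t)    # 100 Hz tone, within the male range
print(round(estimate_pitch(male_like, fs)))  # 100
```

A 200 Hz tone, within the typical female range, would be resolved the same way, since its lag of 40 samples also falls inside the search window.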
FORMANT FREQUENCIES
When sound is emitted from the human
mouth, it passes through two different
systems before it takes its final form. The
first system is the pitch generator, and the
next system modulates the pitch harmonics
created by the first system. Scientists call the
first system the laryngeal tract and the
second system the supralaryngeal/vocal
tract. The supralaryngeal tract consists of
structures such as the oral cavity, nasal
cavity, velum, epiglottis, tongue, etc.
When air flows through the laryngeal tract,
the air vibrates at the pitch frequency
formed by the laryngeal tract as mentioned
above. Then the air flows through the
supralaryngeal tract, which begins to
reverberate at particular frequencies
determined by the diameter and length of the
cavities in the supralaryngeal tract. These
reverberations are called “resonances” or
“formant frequencies”. In speech,
resonances are called formants. So, those
harmonics of the pitch that are closest to the
formant frequencies of the vocal tract will
become amplified while the others are attenuated.
INTRODUCTION

Most signal processing involves processing a signal without concern for the quality or information content of that signal. In speech processing, speech is processed on a frame-by-frame basis, usually with no concern other than whether the frame is speech or silence.

[Figure 1(a): Speaker identification. Input speech passes through feature extraction, is compared for similarity against reference models (Speaker #1 ... Speaker #N), and maximum selection yields the identification result (Speaker ID).]
However, knowing how reliable the
information is in a frame of speech can be
very important and useful.
This is where usable speech detection and
extraction can play a very important role.
The usable speech frames can be defined as
frames of speech that contain higher
information content compared to unusable
frames with reference to a particular
application. We have been investigating a
speaker identification system to identify
usable speech frames. We then determine a
method for identifying those frames as
usable using a different approach.
PARADIGMS OF SPEECH RECOGNITION
1. Speaker Recognition - Recognize which
of the population of subjects spoke a given
utterance.
2. Speaker verification - Verify that a given speaker is who he claims to be. The system prompts the user, who claims to be the speaker, to provide an ID, and verifies the user by comparing the codebook of the given speech utterance with that enrolled for the claimed identity. If it matches within the set threshold, the identity claim of the user is accepted; otherwise it is rejected.
3. Speaker identification - Detects a particular speaker from a known population. The system prompts the user to provide a speech utterance, identifies the user by comparing the codebook of the utterance with those stored in the database, and lists the most likely speakers that could have given that speech utterance.
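Schematically, the identification step reduces to a nearest-model search; the sketch below uses a plain Euclidean distance as a stand-in for the codebook comparison:

```python
import numpy as np

def identify_speaker(query, reference_models):
    # reference_models: dict speaker_id -> feature vector (e.g. a codebook
    # centroid); return the id whose model is closest to the query features
    return min(reference_models,
               key=lambda sid: np.linalg.norm(query - reference_models[sid]))

models = {"alice": np.array([1.0, 0.0]), "bob": np.array([0.0, 1.0])}
print(identify_speaker(np.array([0.9, 0.1]), models))  # alice
```

Verification differs only in that the distance to a single claimed model is compared against a threshold rather than minimized over all speakers.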
At the highest level, all speaker recognition
systems contain two main modules (refer to
Figure 1): feature extraction and feature
matching. Feature extraction is the process
that extracts a small amount of data from the
voice signal that can later be used to
represent each speaker. Feature matching
involves the actual procedure to identify the
unknown speaker by comparing extracted
features from his/her voice input with the
ones from a set of known speakers.
[Figure 1(b): Speaker verification. Input speech passes through feature extraction, is compared for similarity against the reference model of the claimed speaker (#M), and a threshold decision yields the verification result (Accept/Reject).]

Figure 1. Basic structures of speaker recognition systems
Figure 1 shows the basic structures of speaker identification and verification systems. The system that we will describe is classified as a text-independent speaker identification system, since its task is to identify the person who speaks regardless of what is being said.
Concepts of speaker identification
systems:
Speaker identification systems may be
classified into two categories based on their
principle of operation.
Text-dependent systems, which make use of a fixed utterance for testing and training, and rely on specific features of the test utterance in order to effect a match.
Text-independent systems, which make use
of different utterances for test and training
and rely on long term statistical
characteristics of speech for making a
successful identification.
Text-dependent systems require less training
than text-independent systems and are
capable of producing good results with a
fraction of the test speech sample required
by a text-independent system. The pitch
period or fundamental frequency of speech
varies from one individual to another; pitch
frequency is high for female voices and low
for male voices. This suggests that pitch
might be a suitable parameter to distinguish
one speaker from another, or at least to
narrow down the set of probable matches.
The analysis of the frequency spectrum of
the test utterance provides valuable
information about speaker identification.
The spectrum contains both pitch harmonics
and vocal-tract resonant peaks, making it
possible to identify the speaker with a high
probability of being correct. The vocal-tract
filter parameters (filter coefficients) can also
be used to good effect for speaker
identification. This is due to the fact that
different speakers have different vocal-tract
configurations for the same utterance,
depending on their physical and emotional
conditions, as well as whether the speaker is
a native or non-native speaker
In any text-dependent speaker identification
system, an important decision is the choice
of test utterance. The source-filter model is
most accurate at representing voiced sounds,
such as the vowels. Vowels have a definite,
consistent pitch period. The vocal-tract
configuration for vowel-utterances exhibits a
clear formant (resonant) structure. The
frequency spectrum corresponding to vowel-
utterances therefore contains a wealth of
information that can be used for speaker identification.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
In general, it is difficult to
guarantee a hundred percent recognition
even with the best speaker identification
approaches.
Generally speaking, two parameters may be
used to describe the overall performance of
a speaker identification system.
A false acceptance: Which occurs when the
system incorrectly identifies an unregistered
individual as an enrolled one, or when one
registered individual is mistaken for another.
The FAR (False Acceptance Ratio) is the
ratio of the number of false acceptances to
the total number of trials. The value of FAR
can be reduced by setting a strict low
threshold.
A false rejection: Which occurs when the
system incorrectly refuses to identify an
individual who is registered with the system.
The FRR (False Rejection Ratio) is the ratio
of the number of false rejections to the total
number of trials. Setting the threshold to a
liberal high value can minimize the value of
FRR. The requirements for low FAR and
FRR are seen to be conflicting and both
parameters cannot be simultaneously
lowered. However, a low FAR is vital for
good speaker identification systems and
most systems are biased for good FAR
performance at the expense of FRR.
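The FAR/FRR trade-off described above can be sketched with a toy distance-based matcher. The distances and thresholds below are invented purely for illustration; acceptance here means the match distance falls at or below the threshold, consistent with the "strict low threshold" wording above.

```python
def far_frr(impostor_dists, genuine_dists, threshold):
    """A claimed identity is accepted when its match distance <= threshold.
    FAR counts accepted impostors; FRR counts rejected genuine speakers."""
    fa = sum(1 for d in impostor_dists if d <= threshold)
    fr = sum(1 for d in genuine_dists if d > threshold)
    return fa / len(impostor_dists), fr / len(genuine_dists)

impostors = [0.9, 0.7, 0.6, 0.45, 0.4]   # distances from impostor trials
genuine = [0.5, 0.3, 0.2, 0.1, 0.05]     # distances from genuine trials

# A strict (low) threshold suppresses FAR at the cost of FRR, and vice versa.
print(far_frr(impostors, genuine, 0.35))  # (0.0, 0.2)
print(far_frr(impostors, genuine, 0.55))  # (0.4, 0.0)
```

No single threshold drives both ratios to zero, which is the conflict the text describes.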
APPROACHES TO SPEECH
RECOGNITION
1. The Acoustic Phonetic approach
2. The Pattern Recognition approach
3. The Artificial Intelligence approach
A. The Acoustic Phonetic Approach
The acoustic phonetic approach is based
upon the theory of acoustic phonetics, which
postulates that there exists a finite set of
distinctive phonetic units in spoken
language and that these phonetic units are
broadly characterized by a set of properties
that can be seen in the speech signal, or its
spectrum, over time. Even though the
acoustic properties of phonetic units are
highly variable, both with the speaker and
with the neighboring phonetic units, it is
assumed that the rules governing the
variability are straightforward and can
readily be learned and applied in practical
situations. Hence the first step in this
approach is called the segmentation and labeling
phase. It involves segmenting the speech
signal into discrete (in time) regions where
the acoustic properties of the signal are
representative of one of several phonetic
units or classes, and then attaching one or
more phonetic labels to each segmented
region according to its acoustic properties.
For speech recognition, a second step is
required. This second step attempts to
determine a valid word (or a string of words)
from the sequence of phonetic labels
produced in the first step, which is
consistent with the constraints of the speech
recognition task.
B. The Pattern Recognition Approach
The Pattern Recognition approach to speech
is basically one in which the speech patterns
are used directly without explicit feature
determination (in the acoustic – phonetic
sense) and segmentation. As in most pattern
recognition approaches, the method has two
steps – namely, training of speech patterns,
and recognition of patterns via pattern
comparison. Speech is brought into the system
via a training procedure. The concept is that
if enough versions of a pattern to be
recognized (be it a sound, a word, a phrase,
etc.) are included in the training set provided to
the algorithm, the training procedure should
be able to adequately characterize the
acoustic properties of the pattern (with no
regard for, or knowledge of, any other pattern
presented to the training procedure). This
type of characterization of speech via
training is called pattern classification.
Here the machine learns which acoustic
properties of the speech class are reliable
and repeatable across all training tokens of
the pattern. The core of this method is the
pattern-comparison stage, in which the
unknown speech is compared with each possible
pattern learned in the training phase and
classified according to the accuracy of the match.
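The pattern-comparison stage can be sketched with dynamic time warping (DTW), a classic choice for template matching; the paper does not name a specific comparison algorithm, so this is one plausible instance. The feature sequences below are toy one-dimensional stand-ins for real spectral features.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences:
    the minimum accumulated mismatch over all monotone alignments."""
    INF = float("inf")
    n, m = len(a), len(b)
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

def classify(unknown, templates):
    """Return the label of the training template closest to the unknown."""
    return min(templates, key=lambda label: dtw_distance(unknown, templates[label]))

templates = {"yes": [1, 3, 4, 3, 1], "no": [4, 2, 1, 2, 4]}
print(classify([1, 1, 3, 4, 4, 3, 1], templates))  # yes
```

The stretched test pattern still matches the "yes" template because the warping absorbs the tempo change, which is why the approach is robust to speaking-rate variation.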
Advantages of Pattern Recognition
Approach
• Simplicity of use. The method is relatively
easy to understand. It is rich in mathematical
and communication theory justification for
individual procedures used in training and
decoding. It is widely used and best
understood.
• Robustness and invariance to different
speech vocabularies, users, feature sets,
pattern-comparison algorithms and decision
rules. This property makes the algorithm
appropriate for a wide range of speech units,
word vocabularies, speaker populations,
background environments, transmission
conditions etc.
• Proven high performance. The pattern
recognition approach to speech recognition
consistently provides a high performance on
any task that is reasonable for technology
and provides a clear path for extending the
technology in a wide range of directions.
C. The Artificial Intelligence Approach
The artificial intelligence approach to
speech is a hybrid of the acoustic phonetic
approach and the pattern recognition
approach, exploiting ideas and
concepts of both methods. The artificial
intelligence approach attempts to mechanize
the recognition procedure according to the
way a person applies intelligence in
visualizing, analyzing and finally making a
decision on the measured acoustic features.
In particular, among the techniques used
within this class of methods is the use of an
expert system for segmentation and labeling.
The use of neural networks could represent a
separate structural approach to speech
recognition or could be regarded
as an implementational architecture that may
be incorporated in any of the above classical
approaches.
FUTURE SCOPE
A range of future improvements is possible:
• Speech-independent speaker identification
• The number of users can be increased
• Identification of male, female, child and adult voices
Modeling of FBAR Resonator and Simulation using APLAC
Deepak Kumar, Navaid Z. Rizvi, Rajesh Mishra
Gautam Buddha University, Greater Noida
Abstract
This paper focuses on the analysis of the
Film Bulk Acoustic Wave Resonator
(FBAR), comprising a Zinc Oxide (ZnO)
piezoelectric thin film sandwiched
between two metal electrodes of gold (Au)
and located on a silicon substrate with a
low-stress silicon nitride (Si3N4)
supporting membrane, for high frequency
wireless applications. Film bulk
acoustic wave technology is a promising
technology for manufacturing miniaturized
high performance filters for the gigahertz
range.
Keywords: FBAR, Quartz crystal, APLAC.
Quartz Crystal
Crystal quartz is the most important
resonator material presently available. It
has been used for 50 years, and thus
growth, characterization, and fabrication
techniques are quite mature. Its low
coupling is usually not a disadvantage
when it is used for frequency control
applications. For reasonable values of
transducer area, the resistance falls in the
10-20 ohm range at 5 to 20 MHz. This
range is ideal for oscillator circuits. Its Q is
somewhat lower than that of ferroelectric
materials, but at lower frequencies it is
more than adequate; because the
stoichiometry of crystal quartz is simple
and its growth technology well
established, there are few crystal defects
and the attenuation has a frequency-squared
dependence. Only when very high
frequencies or wide inductive regions are
required do designers look beyond quartz.
So at higher frequencies, e.g. in the GHz
range, quartz cannot be used, and FBAR and SAW
devices, which are much smaller
in size, are used instead. Quartz also has the
disadvantage that it limits integration with
mechanical structures and integrated circuits
as compared to silicon, and furthermore the
cost of quartz wafers is significantly higher
than that of silicon. [1-7]
FBAR Devices
FBAR stands for Film Bulk Acoustic
Resonator. FBAR is a breakthrough
resonator technology being developed by
Agilent Technologies. The technology
can be used to create the essential
frequency-selective elements found in
modern wireless systems, including filters,
duplexers and resonators for oscillators.
[1-3]
Why FBAR
The rapid growth of wireless mobile
telecommunication systems leads to
increased demand for high frequency
oscillators, filters and duplexers capable of
operating in the GHz frequency band.
Conventionally, crystal resonators, microwave
ceramic resonators, transmission lines and
SAW devices have been used as high
frequency band devices. Although they
provide high performance at a reasonable
price, they are too large to
integrate in wireless applications. SAW
devices have better electrical performance and
smaller size, but relatively poor
temperature stability, high insertion
loss and limited power handling.
To cope with these limitations, FBAR
devices have been developed and can
easily replace these devices at higher
frequencies for wireless communication
applications. A thin film bulk acoustic
wave resonator consists basically of a thin
piezoelectric layer sandwiched between
two electrodes. In such a resonator a
mechanical wave is piezoelectrically
excited in response to an electric field
applied between the electrodes. The
propagation direction of this acoustic wave
is perpendicular to the surface of the
resonator. For a standing wave situation to
prevail, the acoustic energy has to be
reflected back at the boundaries of the
resonator. This reflectivity can be achieved
by two means, either an air-interface or an
acoustic mirror. Piezoelectric thin films
convert electrical energy into mechanical
energy and vice versa. Film Bulk Acoustic
Resonator (FBAR) consists of a
piezoelectric thin film sandwiched by two
metal layers. A resonance condition occurs
if the thickness of the piezoelectric thin film
(d) is equal to an integer multiple of half
the wavelength (λres). The fundamental
resonant frequency (Fres = Va/λres) is then
inversely proportional to the thickness of
the piezoelectric material used, and is
equal to Va/2d, where Va is the acoustic velocity at the resonant frequency (Fig. 1).
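As a quick numerical check of Fres = Va/2d, a short sketch follows. The longitudinal acoustic velocity used for ZnO (about 6350 m/s) is an assumed, typical literature value, not taken from this paper; the 1.2 µm thickness matches the simulation section.

```python
def fbar_fundamental(v_a, d):
    """Fundamental FBAR resonance: the film thickness d is half a
    wavelength, so f_res = Va / (2 * d)."""
    return v_a / (2.0 * d)

# Assumed ZnO longitudinal velocity (~6350 m/s) and 1.2 um film thickness.
f = fbar_fundamental(6350.0, 1.2e-6)
print(f / 1e9)  # about 2.65 GHz, close to the simulated resonance
```

The result lands near the 2.6 GHz resonance reported later, which is the agreement between the analytical and simulated values that the paper claims.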
Figure.1
Figure.2
Figure.3
A bulk-micromachined FBAR with TFE
(Thickness Field Excitation) uses a z-
directed electric field to generate a z-
propagating longitudinal or compressive
wave. [3-8]
In an LFE-FBAR, the applied electric
field is in the y-direction, and the shear
acoustic wave (excited by the lateral
electric field) propagates in the z-direction.
One-Dimensional Acoustic-Wave
Equation:
The fundamental wave equation governing
longitudinal acoustic-wave generation and
propagation in the one-dimensional case is

∂²T/∂z² = (m0/c)·∂²T/∂t² (1)

where T, S, c and m0 are the mechanical
stress, the mechanical strain, the stiffness
elastic constant and the mass density of the
material, respectively.
From Hooke's law,

T = c·S (2)

Solutions of the wave equation for the stress
contain (as common factors) e^(−j(ωt ± kz)),
where ω = 2πf is the wave frequency and
k is the propagation constant (wave number):

k = ω·(m0/c)^(1/2) = ω/Va (3)

where Va = (c/m0)^(1/2) is the acoustic
velocity. The acoustic impedance is

Z = −T/v (4)

where v = −Va·T/c is the particle velocity.
Hence, the characteristic acoustic impedance Z0 is

Z0 = c/Va = (m0·c)^(1/2) = Va·m0 (5)
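Equations (3)-(5) can be exercised numerically. The ZnO stiffness and density below are typical literature values assumed for illustration only, not parameters quoted by the paper.

```python
import math

def acoustic_velocity(c, rho):
    """Va = sqrt(c / rho), consistent with k = omega / Va in equation (3)."""
    return math.sqrt(c / rho)

def char_impedance(c, rho):
    """Z0 = sqrt(rho * c) = rho * Va, as in equation (5)."""
    return math.sqrt(rho * c)

# Assumed illustrative ZnO values: stiffness ~2.1e11 Pa, density ~5680 kg/m^3.
c33, rho = 2.1e11, 5680.0
va = acoustic_velocity(c33, rho)
z0 = char_impedance(c33, rho)
print(va, z0)  # velocity around 6.1 km/s, impedance around 3.5e7 kg/(m^2*s)
```

Both forms of Z0 agree, which is just the identity sqrt(rho*c) = rho*sqrt(c/rho) stated by equation (5).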
Three-port equivalent circuit model:
Consider that the two lateral dimensions (in
the x and y directions) of the uniform
resonator are very large compared with
the thickness, and that the metallic
electrodes are assumed to be very thin,
providing no mass loading on the opposite
surfaces normal to the z direction.
The following equivalent circuit models
are used widely for FBAR electrical
modeling.
1. Mason equivalent circuit model
2. Redwood equivalent circuit model
3. KLM equivalent circuit model
In this paper the Mason three-port
equivalent circuit model has been used.

Mason equivalent circuit model
Mason's model is the most widely accepted
model for analyzing the vertical structure
of piezoelectric materials. It is based
on a physical model and uses as its inputs
the dielectric constants, mass densities,
stiffness coefficients from the piezoelectric
stress tensor, and the thicknesses of the
physical layers. The model is used for
calculating the fundamental frequency of
the resonator as well as the effective kt²
of the device. The vibration characteristics of
the piezoelectric structure can be modeled
as a three-port network with one electrical
input port and two acoustic output ports,
owing to the coupling of electric potential
and mechanical stress that drives the
piezoelectric transducer.
The forces (F) and the particle velocities
(v) at the boundary surfaces of the
resonator are:
F1 = −A·T(−d/2) (6)
F2 = −A·T(d/2) (7)
v1 = v(−d/2) (8)
v2 = v(d/2) (9)

The minus (−) sign reflects the chosen axis
directions; v1 and v2 are the particle
velocities at the material surfaces, and
A, d and T are the area, thickness and
internal stress of the resonator, respectively.

k = ω·(ρm/c^D)^(1/2) = ω/Va (10)
Z0 = (ρm·c^D)^(1/2) = ρm·Va (11)
TF = −Z0·vF (12)
TB = Z0·vB (13)
Va = (c^D/ρm)^(1/2) (14)

where Va is the acoustic wave velocity.
Using the boundary conditions:

v(z) = [−v2·sin(k(z + d/2)) + v1·sin(k(d/2 − z))]/sin(kd) (15)
By evaluating the above equations, the Mason
model of a piezoelectric transducer
(resonator) is obtained, where C0, the
clamped (zero-strain) or static capacitance
of the transducer, is

C0 = ε^S·A/d (16)

and Zc, the acoustic impedance of a
transducer with area A, is

Zc = A·Z0 = A·(ρm·c^D)^(1/2) (17)

The resulting matrix equation can be
represented by the Mason model
equivalent circuit.
Figure.4
As shown in Fig.4, in this equivalent
circuit the electric port of the transformer
represents the conversion of electrical
energy to acoustic energy (or vice versa).
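As a rough numerical check of equations (16) and (17), a short sketch follows. The ZnO relative permittivity, stiffness and density are assumed illustrative values; the 45 µm² area and 1.2 µm thickness come from the simulation section.

```python
def clamped_capacitance(eps_s, area, d):
    """C0 = eps^S * A / d, equation (16)."""
    return eps_s * area / d

def acoustic_port_impedance(rho, c_d, area):
    """Zc = A * Z0 = A * sqrt(rho * c^D), equation (17)."""
    return area * (rho * c_d) ** 0.5

# Assumed values: ZnO relative permittivity ~8.8, stiffness ~2.1e11 Pa,
# density ~5680 kg/m^3; area 45 um^2 and thickness 1.2 um as in the paper.
eps0 = 8.854e-12
c0 = clamped_capacitance(8.8 * eps0, 45e-12, 1.2e-6)
zc = acoustic_port_impedance(5680.0, 2.1e11, 45e-12)
print(c0, zc)  # C0 on the order of a few femtofarads
```

The femtofarad-scale static capacitance is what makes such a small resonator practical at GHz frequencies.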
Why APLAC
With the APLAC circuit
simulation and design tool, any RF or
analog circuit can easily be simulated with
a wide range of analysis methods.
Moreover, optimization, tuning and a
Monte Carlo statistical feature (for design
yield) are available with every analysis
method. With APLAC it is possible to
simulate miniaturized structures and
complex systems easily, which matters because
device models developed for large devices
are inapplicable when nano-scale physical
phenomena come into play.
Simulation Results
First, a ZnO FBAR structure was simulated
in APLAC version 8.1. The FBAR has
top and bottom electrodes of Au and a
membrane layer of Si3N4 for support.
The resonance frequency was then calculated
analytically and compared with the
simulated result; the two agree
approximately.
Simulation of ZnO FBAR
The one-dimensional Mason model and the
basic transmission line theory were used here
to simulate the FBAR, which has
ZnO as the piezoelectric material, Au as
the top and bottom electrodes, and Si3N4
as the membrane material. The circuit
diagram is shown in Fig.5. For the top and
bottom electrodes and the membrane layer
the transmission line model was used, but
for the piezoelectric layer the
one-dimensional Mason model was used. The
results of the simulation are shown in Fig.6
and Fig.7: Fig.7 shows S21 (both
magnitude and phase) and Fig.6 shows S11
(both magnitude and phase). Analyzing
Fig.7, the resonance is clearly visible at
the expected frequency.
The values obtained and used in the
simulation are given in Table.1:

Table.1
Area (FBAR): 45 µm²
Thickness of ZnO: 1.2 µm
S21 Min: -61 dB
S11 Min: -0.3 dB
fp: 2.593 GHz
fs: 2.621 GHz
keff²: 0.026
Q: 15000
FOM: 390
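From the two resonances and Q in Table.1, keff² and FOM can be estimated. The formula below is one common approximation (keff² = (fp² − fs²)/fp², with fs taken as the lower and fp as the higher of the two tabulated frequencies); the paper's exact definition may differ, which would explain the gap between this estimate and the tabulated 0.026.

```python
def keff_squared(f_s, f_p):
    """One common approximation: keff^2 = (fp^2 - fs^2) / fp^2."""
    return (f_p**2 - f_s**2) / f_p**2

def figure_of_merit(keff2, q):
    """FOM = keff^2 * Q."""
    return keff2 * q

# Resonances and Q from Table.1, with fs < fp by convention.
f_s, f_p, q = 2.593e9, 2.621e9, 15000
k2 = keff_squared(f_s, f_p)
print(k2, figure_of_merit(k2, q))  # roughly 0.021 and 320 with this definition
```

The tabulated FOM of 390 is consistent with FOM = keff²·Q using the tabulated keff² of 0.026.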
Figure.5 Circuit Simulated in Aplac
Figure.6 .FBAR Resonator S (1, 1)
Figure.7 .FBAR Resonator S (2, 1)
Figure.8 Smith Chart showing S (2, 1) and
S (1,1)
The influence of different
piezoelectric films and electrode materials
on the characteristics of a thin film bulk
acoustic resonator (FBAR) was also analyzed.
The results confirm that the material
properties and thickness of the piezoelectric
film play a significant role in determining
the performance of the FBAR, and influence
characteristics such as the resonance
frequency, the bandwidth and the insertion
loss. Since the results demonstrate that the
FBAR characteristics are determined by the
thicknesses of each of the layers within the
acoustic wave path and by the resonance
area, the potential exists to tune the
characteristics of the FBAR by specifying
appropriate geometric parameters during
the FBAR design stage.
Effect of using different piezoelectric
material:
For example, using AlN as the piezoelectric
material increases the resonance frequency
from around 2.62 GHz to 4.7 GHz for the
same area and thickness. As depicted
below, the Q factor and FOM of the ZnO FBAR
are higher than those of the AlN FBAR, so
the ZnO device is the better FBAR in terms
of performance. The comparisons are shown
in Table.2. The results of the simulation
are shown in Fig.9, Fig.10 and Fig.11:
Fig.9 shows S21 (both magnitude and phase)
and Fig.10 shows S11 (both magnitude and
phase). Analyzing Fig.9, the resonance is
clearly visible at the expected frequency.
Table.2
[Plot data for Figures 6-8: ZnO FBAR, area 45 µm², d = 1.2 µm, simulated in APLAC 8.10 over 1.5-3.0 GHz; magnitude (dB) and phase of S(1,1) and S(2,1), plus a Smith chart of Im(S(1,1)) and Im(S(2,1)).]
Figure.9 AlN FBAR Resonator S21
Figure.10 AlN FBAR Resonator S11
Figure.11 Smith Chart showing S (2, 1)
and S(1,1)
Conclusion
The results show that the resonant frequency
of the FBAR depends upon the particular
choice of piezoelectric material. It was also
demonstrated that the FBAR performance
is influenced by the physical dimensions
of the device, including the thicknesses of
the piezoelectric film, electrodes and
membrane layer, and by the resonance area
size. It is possible to calculate the effective
coupling coefficient, Q factor and figure of
merit, and in this way to specify
suitable parameter values which
optimize the design of the FBAR and
which can be used in designing FBAR
devices that will operate within a specified
frequency range.
References
(1) K.M. Lakin, G.R. Kline and K.T. McCarron, "High-Q microwave acoustic resonators and filters," IEEE Transactions on Microwave Theory and Techniques, vol. 41.
(2) S.V. Krishnaswamy, J. Rosenbaum, S. Horwitz, C. Vale and R.A. Moore, "Film bulk acoustic wave resonator technology," Proceedings of the IEEE Ultrasonics Symposium, Honolulu, HI, USA, 1990.
(3) P.J. Yoon, G.W., "Fabrication of ZnO-based film bulk acoustic resonator devices using W/SiO2 multilayer reflector," Electronics Letters, vol. 36 (16).
(4) K.M. Lakin and J.S. Wang, "UHF composite bulk wave resonator," Ultrasonics Symposium, 1990.
(5) W.P. Mason, Physical Acoustics: Principles and Methods, Vol. 1A, Academic Press, New York.
(6) G.G. Fattinger, J. Kaitila, R. Aigner and W. Nessler, "Single-to-balanced filters for mobile phones using coupled resonator BAW technology," IEEE International Ultrasonics, Ferroelectrics and Frequency Control Symposium, 2004.
(7) K.M. Lakin, "Thin film resonator technologies," IEEE Trans. UFFC, vol. 52, pp. 707-716, May 2005.
(8) F. Constantinescu, M. Nitescu and A.G. Gheorghe, "New circuit models for power BAW resonators," in Proc. ICCSC, Shanghai, China, pp. 176-179, 2008.
[Plot data for Figures 9-11: AlN FBAR, area 45 µm², d = 1.2 µm, simulated in APLAC 8.10 over 3.5-5.0 GHz; magnitude (dB) and phase of S(2,1) and S(1,1), plus a Smith chart of Im(S(1,1)) and Im(S(2,1)).]
Role of Speech Scrambling and Encryption in Secure Voice Communication
Himanshu Gupta Faculty Member, Amity Institute of Information
Technology, Amity University Campus,
Sector – 125, Noida (Uttar Pradesh), India.
E-mail: [email protected]
Prof. (Dr.) Vinod Kumar Sharma
Professor & Dean, Faculty of Technology,
Gurukula Kangri Vishwavidyalaya,
Haridwar, India
E-mail: [email protected]
Abstract— Security of speech is a challenging
issue in voice communications today, requiring
speech scrambling and encryption techniques. With
the rapid development of information technology, the
demand for secure transmission of voice over
wireless communication channels is increasing day
by day. The conventional methods of voice
communication cannot provide adequate security
against intruders, and voice data may be accessed by
unauthorized users for malicious purposes.
Therefore, it is necessary to apply effective
scrambling and encryption techniques to enhance
voice security; speech scrambling and
encryption can provide sufficient
security over wireless media. In this research
paper, various effective speech scrambling and
encryption techniques are proposed, in which the
original speech is inverted and encrypted with
different strong scrambling and encryption methods.
These techniques enhance the security of voice over
insecure communication channels to a large extent.

Keywords—Speech Scrambling; Speech
Encryption; Secure Voice; Communication
Channel.
I. INTRODUCTION
A secure voice communication is a process that
allows for the secure transmission of voice
communications between a sending and a
receiving node over wireless communication
channel. This process uses various scrambling and
encryption techniques which are capable of
inversion and encryption of speech in effective
manner.
When two entities are communicating with each
other, and they do not want a third party to listen
to their communication, then they want to pass on
their message in such a way that nobody else
could understand their message. This is known as
communicating in a secure manner or Secure
Communication.
Secure communication includes the means by which
people can share information with varying
degrees of certainty that third parties cannot know
what was said. Other than communication spoken
face to face, out of any possibility of
eavesdropping, it is
probably safe to say that no communication is
guaranteed secure in this sense, although practical
limitations such as legislation, resources, technical
issues such as interception, and the sheer volume
of communication are limiting factors to
surveillance.
II. BACKGROUND
The implementation of voice encryption dates
back to World War II when secure
communication was paramount to the US armed
forces. During that time, noise was simply added
to a voice signal to prevent enemies from listening
to the conversations. Noise was added by playing
a record of noise in synch with the voice signal
and when the voice signal reached the receiver,
the noise signal was subtracted out, leaving the
original voice signal. In order to subtract out the
noise, the receiver needed to have the exact same
noise signal, and the noise records were only made
in pairs; one for the transmitter and one for the
receiver. Having only two copies of records made
it impossible for the wrong receiver to decrypt the
signal. To implement the system, the army
contracted Bell Laboratories and they developed a
system called SIGSALY. With SIGSALY, ten
channels were used to sample the frequency
spectrum from 250 Hz to 3 kHz and two channels
were allocated to sample voice pitch and
background hiss. In the time of SIGSALY, the
transistor had not been developed and the digital
sampling was done by circuits using the model
2051 Thyratron vacuum tube. Each SIGSALY
terminal used 40 racks of equipment weighing 55
tons and filled a large room. This equipment
included radio transmitters and receivers and large
phonograph turntables. The voice was keyed to
two 16-inch vinyl phonograph records that
contained a Frequency Shift Keying (FSK) audio
tone. The records were played on large precise
turntables in synch with the voice transmission[1].
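The add-then-subtract scheme described above can be illustrated with a toy digital sketch. This is purely illustrative (SIGSALY itself was an analog system, and the sample values here are invented): both ends hold an identical noise record, the transmitter adds it sample by sample, and the receiver subtracts it to recover the voice.

```python
import random

def make_noise_record(length, seed):
    """Both ends hold an identical noise record (the paired discs)."""
    rng = random.Random(seed)
    return [rng.randint(-1000, 1000) for _ in range(length)]

voice = [12, -40, 533, -7, 250]                          # toy voice samples
noise = make_noise_record(len(voice), seed=42)

transmitted = [v + n for v, n in zip(voice, noise)]      # transmitter adds noise
recovered = [t - n for t, n in zip(transmitted, noise)]  # receiver subtracts it
print(recovered == voice)  # True
```

Without the matching noise record, the transmitted samples carry no usable structure, which is why making only two copies of each record was the system's security guarantee.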
From the introduction of voice encryption to
today, encryption techniques have evolved
drastically. Digital technology has effectively
replaced old analog methods of voice encryption
and by using complex algorithms; voice
encryption has become much more secure and
efficient. One relatively modern voice encryption
method is Sub-band coding. With Sub-band
Coding, the voice signal is split into multiple
frequency bands, using multiple bandpass filters
that cover specific frequency ranges of interest.
The output signals from the bandpass filters are
then lowpass translated to reduce the bandwidth,
which reduces the sampling rate. The lowpass
signals are then quantized and encoded using
special techniques like Pulse Code
Modulation (PCM). After the encoding stage, the
signals are multiplexed and sent out along the
communication network. When the signal reaches
the receiver, the inverse operations are applied to
the signal to get it back to its original state.
Motorola developed a voice encryption system
called Digital Voice Protection (DVP) as part of
their first generation of voice encryption
techniques. "DVP uses a self-synchronizing
encryption technique known as cipher feedback
(CFB). The basic DVP algorithm is capable of
2.36 x 1021
different "keys" based on a key length
of 32 bits." The extremely high amount of
possible keys associated with the early DVP
algorithm, makes the algorithm very robust and
gives the user a high level of security. As with any
voice encryption system, the encryption key is
required to decrypt the signal with a special
decryption algorithm[2].
III. OVERVIEW OF THE PROPOSED SPEECH
SCRAMBLING TECHNIQUE
Speech inversion is a very common method of
speech scrambling, probably because it is the
cheapest. It works by taking a
signal and turning it 'inside out', reversing the
signal around a pre-set frequency. Speech
inversion can be broken down into three types:
base-band inversion (also called 'phase
inversion'), variable-band inversion (or 'rolling
phase inversion') and split-band inversion. Images
are used to help clarify what the different
inversion systems do.
Fig 1: The non-scrambled sound wave
Base-band inversion inverts the signal around
a pre-set frequency that never changes, and for
that reason it offers essentially no security:
because the inverting frequency never changes,
running the signal through another inverter set
to the same frequency unscrambles it. Descrambling
base-band inversion is simple: take the scrambled
input and re-invert it around the same inversion
point used to scramble it.
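In the digital domain, a simple stand-in for base-band inversion is spectral inversion: multiplying alternate samples by −1 mirrors the spectrum about half the sampling rate, and applying the same fixed inversion twice restores the signal, exactly as in the descrambling just described. A minimal sketch (the tone and sampling rate are illustrative):

```python
import math

def invert_spectrum(samples):
    """Multiply alternate samples by -1: mirrors the spectrum about
    half the sampling rate (a digital stand-in for phase inversion)."""
    return [s if i % 2 == 0 else -s for i, s in enumerate(samples)]

fs = 8000
tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(64)]
scrambled = invert_spectrum(tone)         # the 1 kHz tone now sits at 3 kHz
descrambled = invert_spectrum(scrambled)  # the same fixed inversion undoes it
print(descrambled == tone)  # True
```

The fact that one fixed, public operation both scrambles and descrambles is precisely why base-band inversion provides no real security.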
Fig 2: Base Band Inversion of Sound wave
Variable-band inversion inverts the signal around
a constantly varying frequency, making
unauthorized decryption possible but unlikely.
Variable-band inversion can be identified by the
burst of modem noise at the beginning of the
transmission (a 1200 bps carrier) and the
repeated clicking sounds as the inverting
frequency changes. Descrambling variable-band
inversion would be a chore for the amateur
eavesdropper, as the inversion point changes
every fraction of a second; professionals, however,
would likely have little trouble extracting clear
speech.
Split-band inversion is another method for making
inversion more secure. It divides the signal into
two frequency bands and inverts them (usually
base-band) separately. Some split-band systems
provide enhanced security by randomly changing,
at given intervals, the frequency at which the
signal is split.
Fig 3: Split Base Band Inversion of Sound Wave
IV. OVERVIEW OF THE PROPOSED SPEECH
ENCRYPTION TECHNIQUE
Encryption is a much stronger method of
protecting speech communications than any form
of scrambling. Voice encryption works by
digitizing the conversation at the telephone and
applying a cryptographic technique to the
resulting bit-stream. In order to decrypt the
speech, the correct encryption method and key
must be used [3]. For speech or voice encryption,
any one of the following encryption
methods can be used.
(A) Hardware Based Encryption Systems
Hard encryption systems are voice encryption
schemes that utilize hardware to encrypt
conversations. Hardware encryption devices are useful
because they do not need a computer to work
(allowing them to be built into things like radios
and cellular phones), are usually more secure, and
are simpler to use. On the downside, hardware
encryption systems are very expensive and can be
hard to acquire.
(B) Software Based Encryption Systems
Soft encryption systems are exactly what they
sound like, software based encryption. While the
inconvenience of having to use a computer is the
primary drawback to soft voice encryption, most
of the available programs use good crypto and are
free.
(C) Digital Voice Protection
Digital Voice Protection (DVP) is a proprietary
speech encryption technique used by Motorola for
their higher-end secure communications products.
DVP is considered to be very secure.
(D) PGPFone
PGPfone is another offering from Pretty Good
Privacy Inc., a secure voice program for the PC.
The interface is pleasantly intuitive, and there are
options for different encoders and decoders (for
either cellphone or landline use). PGPfone offers
a selection of encryption schemes: 128 bit CAST
key (a DES-like crypto system), 168 bit Triple-
DES key (estimated key strength is 112 bits) or
192 bit Blowfish key (unknown estimated key
strength).
(E) Nautilus
Nautilus is a free secure communications
program. It lacks many of the features of other
communications programs, and its interface is
best described as user-hostile. Unlike most other
voice encryption programs, Nautilus uses a
proprietary algorithm with a session key negotiated via the
Diffie-Hellman key exchange.
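Diffie-Hellman key negotiation, as cited above for Nautilus, lets two parties derive a shared secret without ever transmitting it. The sketch below uses a small demonstration prime for readability; it is an illustrative assumption, not Nautilus's actual group parameters, and real deployments use much larger standardized groups.

```python
import secrets

# Public parameters: prime modulus p and generator g.
# p = 2**32 - 5 is prime but far too small for real use (demo only).
p = 0xFFFFFFFB
g = 5

# Each party picks a private exponent and publishes g^x mod p.
a = secrets.randbelow(p - 2) + 1    # Alice's private value
b = secrets.randbelow(p - 2) + 1    # Bob's private value
A = pow(g, a, p)                    # sent over the insecure channel
B = pow(g, b, p)                    # sent over the insecure channel

# Both sides compute the same shared secret from the other's public value.
shared_alice = pow(B, a, p)
shared_bob = pow(A, b, p)
assert shared_alice == shared_bob
```

An eavesdropper who sees only `A`, `B`, `p`, and `g` must solve the discrete logarithm problem to recover the session key.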
(F) Speak Freely
Speak Freely is a versatile, simple voice
encryption system. Speak Freely offers a selection
of voice encryption techniques (IDEA or DES).
Speak Freely also permits conferencing and
contains several other useful functions. Unlike
most voice encryption platforms, Speak Freely
includes options that allow it to connect to other
encrypting and non-encrypting internet
telephones.
(G) SEU-8201 Cipher system
The SEU-8201 is a high-security voice ciphering
system intended mainly for authorities,
governmental agencies, police, and military or
paramilitary users. Its ciphering algorithm is a new
approach that provides the high security
required by such user groups. From a practical standpoint,
it is not susceptible to attack by eavesdroppers
using current cryptanalytic methods [4].
Fig 4: SEU-8201 Voice Encryption System
V. CONCLUSION
Speech scrambling and encryption are important techniques for voice security and play a significant role in the field of voice communication. They enhance the security of voice communication through a large number of complex operations that convert the original sound wave into a scrambled format, which is very difficult for any unauthorized third party to convert back to the original. The advantage of speech scrambling and encryption is that even if the transmitted wave is intercepted by an intruder, the confidentiality of the original wave can still be maintained. The study of speech scrambling and encryption techniques aims to enhance the potential of upcoming communication technologies and their implications for defense and government users. The implementation of voice scrambling and encryption is a strong and positive move toward defining a standard for secure voice communication. However, as the amount of confidential voice traffic over insecure wireless channels increases, speech scrambling and encryption must also be continually reviewed from a security perspective.
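The scrambling side of the comparison drawn in the conclusion, rearranging the wave itself rather than enciphering a bit-stream, can be illustrated with a keyed time-segment permutation. The fixed frame size and seeded shuffle below are illustrative assumptions; practical scramblers also manipulate the frequency domain.

```python
import random

def scramble(samples: list, frame: int, seed: int) -> list:
    """Permute fixed-size frames of the signal with a keyed shuffle."""
    frames = [samples[i:i + frame] for i in range(0, len(samples), frame)]
    order = list(range(len(frames)))
    random.Random(seed).shuffle(order)   # the seed acts as the key
    return [x for idx in order for x in frames[idx]]

def descramble(samples: list, frame: int, seed: int) -> list:
    """Invert the permutation by regenerating the same keyed order."""
    frames = [samples[i:i + frame] for i in range(0, len(samples), frame)]
    order = list(range(len(frames)))
    random.Random(seed).shuffle(order)
    out = [None] * len(frames)
    for pos, idx in enumerate(order):
        out[idx] = frames[pos]           # frame at pos came from index idx
    return [x for f in out for x in f]

signal = list(range(12))                 # toy audio samples
mixed = scramble(signal, frame=3, seed=7)
assert descramble(mixed, frame=3, seed=7) == signal
```

Unlike encryption, such a scrambled wave still contains the original segments, which is why the conclusion treats encryption of the digitized stream as the stronger protection.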
VI. REFERENCES
1. "SIGSALY," http://history.sandiego.edu/gen/recording/sigsaly.html
2. Owens, F. J. (1993). Signal Processing of Speech. Houndmills: MacMillan Press. ISBN 0333519221.
3. http://seussbeta.tripod.com/crypt.html#SCRAMBLE
4. SEU-8201 product page, http://vhf-encryption.at-communication.com/en/secure/seu_8201.html
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011