
Adaptive Filtering: Method of Steepest Descent · 2000-02-20



Adaptive Filtering: Method of Steepest Descent

Steepest descent is an old, deterministic method, which is the basis for stochastic gradient-based methods.

It is a feedback approach to finding the minimum of the error performance surface:
- the error surface must be known
- the adaptive approach converges to the optimal solution w_0 = R^{-1} p without inverting a matrix


• x(n) are the WSS input samples
• d(n) is the WSS desired output
• d̂(n) is the estimate of the desired signal, given by

d̂(n) = w^H(n) x(n)

where x(n) = [x(n), x(n−1), …, x(n−M+1)]^T and w(n) = [w_0(n), w_1(n), …, w_{M−1}(n)]^T is the filter weight vector at time n.


Then

e(n) = d(n) − d̂(n) = d(n) − w^H(n) x(n)

Thus the MSE at time n is

J(n) = E|e(n)|^2 = σ_d^2 − w^H(n) p − p^H w(n) + w^H(n) R w(n)

where
σ_d^2 — variance of the desired signal
p — cross-correlation between x(n) and d(n)
R — correlation matrix of x(n)

When w(n) is set to the (optimal) Wiener solution,

w(n) = w_0 = R^{-1} p

and

J(n) = J_min = σ_d^2 − p^H w_0


Hence, in order to iteratively find w_0, we use the method of steepest descent. To illustrate the concept, let M = 2; in the 2-D space of w(n), the MSE forms a bowl-shaped function.

Thus, if we are at a specific point in the bowl and imagine dropping a marble, it would reach the minimum by following the path of steepest descent.

[Figure: bowl-shaped surface J(w) over (w_1, w_2) and its contour plot; the negative gradient −∇J, with components −∂J/∂w_1 and −∂J/∂w_2, points toward the minimum w_0.]


Hence the direction in which we change the filter weight vector is −∇J(n), i.e.

w(n+1) = w(n) + (1/2) μ [−∇J(n)]

or, since ∇J(n) = −2p + 2R w(n),

w(n+1) = w(n) + μ [p − R w(n)]

for n = 0, 1, 2, …, where μ is called the step size and w(0) = 0 (in general).

Stability: Since the SD method uses feedback, the system can go unstable.
• Bounds on the step size guaranteeing stability can be determined with respect to the eigenvalues of R (Widrow, 1970).
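As a sketch (not part of the original notes), the recursion w(n+1) = w(n) + μ[p − R w(n)] can be run numerically. The R and p below are taken from the system-identification example later in these notes, and μ = 0.3 is an illustrative choice satisfying the stability bound.

```python
# Steepest descent on J(w): w(n+1) = w(n) + mu * (p - R w(n)).
# R and p are from the later 2-tap example; mu = 0.3 < 2/lambda_max = 1.11.

def sd_step(w, R, p, mu):
    # gradient direction: p - R w
    g = [p[i] - sum(R[i][j] * w[j] for j in range(len(w))) for i in range(len(w))]
    return [w[i] + mu * g[i] for i in range(len(w))]

R = [[1.0, 0.8], [0.8, 1.0]]
p = [0.8, 0.5]
w = [0.0, 0.0]               # w(0) = 0, as in the notes
for _ in range(500):
    w = sd_step(w, R, p, 0.3)

# converges to w0 = R^{-1} p without ever inverting R
print(w)
```

The iterate settles at w_0 = [1.111, −0.389], matching the direct solution R^{-1} p.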


Define the error vector for the tap weights as

c(n) = w(n) − w_0

Then, using p = R w_0 in the update,

w(n+1) = w(n) + μ [p − R w(n)]
       = w(n) + μ [R w_0 − R w(n)]
       = w(n) − μ R c(n)

and, subtracting w_0 from both sides,

w(n+1) − w_0 = w(n) − w_0 − μ R c(n)

or

c(n+1) = c(n) − μ R c(n) = [I − μR] c(n)


Using the unitary similarity transform

R = Q Λ Q^H

we have

c(n+1) = [I − μ Q Λ Q^H] c(n)

Premultiplying by Q^H gives

Q^H c(n+1) = Q^H [I − μ Q Λ Q^H] c(n)
           = [I − μΛ] Q^H c(n)

Define the transformed coefficients as

v(n) = Q^H c(n) = Q^H (w(n) − w_0)


Then

v(n+1) = [I − μΛ] v(n)

with initial condition

v(0) = Q^H (w(0) − w_0) = −Q^H w_0   if w(0) = 0

The kth term (mode) in v(n+1) is given by

v_k(n+1) = (1 − μ λ_k) v_k(n),   k = 1, 2, …, M

or, solving the recursion,

v_k(n) = (1 − μ λ_k)^n v_k(0)

Thus, for lim_{n→∞} v_k(n) = 0 we must have |1 − μ λ_k| < 1 for all k, or

0 < μ < 2/λ_max
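A quick numeric check (illustrative, not from the notes) of the per-mode condition against the bound 0 < μ < 2/λ_max, using the eigenvalues λ = 1.8, 0.2 that appear in the later two-tap example:

```python
# v_k(n) = (1 - mu*lam_k)**n * v_k(0) converges iff |1 - mu*lam_k| < 1 for all k.
lams = [1.8, 0.2]
mu_max = 2 / max(lams)                      # upper stability bound 2/lambda_max

for mu in (0.3, 1.2):                       # one stable, one unstable choice
    stable = all(abs(1 - mu * lam) < 1 for lam in lams)
    print(mu, mu < mu_max, stable)          # the two tests agree
```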


The kth mode has geometric decay

v_k(n) = (1 − μ λ_k)^n v_k(0)

We can characterize the rate of decay by finding the time constant τ_k, the time it takes to decay to e^{−1} of the initial value. Thus

v_k(τ_k) = (1 − μ λ_k)^{τ_k} v_k(0) = e^{−1} v_k(0)

which gives

τ_k = −1 / ln(1 − μ λ_k) ≈ 1/(μ λ_k)   for μ λ_k << 1

The overall rate of decay is bounded by

−1 / ln(1 − μ λ_max) ≤ τ ≤ −1 / ln(1 − μ λ_min)


Recall that

J(n) = J_min + (w(n) − w_0)^H R (w(n) − w_0)
     = J_min + (w(n) − w_0)^H Q Λ Q^H (w(n) − w_0)
     = J_min + v^H(n) Λ v(n)
     = J_min + Σ_{k=1}^{M} λ_k |v_k(n)|^2
     = J_min + Σ_{k=1}^{M} λ_k (1 − μ λ_k)^{2n} |v_k(0)|^2

Thus lim_{n→∞} J(n) = J_min.


Example: Consider a two-tap predictor.

Consider the effects of the following cases:

• Varying the eigenvalue spread χ(R) = λ_max/λ_min while keeping μ fixed.
• Varying μ while keeping the eigenvalue spread χ(R) fixed.

[Figures: SD weight trajectories and learning curves for the two-tap predictor under the cases above.]

Example: Consider the system identification problem.

[Figure: x(n) drives both an unknown system, producing d(n), and the adaptive filter w(n), producing d̂(n); the error is e(n) = d(n) − d̂(n).]

For M = 2, suppose

R_x = [ 1    0.8
        0.8  1   ],   p = [ 0.8
                            0.5 ]


From eigenanalysis we have

λ_1 = 1.8,   λ_2 = 0.2,   and   μ < 2/1.8

also

q_1 = (1/√2) [1, 1]^T,   q_2 = (1/√2) [1, −1]^T

and

Q = (1/√2) [ 1   1
             1  −1 ]

Also,

w_0 = R^{-1} p = [ 1.11
                  −0.389 ]

Thus v(n) = Q^H [w(n) − w_0]. Noting that

v(0) = −Q^H w_0 = [−0.51, −1.06]^T


and

v_1(n) = (1 − 1.8μ)^n (−0.51)
v_2(n) = (1 − 0.2μ)^n (−1.06)
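The numbers in this example are easy to verify numerically (a pure-Python check; the sign convention follows v(0) = −Q^H w_0):

```python
import math

R = [[1.0, 0.8], [0.8, 1.0]]
p = [0.8, 0.5]

# eigenvalues of [[1, r], [r, 1]] are 1 + r and 1 - r
lam1, lam2 = 1 + 0.8, 1 - 0.8                    # 1.8 and 0.2

# w0 = R^{-1} p via the explicit 2x2 inverse
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
w0 = [(R[1][1] * p[0] - R[0][1] * p[1]) / det,
      (R[0][0] * p[1] - R[1][0] * p[0]) / det]

# v(0) = -Q^H w0, with Q^H rows (1/sqrt(2))[1, 1] and (1/sqrt(2))[1, -1]
s = 1 / math.sqrt(2)
v0 = [-s * (w0[0] + w0[1]), -s * (w0[0] - w0[1])]

print(w0)   # about [1.111, -0.389]
print(v0)   # about [-0.51, -1.06]
```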


The Least Mean Square (LMS) Algorithm

The error performance surface used by the SD method is not always known a priori. We can instead use estimated values. The estimates are random variables, and this leads to a stochastic approach.

We will use the following instantaneous estimates:

R̂(n) = x(n) x^H(n)
p̂(n) = x(n) d*(n)


Recall the SD update

w(n+1) = w(n) + (1/2) μ [−∇J(n)]

where the gradient of the error surface at w(n) was shown to be

∇J(n) = −2p + 2R w(n)

Using the instantaneous estimates,

∇̂J(n) = −2 x(n) d*(n) + 2 x(n) x^H(n) w(n)
       = −2 x(n) [d(n) − w^H(n) x(n)]*
       = −2 x(n) [d(n) − d̂(n)]*
       = −2 x(n) e*(n)

where e*(n) is the complex conjugate of the estimation error.


Putting this into the update,

w(n+1) = w(n) + μ x(n) e*(n)

Thus the LMS algorithm belongs to the family of stochastic gradient algorithms.

The update is extremely simple; although the instantaneous estimates may have large variance, the LMS algorithm is recursive and effectively averages these estimates.

The simplicity and good performance of the LMS algorithm make it the benchmark against which other optimization algorithms are judged.
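A minimal real-valued LMS sketch (for real signals the conjugate is a no-op); the 2-tap system, step size, and run length below are illustrative assumptions, not values from the notes:

```python
import random

random.seed(0)

h = [0.6, -0.3]        # unknown 2-tap system to identify (illustrative)
M, mu = 2, 0.05

w = [0.0] * M
xbuf = [0.0] * M       # x(n) = [x(n), x(n-1)]
for n in range(5000):
    xbuf = [random.gauss(0, 1)] + xbuf[:-1]
    d = sum(h[i] * xbuf[i] for i in range(M))        # desired response
    e = d - sum(w[i] * xbuf[i] for i in range(M))    # e(n) = d(n) - w^T x(n)
    w = [w[i] + mu * xbuf[i] * e for i in range(M)]  # w(n+1) = w(n) + mu x(n) e(n)

print(w)   # approaches h
```

Despite the noisy per-sample gradient estimate, the recursion averages it out and the weights settle near h.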


The LMS algorithm can be analyzed by invoking the independence theory, which states:

1) The vectors x(1), x(2), …, x(n) are statistically independent.
2) x(n) is independent of d(1), d(2), …, d(n−1).
3) d(n) is statistically dependent on x(n), but independent of d(1), d(2), …, d(n−1).
4) x(n) and d(n) are mutually Gaussian.

The independence theorem is justified in some cases, e.g. beamforming, where we receive independent vector observations. In other cases it is not well justified, but it allows the analysis to proceed.


Using the independence theory we can show that w(n) converges to the optimal solution in the mean:

lim_{n→∞} E[w(n)] = w_0

To show this, evaluate the update

w(n+1) = w(n) + μ x(n) e*(n)
w(n+1) − w_0 = w(n) − w_0 + μ x(n) e*(n)

With c(n) = w(n) − w_0 and e*(n) = d*(n) − x^H(n) w(n),

c(n+1) = c(n) + μ x(n) [d*(n) − x^H(n) w(n)]
       = c(n) + μ x(n) d*(n) − μ x(n) x^H(n) [c(n) + w_0]
       = [I − μ x(n) x^H(n)] c(n) + μ x(n) [d*(n) − x^H(n) w_0]


Note that since w(n) is based on past inputs and desired responses, w(n) (and hence c(n)) is independent of x(n).

Thus

E[c(n+1)] = E{[I − μ x(n) x^H(n)] c(n)} + μ E{x(n) [d*(n) − x^H(n) w_0]}

The second expectation is zero (why? because E[x(n) d*(n)] = p and E[x(n) x^H(n)] w_0 = R w_0 = p), so

E[c(n+1)] = [I − μR] E[c(n)]

Using arguments similar to the SD case, we have

lim_{n→∞} E[c(n)] = 0   if 0 < μ < 2/λ_max

or equivalently

lim_{n→∞} E[w(n)] = w_0   if 0 < μ < 2/λ_max


Noting that

λ_max ≤ trace[R] = N r(0) = N σ_x^2

a more conservative bound is

0 < μ < 2/(N σ_x^2)

Also, convergence in the mean,

lim_{n→∞} E[w(n)] = w_0

is a weak condition that says nothing about the variance, which may even grow.

A stronger condition is convergence in the mean square, which says

lim_{n→∞} E|c(n)|^2 = constant


An equivalent condition is to show that

lim_{n→∞} J(n) = lim_{n→∞} E|e(n)|^2 = constant

Write e(n) as

e(n) = d(n) − d̂(n) = d(n) − w^H(n) x(n)
     = d(n) − w_0^H x(n) − c^H(n) x(n)
     = e_0(n) − c^H(n) x(n)

where e_0(n) = d(n) − w_0^H x(n). Thus

J(n) = E|e(n)|^2 = E[(e_0(n) − c^H(n) x(n))(e_0(n) − c^H(n) x(n))*]
     = J_min + E[c^H(n) x(n) x^H(n) c(n)]
     = J_min + J_ex(n)

where J_ex(n) = E[c^H(n) x(n) x^H(n) c(n)] is the excess MSE.


Since J_ex(n) is a scalar,

J_ex(n) = E[c^H(n) x(n) x^H(n) c(n)]
        = E{trace[c^H(n) x(n) x^H(n) c(n)]}
        = E{trace[x(n) x^H(n) c(n) c^H(n)]}
        = trace{E[x(n) x^H(n) c(n) c^H(n)]}

Invoking the independence theorem,

J_ex(n) = trace{E[x(n) x^H(n)] E[c(n) c^H(n)]} = trace[R K(n)]

where

K(n) = E[c(n) c^H(n)]


Thus

J(n) = J_min + J_ex(n) = J_min + trace[R K(n)]

Recall

R = Q Λ Q^H   or   Λ = Q^H R Q

Let

S(n) ≜ Q^H K(n) Q

where S(n) need not be diagonal. Then K(n) = Q S(n) Q^H and

J_ex(n) = trace[R K(n)]
        = trace[Q Λ Q^H Q S(n) Q^H]
        = trace[Q Λ S(n) Q^H]
        = trace[Λ S(n) Q^H Q]
        = trace[Λ S(n)]


Since Λ is diagonal,

J_ex(n) = trace[Λ S(n)] = Σ_{i=1}^{M} λ_i s_i(n)

where s_1(n), s_2(n), …, s_M(n) are the diagonal elements of S(n).

The recursion for K(n) can be modified to yield a recursion on S(n), which is

S(n+1) = (I − μΛ) S(n) (I − μΛ) + μ^2 J_min Λ

and for the diagonal elements,

s_i(n+1) = (1 − μ λ_i)^2 s_i(n) + μ^2 λ_i J_min,   i = 1, 2, …, M

Suppose J_ex(n) converges; then s_i(n+1) = s_i(n), and from the above

s_i(n) = (1 − μ λ_i)^2 s_i(n) + μ^2 λ_i J_min
⇒ s_i(n) [2 μ λ_i − μ^2 λ_i^2] = μ^2 λ_i J_min
⇒ s_i(n) = μ J_min / (2 − μ λ_i),   i = 1, 2, …, M


Utilizing

J_ex(n) = trace[Λ S(n)] = Σ_{i=1}^{M} λ_i s_i(n)

we see

lim_{n→∞} J_ex(n) = Σ_{i=1}^{M} μ λ_i J_min / (2 − μ λ_i)

The LMS misadjustment is defined as

M = lim_{n→∞} J_ex(n) / J_min = Σ_{i=1}^{M} μ λ_i / (2 − μ λ_i)

A misadjustment of 10% or less is generally considered acceptable.
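For example, the misadjustment formula can be evaluated directly; the eigenvalues reuse the earlier two-tap example and the step sizes are illustrative:

```python
# Misadjustment M = sum_i  mu*lam_i / (2 - mu*lam_i)
def misadjustment(mu, lams):
    return sum(mu * lam / (2 - mu * lam) for lam in lams)

lams = [1.8, 0.2]
for mu in (0.05, 0.1, 0.3):
    print(mu, misadjustment(mu, lams))
# small mu keeps the misadjustment under the ~10% guideline;
# larger mu trades steady-state accuracy for faster convergence
```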


Example: a one-tap predictor of an order-one AR process. Let

x(n) = −a x(n−1) + v(n)

and use a one-tap predictor. The weight update is

w(n+1) = w(n) + μ x(n−1) e(n)
       = w(n) + μ x(n−1) [x(n) − w(n) x(n−1)]

Note w_0 = −a. Consider two cases, with μ = 0.05:

a        σ_x^2
−0.99    0.93627
 0.99    0.995
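A simulation sketch of this predictor for the a = 0.99 case (the driving-noise variance is an assumption chosen so that σ_x^2 = σ_v^2/(1 − a^2) matches the tabulated 0.995; seed and run length are arbitrary):

```python
import math
import random

random.seed(1)

a, mu = 0.99, 0.05
sigma_v = math.sqrt(0.995 * (1 - a * a))   # makes sigma_x^2 ≈ 0.995

x_prev, w = 0.0, 0.0
for n in range(20000):
    x = -a * x_prev + random.gauss(0, sigma_v)   # x(n) = -a x(n-1) + v(n)
    e = x - w * x_prev                           # prediction error
    w += mu * x_prev * e                         # one-tap LMS update
    x_prev = x

print(w)   # fluctuates around w0 = -a = -0.99
```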

[Figures: simulated weight trajectories w(n) for the two cases.]

Consider the expected trajectory of w(n). Recall

w(n+1) = w(n) + μ x(n−1) e(n)
       = w(n) + μ x(n−1) [x(n) − w(n) x(n−1)]
       = [1 − μ x(n−1) x(n−1)] w(n) + μ x(n−1) x(n)

Since x(n) = −a x(n−1) + v(n),

w(n+1) = [1 − μ x(n−1) x(n−1)] w(n) + μ x(n−1) [−a x(n−1) + v(n)]
       = [1 − μ x^2(n−1)] w(n) − μ a x^2(n−1) + μ x(n−1) v(n)

Taking the expectation and invoking the independence theorem,

E[w(n+1)] = (1 − μ σ_x^2) E[w(n)] − μ σ_x^2 a


We can also derive a theoretical expression for J(n).

Note that the initial value of J(n) is

J(0) = σ_x^2

and the final value is

J(∞) = J_min + J_ex = σ_v^2 + μ λ_1 J_min / (2 − μ λ_1)

If μ is small,

J(∞) ≈ σ_v^2 + μ σ_x^2 σ_v^2 / 2 = σ_v^2 (1 + μ σ_x^2 / 2)

Also, the time constant is

τ_1 = −1 / [2 ln(1 − μ λ_1)] = −1 / [2 ln(1 − μ σ_x^2)] ≈ 1 / (2 μ σ_x^2)

and

J(n) = [σ_x^2 − σ_v^2 (1 + μ σ_x^2 / 2)] (1 − μ σ_x^2)^{2n} + σ_v^2 (1 + μ σ_x^2 / 2)


Example: Adaptive equalization

Goal: pass a known signal through an unknown channel, and invert the effects of the channel and noise on the signal.


The signal is a Bernoulli sequence

x_n = +1 with probability 1/2
      −1 with probability 1/2

The channel has a raised-cosine response

h_n = (1/2) [1 + cos(2π(n−2)/W)]   for n = 1, 2, 3
h_n = 0                            otherwise

Note that W controls the eigenvalue spread χ(R).

Also, the additive noise is ∼ N(0, 0.001).

Note that h_n is symmetric about n = 2 and thus introduces a delay of 2. We will use an M = 11 tap filter, which will be symmetric about n = 5 and introduce a delay of 5. Thus an overall delay of δ = 5 + 2 = 7 is added to the system.


Channel response and filter response: consider three values of W.

Note the step size is bounded by the W = 3.5 case:

μ ≤ 2 / [N r(0)] = 2 / [11 (1.3022)] = 0.14

Choose μ = 0.075 in all cases.
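A sketch of the full experiment for one eigenvalue-spread setting (W = 2.9); the run length and seed are illustrative, and for real data the LMS conjugates drop out:

```python
import math
import random

random.seed(2)

W, M, delay, mu = 2.9, 11, 7, 0.075

# raised-cosine channel h_n for n = 1, 2, 3
h = [0.5 * (1 + math.cos(2 * math.pi * (n - 2) / W)) for n in (1, 2, 3)]

w = [0.0] * M
xbuf = [0.0] * M              # equalizer input vector
sbuf = [0.0] * (delay + 1)    # past symbols, for the delayed reference
errs = []
for n in range(3000):
    s = random.choice((-1.0, 1.0))                   # Bernoulli +/-1 symbol
    sbuf = [s] + sbuf[:-1]
    # channel output sum_k h_k s(n-k) plus N(0, 0.001) noise
    u = sum(h[k] * sbuf[k + 1] for k in range(3)) + random.gauss(0, math.sqrt(0.001))
    xbuf = [u] + xbuf[:-1]
    d = sbuf[delay]                                  # reference s(n - 7)
    e = d - sum(w[i] * xbuf[i] for i in range(M))
    w = [w[i] + mu * xbuf[i] * e for i in range(M)]
    errs.append(e * e)

mse = sum(errs[-500:]) / 500   # steady-state squared-error estimate
print(mse)
```

The equalizer drives the squared error well below the initial symbol power, as in the learning curves of the original experiment.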

[Figures: equalizer learning curves and tap-weight solutions for the three values of W.]

Example: Directionality of the LMS algorithm

• The speed of convergence of the LMS algorithm is faster in certain directions in the weight space.
• If the convergence is in the appropriate direction, the convergence can be accelerated by increased eigenvalue spread.

Consider the deterministic signal

x(n) = A_1 cos(ω_1 n) + A_2 cos(ω_2 n)

with

R = (1/2) [ A_1^2 + A_2^2                      A_1^2 cos ω_1 + A_2^2 cos ω_2
            A_1^2 cos ω_1 + A_2^2 cos ω_2      A_1^2 + A_2^2 ]


which gives

λ_1 = (1/2) A_1^2 (1 + cos ω_1) + (1/2) A_2^2 (1 + cos ω_2)
λ_2 = (1/2) A_1^2 (1 − cos ω_1) + (1/2) A_2^2 (1 − cos ω_2)

and

q_1 = (1/√2) [1, 1]^T,   q_2 = (1/√2) [1, −1]^T

Consider two cases:

x_a(n) = cos(1.2 n) + 0.5 cos(0.1 n)    with χ(R) = 2.9
x_b(n) = cos(0.6 n) + 0.5 cos(0.23 n)   with χ(R) = 12.9

In each case let

p = λ_1 q_1 ⇒ R w_0 = λ_1 q_1 ⇒ w_0 = q_1
and
p = λ_2 q_2 ⇒ R w_0 = λ_2 q_2 ⇒ w_0 = q_2

Look at 200 iterations of the algorithm: first the minimum eigenfilter, w_0 = q_2 = (1/√2) [1, −1]^T, then the maximum eigenfilter, w_0 = q_1 = (1/√2) [1, 1]^T.

[Figures: LMS weight trajectories for the minimum and maximum eigenfilters in the two cases.]

Normalized LMS Algorithm

In the standard LMS algorithm the correction is proportional to μ x(n) e*(n):

w(n+1) = w(n) + μ x(n) e*(n)

If x(n) is large, the update suffers from gradient noise amplification. The normalized LMS algorithm seeks to avoid gradient noise amplification:

• The step size is made time-varying, μ(n), and optimized to minimize the error.


Thus let

w(n+1) = w(n) + (1/2) μ(n) [−∇J(n)]
       = w(n) + μ(n) [p − R w(n)]

Choose μ(n) such that the updated w(n+1) produces the minimum MSE,

J(n+1) = E|e(n+1)|^2

where

e(n+1) = d(n+1) − w^H(n+1) x(n+1)

Thus we choose μ(n) such that it minimizes J(n+1). The optimal step size, μ_0(n), will be a function of R and ∇(n). As before, we use instantaneous estimates of these values.


To determine μ_0(n), expand J(n+1):

J(n+1) = E[e(n+1) e*(n+1)]
       = E[(d(n+1) − w^H(n+1) x(n+1)) (d(n+1) − w^H(n+1) x(n+1))*]
       = σ_d^2 − p^H w(n+1) − w^H(n+1) p + w^H(n+1) R w(n+1)

Now use the fact that w(n+1) = w(n) − (1/2) μ(n) ∇(n):

J(n+1) = σ_d^2 − p^H [w(n) − (1/2) μ(n) ∇(n)] − [w(n) − (1/2) μ(n) ∇(n)]^H p
         + [w(n) − (1/2) μ(n) ∇(n)]^H R [w(n) − (1/2) μ(n) ∇(n)]

where the last (quadratic) term expands to

w^H(n) R w(n) − (1/2) μ(n) ∇^H(n) R w(n) − (1/2) μ(n) w^H(n) R ∇(n) + (1/4) μ^2(n) ∇^H(n) R ∇(n)


Collecting terms,

J(n+1) = σ_d^2 − p^H w(n) − w^H(n) p + w^H(n) R w(n)
         + (1/2) μ(n) [p^H ∇(n) + ∇^H(n) p − ∇^H(n) R w(n) − w^H(n) R ∇(n)]
         + (1/4) μ^2(n) ∇^H(n) R ∇(n)

Differentiating with respect to μ(n),

∂J(n+1)/∂μ(n) = (1/2) p^H ∇(n) + (1/2) ∇^H(n) p − (1/2) ∇^H(n) R w(n) − (1/2) w^H(n) R ∇(n)
                + (1/2) μ(n) ∇^H(n) R ∇(n)

Setting this equal to 0,

μ_0(n) ∇^H(n) R ∇(n) = ∇^H(n) [R w(n) − p] + [w^H(n) R − p^H] ∇(n)

so that

μ_0(n) = { ∇^H(n) [R w(n) − p] + [R w(n) − p]^H ∇(n) } / [∇^H(n) R ∇(n)]


Since ∇(n) = 2 [R w(n) − p], we have R w(n) − p = (1/2) ∇(n), and

μ_0(n) = [(1/2) ∇^H(n) ∇(n) + (1/2) ∇^H(n) ∇(n)] / [∇^H(n) R ∇(n)]
       = ∇^H(n) ∇(n) / [∇^H(n) R ∇(n)]

Using instantaneous estimates,

R̂(n) = x(n) x^H(n)
∇̂(n) = 2 [x(n) x^H(n) w(n) − x(n) d*(n)]
      = −2 x(n) [d(n) − d̂(n)]*
      = −2 x(n) e*(n)

Thus

μ_0(n) = [4 |e(n)|^2 x^H(n) x(n)] / [4 |e(n)|^2 (x^H(n) x(n))^2]
       = 1 / [x^H(n) x(n)]
       = 1 / ||x(n)||^2


Thus the NLMS update is

w(n+1) = w(n) + [μ̃ / ||x(n)||^2] x(n) e*(n)

where μ(n) = μ̃ / ||x(n)||^2. To avoid problems when ||x(n)||^2 ≈ 0, we add an offset:

w(n+1) = w(n) + [μ̃ / (a + ||x(n)||^2)] x(n) e*(n)

where a > 0.

Consider now the convergence of the NLMS algorithm. Substituting e(n) = d(n) − w^H(n) x(n) into

w(n+1) = w(n) + [μ̃ / ||x(n)||^2] x(n) e*(n)

gives

w(n+1) = [I − μ̃ x(n) x^H(n) / ||x(n)||^2] w(n) + μ̃ x(n) d*(n) / ||x(n)||^2
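A real-valued NLMS sketch including the offset a (the system, seed, and sizes are illustrative assumptions); the normalized step μ̃ = 0.5 satisfies the 0 < μ̃ < 2 condition derived below:

```python
import random

random.seed(3)

h = [0.5, 0.25, -0.1]      # unknown system (illustrative)
M, mu_t, a = 3, 0.5, 1e-6  # mu_t is the normalized step; a guards small ||x||^2

w = [0.0] * M
xbuf = [0.0] * M
for n in range(4000):
    xbuf = [random.gauss(0, 1)] + xbuf[:-1]
    d = sum(h[i] * xbuf[i] for i in range(M))
    e = d - sum(w[i] * xbuf[i] for i in range(M))
    norm2 = sum(xi * xi for xi in xbuf)               # ||x(n)||^2
    step = mu_t / (a + norm2)                         # time-varying step size
    w = [w[i] + step * xbuf[i] * e for i in range(M)]

print(w)   # approaches h
```

Because the step is scaled by ||x(n)||^2, large input bursts no longer amplify the gradient noise in the update.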


Compare NLMS and LMS:

NLMS:

w(n+1) = [I − μ̃ x(n) x^H(n) / ||x(n)||^2] w(n) + μ̃ x(n) d*(n) / ||x(n)||^2

LMS:

w(n+1) = [I − μ x(n) x^H(n)] w(n) + μ x(n) d*(n)

Comparing, we see the following corresponding terms:

LMS              NLMS
μ                μ̃
x(n) x^H(n)      x(n) x^H(n) / ||x(n)||^2
x(n) d*(n)       x(n) d*(n) / ||x(n)||^2


Since in the LMS case

0 < μ < 2 / trace{E[x(n) x^H(n)]} = 2 / trace[R]

guarantees stability, by analogy the NLMS condition is

0 < μ̃ < 2 / trace{E[x(n) x^H(n) / ||x(n)||^2]}

Make the following approximation:

E[x(n) x^H(n) / ||x(n)||^2] ≈ E[x(n) x^H(n)] / E[||x(n)||^2]

Then

trace{E[x(n) x^H(n) / ||x(n)||^2]} ≈ trace{E[x(n) x^H(n)]} / E[||x(n)||^2]
                                   = E{trace[x(n) x^H(n)]} / E[||x(n)||^2]
                                   = E[x^H(n) x(n)] / E[||x(n)||^2]


                                   = E[||x(n)||^2] / E[||x(n)||^2] = 1

Thus the NLMS update

w(n+1) = w(n) + [μ̃ / ||x(n)||^2] x(n) e*(n)

will converge if 0 < μ̃ < 2.

• The NLMS has a simpler convergence criterion than the LMS.
• The NLMS generally converges faster than the LMS algorithm.