Steepest Descent


• $\{x(n)\}$ are the WSS input samples.

• $\{d(n)\}$ is the WSS desired output.

• $\{\hat{d}(n)\}$ is the estimate of the desired signal, given by

      \hat{d}(n) = w^H(n) x(n)

  where

      x(n) = [x(n), x(n-1), \ldots, x(n-M+1)]^T

  and

      w(n) = [w_0(n), w_1(n), \ldots, w_{M-1}(n)]^T

  is the filter weight vector at time n.


Then

    e(n) = d(n) - \hat{d}(n) = d(n) - w^H(n) x(n)

Thus the MSE at time n is

    J(n) = E\{|e(n)|^2\} = \sigma_d^2 - w^H(n) p - p^H w(n) + w^H(n) R w(n)

where
    \sigma_d^2 – variance of the desired signal,
    p – cross-correlation between x(n) and d(n),
    R – correlation matrix of x(n).

When w(n) is set to the (optimal) Wiener solution, then

    w(n) = w_0 = R^{-1} p

and

    J(n) = J_{min} = \sigma_d^2 - p^H w_0


Hence, in order to iteratively find $w_0$, we use the method of steepest descent. To illustrate this concept, let M = 2; in the 2-D space of w(n), the MSE forms a bowl-shaped function, and a contour plot of the MSE is shown below. Thus, if we are at a specific point in the bowl, we can imagine dropping a marble: it would reach the minimum by following the path of steepest descent.

[Figure: bowl-shaped MSE surface J(w) and its contours over $(w_1, w_2)$, with the negative gradient $-\nabla J = (-\partial J/\partial w_1, \; -\partial J/\partial w_2)$ pointing toward the minimum $w_0$.]


Hence the direction in which we change the filter weight vector is $-\nabla J(n)$, or

    w(n+1) = w(n) + \frac{1}{2} \mu [-\nabla J(n)]

or, since $\nabla J(n) = -2p + 2 R w(n)$,

    w(n+1) = w(n) + \mu [p - R w(n)]

for n = 0, 1, 2, ..., where $\mu$ is called the step size and

    w(0) = 0   (in general)

Stability: Since the SD method uses feedback, the system can go unstable.

• Bounds on the step size guaranteeing stability can be determined with respect to the eigenvalues of R (Widrow, 1970).
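
As a concrete illustration, the recursion is a few lines of NumPy. This is a minimal sketch; R, p, the step size, and the iteration count are hypothetical values chosen only to show convergence:

```python
import numpy as np

# Hypothetical statistics: R Hermitian positive definite, p the cross-correlation.
R = np.array([[1.0, 0.8],
              [0.8, 1.0]])
p = np.array([0.8, 0.5])

lam_max = np.linalg.eigvalsh(R).max()
mu = 1.0 / lam_max            # any 0 < mu < 2/lam_max is stable
w = np.zeros(2)               # w(0) = 0, as in the notes

for n in range(200):
    w = w + mu * (p - R @ w)  # w(n+1) = w(n) + mu [p - R w(n)]

print(w)                      # approaches the Wiener solution w0 = R^{-1} p
print(np.linalg.solve(R, p))
```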


Define the error vector for the tap weights as

    c(n) = w(n) - w_0

Then, using $p = R w_0$ in the update,

    w(n+1) = w(n) + \mu [p - R w(n)]
           = w(n) + \mu [R w_0 - R w(n)]
           = w(n) - \mu R c(n)

and

    w(n+1) - w_0 = w(n) - w_0 - \mu R c(n)

or

    c(n+1) = c(n) - \mu R c(n) = [I - \mu R] c(n)


Using the unitary similarity transform

    R = Q \Lambda Q^H

we have

    c(n+1) = [I - \mu Q \Lambda Q^H] c(n)

Premultiplying by $Q^H$ gives

    Q^H c(n+1) = [Q^H - \mu Q^H Q \Lambda Q^H] c(n)
               = [I - \mu \Lambda] Q^H c(n)

Define the transformed coefficients as

    v(n) = Q^H c(n) = Q^H (w(n) - w_0)


Then

    v(n+1) = [I - \mu \Lambda] v(n)

with initial condition

    v(0) = Q^H (w(0) - w_0) = -Q^H w_0   if w(0) = 0

The k-th term of v(n+1) (the k-th mode) is given by

    v_k(n+1) = (1 - \mu \lambda_k) v_k(n),   k = 1, 2, \ldots, M

or, unrolling the recursion,

    v_k(n) = (1 - \mu \lambda_k)^n v_k(0)

Thus, for $\lim_{n \to \infty} v_k(n) = 0$ for all k, we must have

    |1 - \mu \lambda_k| < 1


The k-th mode has geometric decay

    v_k(n) = (1 - \mu \lambda_k)^n v_k(0)

We can characterize the rate of decay by finding the time $\tau_k$ it takes to decay to $e^{-1}$ of the initial value. Thus

    v_k(\tau_k) = (1 - \mu \lambda_k)^{\tau_k} v_k(0) = e^{-1} v_k(0)

    \tau_k = \frac{-1}{\ln(1 - \mu \lambda_k)} \approx \frac{1}{\mu \lambda_k}   for \mu \lambda_k \ll 1


Recall that

    J(n) = J_{min} + (w(n) - w_0)^H R (w(n) - w_0)
         = J_{min} + (w(n) - w_0)^H Q \Lambda Q^H (w(n) - w_0)
         = J_{min} + v^H(n) \Lambda v(n)
         = J_{min} + \sum_{k=1}^{M} \lambda_k |v_k(n)|^2
         = J_{min} + \sum_{k=1}^{M} \lambda_k (1 - \mu \lambda_k)^{2n} |v_k(0)|^2

Thus

    \lim_{n \to \infty} J(n) = J_{min}


Example: Consider a two-tap predictor. Consider the effects of the following cases:

• Varying the eigenvalue spread $\chi(R) = \lambda_{max}/\lambda_{min}$ while keeping $\mu$ fixed.

• Varying $\mu$ while keeping the eigenvalue spread $\chi(R)$ fixed.


Example: Consider the system identification problem shown below.

For M = 2, suppose

    R_x = \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix}, \qquad
    p = \begin{bmatrix} 0.8 \\ 0.5 \end{bmatrix}

[Figure: block diagram — x(n) drives both the unknown system, producing d(n), and the adaptive filter w(n), producing $\hat{d}(n)$; the error is $e(n) = d(n) - \hat{d}(n)$.]


From eigenanalysis we have $\lambda_1 = 1.8$, $\lambda_2 = 0.2$, and $\mu < 2/1.8$. Also

    q_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
    q_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}

and

    Q = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}

Also,

    w_0 = R^{-1} p = \begin{bmatrix} 1.11 \\ -0.389 \end{bmatrix}

Thus $v(n) = Q^H [w(n) - w_0]$. Noting that w(0) = 0,

    v(0) = -Q^H w_0
         = -\frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
            \begin{bmatrix} 1.11 \\ -0.389 \end{bmatrix}
         = \begin{bmatrix} -0.51 \\ -1.06 \end{bmatrix}


and

    v_1(n) = (1 - 1.8\mu)^n (-0.51)
    v_2(n) = (1 - 0.2\mu)^n (-1.06)
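
These numbers can be checked numerically. A minimal NumPy sketch follows; note that numerical eigendecompositions fix eigenvector order and sign only up to permutation and ±1, so v(0) may appear reordered or sign-flipped relative to the notes:

```python
import numpy as np

R = np.array([[1.0, 0.8],
              [0.8, 1.0]])
p = np.array([0.8, 0.5])

lam, Q = np.linalg.eigh(R)     # eigenvalues [0.2, 1.8], orthonormal eigenvectors
w0 = np.linalg.solve(R, p)     # Wiener solution, approx [1.11, -0.389]
v0 = Q.T @ (np.zeros(2) - w0)  # v(0) = Q^H (w(0) - w0), approx +/-[0.51, 1.06]
print(lam, w0, v0)
```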


The Least Mean Square (LMS) Algorithm

The error performance surface used by the SD method is not always known a priori. We can instead use estimated values. The estimates are random variables, and thus this leads to a stochastic approach.

We will use the following instantaneous estimates:

    \hat{R}(n) = x(n) x^H(n)
    \hat{p}(n) = x(n) d^*(n)


Recall the SD update

    w(n+1) = w(n) + \frac{1}{2} \mu [-\nabla J(n)]

where the gradient of the error surface at w(n) was shown to be

    \nabla J(n) = -2p + 2 R w(n)

Using the instantaneous estimates,

    \hat{\nabla} J(n) = -2 x(n) d^*(n) + 2 x(n) x^H(n) w(n)
                      = -2 x(n) [d^*(n) - x^H(n) w(n)]
                      = -2 x(n) [d(n) - \hat{d}(n)]^*
                      = -2 x(n) e^*(n)

where $e^*(n)$ is the complex conjugate of the estimation error.


Putting this in the update,

    w(n+1) = w(n) + \mu x(n) e^*(n)

Thus the LMS algorithm belongs to the family of stochastic gradient algorithms. The update is extremely simple; although the instantaneous estimates may have large variance, the LMS algorithm is recursive and effectively averages these estimates. The simplicity and good performance of the LMS algorithm make it the benchmark against which other adaptive algorithms are judged.
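
A minimal sketch of the LMS recursion; the two-tap system, noise level, and step size here are hypothetical stand-ins for whatever generates x(n) and d(n):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, mu = 2, 5000, 0.01
w_true = np.array([1.11, -0.389])   # hypothetical unknown system

x = rng.standard_normal(N)
w = np.zeros(M)
for n in range(M, N):
    xn = x[n:n-M:-1]                # x(n) = [x(n), x(n-1), ..., x(n-M+1)]^T
    d = w_true @ xn + 0.01 * rng.standard_normal()
    e = d - np.conj(w) @ xn         # e(n) = d(n) - w^H(n) x(n)
    w = w + mu * xn * np.conj(e)    # w(n+1) = w(n) + mu x(n) e*(n)
print(w)                            # fluctuates around w_true
```

The data here are real-valued, so the conjugates are no-ops; they are kept to mirror the complex-valued formulas in the notes.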


The LMS algorithm can be analyzed by invoking the independence theory, which states:

1) The vectors x(1), x(2), ..., x(n) are statistically independent.

2) x(n) is independent of d(1), d(2), ..., d(n-1).

3) d(n) is statistically dependent on x(n), but is independent of d(1), d(2), ..., d(n-1).

4) x(n) and d(n) are mutually Gaussian.

The independence theory is justified in some cases, e.g., beamforming, where we receive independent vector observations. In other cases it is not well justified, but it allows the analysis to proceed.


Using the independence theory we can show that w(n) converges to the optimal solution in the mean:

    \lim_{n \to \infty} E\{w(n)\} = w_0

To show this, evaluate the update:

    w(n+1) = w(n) + \mu x(n) e^*(n)
    w(n+1) - w_0 = w(n) - w_0 + \mu x(n) e^*(n)

so that, with $e^*(n) = d^*(n) - x^H(n) w(n)$ and $w(n) = c(n) + w_0$,

    c(n+1) = c(n) + \mu x(n) [d^*(n) - x^H(n) w(n)]
           = c(n) + \mu x(n) d^*(n) - \mu x(n) x^H(n) [c(n) + w_0]
           = [I - \mu x(n) x^H(n)] c(n) + \mu x(n) [d^*(n) - x^H(n) w_0]


Note that since w(n) is based on past inputs and desired responses, w(n) (and hence c(n)) is independent of x(n). Thus

    E\{c(n+1)\} = [I - \mu R] E\{c(n)\} + \mu E\{x(n) [d^*(n) - x^H(n) w_0]\}

The second expectation is zero. (Why? By the principle of orthogonality: $E\{x(n) [d^*(n) - x^H(n) w_0]\} = p - R w_0 = 0$.) Hence

    E\{c(n+1)\} = [I - \mu R] E\{c(n)\}

Using arguments similar to the SD case, we have

    \lim_{n \to \infty} E\{c(n)\} = 0   if   0 < \mu < \frac{2}{\lambda_{max}}


Noting that

    \lambda_{max} \le \mathrm{trace}[R] = N r(0) = N \sigma_x^2

a more conservative bound is

    0 < \mu < \frac{2}{N \sigma_x^2}


An equivalent condition is to show that

    \lim_{n \to \infty} J(n) = \lim_{n \to \infty} E\{|e(n)|^2\} = \text{constant}

Write e(n) as

    e(n) = d(n) - \hat{d}(n) = d(n) - w^H(n) x(n)
         = d(n) - w_0^H x(n) - c^H(n) x(n)
         = e_0(n) - c^H(n) x(n)

Thus

    J(n) = E\{|e(n)|^2\}
         = E\{(e_0(n) - c^H(n) x(n)) (e_0(n) - c^H(n) x(n))^H\}
         = J_{min} + E\{c^H(n) x(n) x^H(n) c(n)\}
         = J_{min} + J_{ex}(n)

where $J_{ex}(n)$ denotes the excess MSE.


Since $J_{ex}(n)$ is a scalar,

    J_{ex}(n) = E\{c^H(n) x(n) x^H(n) c(n)\}
              = E\{\mathrm{trace}[c^H(n) x(n) x^H(n) c(n)]\}
              = E\{\mathrm{trace}[x(n) x^H(n) c(n) c^H(n)]\}
              = \mathrm{trace}[E\{x(n) x^H(n) c(n) c^H(n)\}]

Invoking the independence theorem,

    J_{ex}(n) = \mathrm{trace}[E\{x(n) x^H(n)\} E\{c(n) c^H(n)\}]
              = \mathrm{trace}[R K(n)]

where

    K(n) = E\{c(n) c^H(n)\}


Thus

    J(n) = J_{min} + J_{ex}(n) = J_{min} + \mathrm{trace}[R K(n)]

Recall $R = Q \Lambda Q^H$, or $\Lambda = Q^H R Q$. Let

    S(n) = Q^H K(n) Q

where S(n) need not be diagonal. Then $K(n) = Q S(n) Q^H$ and

    J_{ex}(n) = \mathrm{trace}[R K(n)]
              = \mathrm{trace}[Q \Lambda Q^H Q S(n) Q^H]
              = \mathrm{trace}[Q \Lambda S(n) Q^H]
              = \mathrm{trace}[Q^H Q \Lambda S(n)]
              = \mathrm{trace}[\Lambda S(n)]


Since $\Lambda$ is diagonal,

    J_{ex}(n) = \mathrm{trace}[\Lambda S(n)] = \sum_{i=1}^{M} \lambda_i s_i(n)

where $s_1(n), s_2(n), \ldots, s_M(n)$ are the diagonal elements of S(n).

The recursion expression can be modified to yield a recursion on S(n), which is

    S(n+1) = (I - \mu \Lambda) S(n) (I - \mu \Lambda) + \mu^2 J_{min} \Lambda

which for the diagonal elements is

    s_i(n+1) = (1 - \mu \lambda_i)^2 s_i(n) + \mu^2 \lambda_i J_{min},   i = 1, 2, \ldots, M

Suppose $J_{ex}(n)$ converges; then $s_i(n+1) = s_i(n)$, and from the above

    s_i(n) = \frac{\mu^2 \lambda_i J_{min}}{1 - (1 - \mu \lambda_i)^2}
           = \frac{\mu^2 \lambda_i J_{min}}{2 \mu \lambda_i - \mu^2 \lambda_i^2}
           = \frac{\mu J_{min}}{2 - \mu \lambda_i},   i = 1, 2, \ldots, M


Utilizing

    J_{ex}(n) = \mathrm{trace}[\Lambda S(n)] = \sum_{i=1}^{M} \lambda_i s_i(n)

we see

    \lim_{n \to \infty} J_{ex}(n) = J_{min} \sum_{i=1}^{M} \frac{\mu \lambda_i}{2 - \mu \lambda_i}

The LMS misadjustment is defined as

    \mathcal{M} = \frac{\lim_{n \to \infty} J_{ex}(n)}{J_{min}}
                = \sum_{i=1}^{M} \frac{\mu \lambda_i}{2 - \mu \lambda_i}

A misadjustment of 10% or less is generally considered acceptable.
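
For instance, a quick numerical check of the misadjustment formula, using the eigenvalues $\lambda_1 = 1.8$, $\lambda_2 = 0.2$ from the earlier example:

```python
import numpy as np

lam = np.array([1.8, 0.2])           # eigenvalues of R from the earlier example
for mu in (0.01, 0.05, 0.1):
    M_adj = np.sum(mu * lam / (2 - mu * lam))
    print(mu, M_adj)                 # misadjustment grows with mu; keep below ~0.1
```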


Example: a one-tap predictor of an order-one AR process. Let

    x(n) = -a x(n-1) + v(n)

and use a one-tap predictor. The weight update is

    w(n+1) = w(n) + \mu x(n-1) e(n)
           = w(n) + \mu x(n-1) [x(n) - w(n) x(n-1)]

Note $w_0 = -a$. Consider two cases, with $\mu = 0.05$:

    a        \sigma_x^2
    -0.99    0.93627
     0.99    0.995
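
A minimal sketch of this experiment, assuming the driving-noise variance is set so that the variance of x matches the table, i.e. $\sigma_v^2 = (1 - a^2)\sigma_x^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
a, mu, N = -0.99, 0.05, 2000
sigma_x2 = 0.93627
sigma_v = np.sqrt((1 - a**2) * sigma_x2)   # so that var{x} = sigma_x2

x = np.zeros(N)
w = 0.0
for n in range(1, N):
    x[n] = -a * x[n-1] + sigma_v * rng.standard_normal()
    e = x[n] - w * x[n-1]                  # prediction error
    w = w + mu * x[n-1] * e                # one-tap LMS update
print(w, -a)                               # w(n) fluctuates around w0 = -a
```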


Consider the expected trajectory of w(n). Recall

    w(n+1) = w(n) + \mu x(n-1) e(n)
           = w(n) + \mu x(n-1) [x(n) - w(n) x(n-1)]
           = [1 - \mu x(n-1) x(n-1)] w(n) + \mu x(n-1) x(n)

Since $x(n) = -a x(n-1) + v(n)$,

    w(n+1) = [1 - \mu x^2(n-1)] w(n) + \mu x(n-1) [-a x(n-1) + v(n)]
           = [1 - \mu x^2(n-1)] w(n) - \mu a x^2(n-1) + \mu x(n-1) v(n)

Taking the expectation and invoking the independence theorem,

    E\{w(n+1)\} = (1 - \mu \sigma_x^2) E\{w(n)\} - \mu \sigma_x^2 a


We can also derive a theoretical expression for J(n). Note that the initial value of J(n) is

    J(0) = \sigma_x^2

and the final value is

    J(\infty) = J_{min} + J_{ex} = \sigma_v^2 + \frac{\mu \lambda_1 J_{min}}{2 - \mu \lambda_1}

If $\mu$ is small,

    J(\infty) \approx \sigma_v^2 + \frac{\mu \sigma_x^2 \sigma_v^2}{2}
              = \sigma_v^2 \left(1 + \frac{\mu \sigma_x^2}{2}\right)

Also, the time constant is

    \tau_1 = \frac{-1}{2 \ln(1 - \mu \lambda_1)} = \frac{-1}{2 \ln(1 - \mu \sigma_x^2)}
           \approx \frac{1}{2 \mu \sigma_x^2}

and the learning curve is

    J(n) = \left[\sigma_x^2 - \sigma_v^2 \left(1 + \frac{\mu \sigma_x^2}{2}\right)\right]
           (1 - \mu \sigma_x^2)^{2n}
           + \sigma_v^2 \left(1 + \frac{\mu \sigma_x^2}{2}\right)
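
A short sketch evaluating this theoretical learning curve, plugging in the a = -0.99 case from the example above:

```python
import numpy as np

mu, sx2, a = 0.05, 0.93627, -0.99
sv2 = (1 - a**2) * sx2                  # J_min = sigma_v^2 for the one-tap predictor
J_inf = sv2 * (1 + mu * sx2 / 2)        # final value J(infinity)
n = np.arange(200)
J = (sx2 - J_inf) * (1 - mu * sx2)**(2 * n) + J_inf
print(J[0], J[-1])                      # decays from sigma_x^2 toward J(infinity)
```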


Example: Adaptive equalization

Goal: Pass a known signal through an unknown channel and invert the effects of the channel and noise on the signal.


The signal is a Bernoulli sequence

    x_n = \begin{cases} +1 & \text{with probability } 1/2 \\
                        -1 & \text{with probability } 1/2 \end{cases}

The channel has a raised-cosine response

    h_n = \begin{cases} \frac{1}{2}\left[1 + \cos\left(\frac{2\pi}{W}(n-2)\right)\right] & n = 1, 2, 3 \\
                        0 & \text{otherwise} \end{cases}

Note that W controls the eigenvalue spread $\chi(R)$. Also, the additive noise is $\sim N(0, 0.001)$.

Note that $h_n$ is symmetric about n = 2 and thus introduces a delay of 2. We will use an M = 11 tap filter, which will be symmetric about n = 5 and introduce a delay of 5. Thus an overall delay of $\delta = 5 + 2 = 7$ is added to the system.


[Figure: channel response and filter response.]

Consider three W values. Note that the step size is bounded by the W = 3.5 case:

    \mu \le \frac{2}{N r(0)} = \frac{2}{11 (1.3022)} = 0.14

Choose $\mu = 0.075$ in all cases.
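
A minimal sketch of the equalization experiment. W = 3.1 is one assumed value of the channel parameter; M = 11, delta = 7, mu = 0.075, and the noise variance 0.001 follow the notes:

```python
import numpy as np

rng = np.random.default_rng(2)
W, M, delta, mu, N = 3.1, 11, 7, 0.075, 20000

# Raised-cosine channel taps h_n, nonzero for n = 1, 2, 3 (h_0 = 0)
h = np.zeros(4)
h[1:4] = 0.5 * (1 + np.cos(2 * np.pi / W * (np.arange(1, 4) - 2)))

s = rng.choice([-1.0, 1.0], size=N)               # Bernoulli +/-1 sequence
x = np.convolve(s, h)[:N] + np.sqrt(0.001) * rng.standard_normal(N)

w = np.zeros(M)
for n in range(M, N):
    xn = x[n:n-M:-1]                              # regressor [x(n), ..., x(n-M+1)]
    d = s[n - delta]                              # desired = input delayed by 7
    e = d - w @ xn
    w = w + mu * xn * e                           # LMS equalizer update
print(w)                                          # approximates the delayed inverse
```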


Example: Directionality of the LMS algorithm

• The speed of convergence of the LMS algorithm is faster in certain directions in the weight space.

• If the convergence is in the appropriate direction, the convergence can be accelerated by an increased eigenvalue spread.

Consider the deterministic signal

    x(n) = A_1 \cos(\omega_1 n) + A_2 \cos(\omega_2 n)

with

    R = \frac{1}{2} \begin{bmatrix}
        A_1^2 + A_2^2 & A_1^2 \cos\omega_1 + A_2^2 \cos\omega_2 \\
        A_1^2 \cos\omega_1 + A_2^2 \cos\omega_2 & A_1^2 + A_2^2
        \end{bmatrix}


which gives

    \lambda_1 = \frac{1}{2}\left[A_1^2 (1 + \cos\omega_1) + A_2^2 (1 + \cos\omega_2)\right]
    \lambda_2 = \frac{1}{2}\left[A_1^2 (1 - \cos\omega_1) + A_2^2 (1 - \cos\omega_2)\right]

and

    q_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
    q_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}

Consider two cases:

    (a) x(n) = \cos(1.2 n) + 0.5 \cos(0.1 n)    with \chi(R) = 2.9
    (b) x(n) = \cos(0.6 n) + 0.5 \cos(0.23 n)   with \chi(R) = 12.9

In each case let

    p = \lambda_1 q_1 \;\Rightarrow\; R w_0 = \lambda_1 q_1 \;\Rightarrow\; w_0 = q_1

and

    p = \lambda_2 q_2 \;\Rightarrow\; R w_0 = \lambda_2 q_2 \;\Rightarrow\; w_0 = q_2

Look at 200 iterations of the algorithm: first the minimum eigenfilter, $w_0 = q_2$, then the maximum eigenfilter, $w_0 = q_1$. A sketch of this experiment follows.
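
This is a minimal sketch, assuming d(n) is generated by the eigenfilter itself so that $p = R w_0$ holds by construction; the step size is a hypothetical choice, while the 200 iterations follow the notes:

```python
import numpy as np

def lms_sine_run(w1, w2, q_sign, N=200, mu=0.1):
    """LMS driven by x(n) = cos(w1 n) + 0.5 cos(w2 n), with w0 = q1 or q2."""
    A1, A2 = 1.0, 0.5
    q = np.array([1.0, q_sign]) / np.sqrt(2)   # +1 -> q1 (max), -1 -> q2 (min)
    w = np.zeros(2)
    for n in range(1, N + 1):
        xn = np.array([A1*np.cos(w1*n)     + A2*np.cos(w2*n),
                       A1*np.cos(w1*(n-1)) + A2*np.cos(w2*(n-1))])
        d = q @ xn                             # desired output of the eigenfilter
        w = w + mu * xn * (d - w @ xn)         # LMS update
    return w

print(lms_sine_run(1.2, 0.1, -1.0))            # minimum eigenfilter, chi(R) = 2.9
print(lms_sine_run(0.6, 0.23, -1.0))           # minimum eigenfilter, chi(R) = 12.9
print(lms_sine_run(1.2, 0.1, +1.0))            # maximum eigenfilter, chi(R) = 2.9
```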


Normalized LMS Algorithm

In the standard LMS algorithm the correction is proportional to $\mu x(n) e^*(n)$:

    w(n+1) = w(n) + \mu x(n) e^*(n)

If x(n) is large, the update suffers from gradient noise amplification. The normalized LMS algorithm seeks to avoid this:

• The step size is made time-varying, $\mu(n)$, and optimized to minimize the error.


Thus let

    w(n+1) = w(n) + \frac{1}{2}\mu(n)[-\nabla J(n)] = w(n) + \mu(n)[p - R w(n)]

Choose $\mu(n)$ such that the updated w(n+1) produces the minimum MSE,

    J(n+1) = E\{|e(n+1)|^2\}

where

    e(n+1) = d(n+1) - w^H(n+1) x(n+1)

Thus we choose $\mu(n)$ such that it minimizes J(n+1). The optimal step size, $\mu_0(n)$, will be a function of R and $\nabla(n)$. As before, we use instantaneous estimates of these values.


To determine $\mu_0(n)$, expand J(n+1):

    J(n+1) = E\{e(n+1) e^*(n+1)\}
           = E\{(d(n+1) - w^H(n+1) x(n+1)) (d(n+1) - w^H(n+1) x(n+1))^H\}
           = \sigma_d^2 - w^H(n+1) p - p^H w(n+1) + w^H(n+1) R w(n+1)

Now use the fact that

    w(n+1) = w(n) - \frac{1}{2}\mu(n)\nabla(n)

which gives

    J(n+1) = \sigma_d^2
             - [w(n) - \tfrac{1}{2}\mu(n)\nabla(n)]^H p
             - p^H [w(n) - \tfrac{1}{2}\mu(n)\nabla(n)]
             + [w(n) - \tfrac{1}{2}\mu(n)\nabla(n)]^H R [w(n) - \tfrac{1}{2}\mu(n)\nabla(n)]
           = \sigma_d^2 - w^H(n) p - p^H w(n) + w^H(n) R w(n)
             + \tfrac{1}{2}\mu(n)\nabla^H(n) p + \tfrac{1}{2}\mu(n) p^H \nabla(n)
             - \tfrac{1}{2}\mu(n)\nabla^H(n) R w(n) - \tfrac{1}{2}\mu(n) w^H(n) R \nabla(n)
             + \tfrac{1}{4}\mu^2(n)\nabla^H(n) R \nabla(n)


Differentiating with respect to $\mu(n)$,

    \frac{\partial J(n+1)}{\partial \mu(n)} =
        \tfrac{1}{2}\nabla^H(n) p + \tfrac{1}{2} p^H \nabla(n)
        - \tfrac{1}{2}\nabla^H(n) R w(n) - \tfrac{1}{2} w^H(n) R \nabla(n)
        + \tfrac{1}{2}\mu(n)\nabla^H(n) R \nabla(n)

Setting this equal to 0,

    \mu_0(n)\nabla^H(n) R \nabla(n) = \nabla^H(n)[R w(n) - p] + [w^H(n) R - p^H]\nabla(n)

so

    \mu_0(n) = \frac{\nabla^H(n)[R w(n) - p] + [R w(n) - p]^H \nabla(n)}{\nabla^H(n) R \nabla(n)}


Since $\tfrac{1}{2}\nabla(n) = R w(n) - p$, this becomes

    \mu_0(n) = \frac{\tfrac{1}{2}\nabla^H(n)\nabla(n) + \tfrac{1}{2}\nabla^H(n)\nabla(n)}{\nabla^H(n) R \nabla(n)}
             = \frac{\nabla^H(n)\nabla(n)}{\nabla^H(n) R \nabla(n)}

Using the instantaneous estimates

    \hat{R}(n) = x(n) x^H(n)
    \hat{\nabla}(n) = 2[x(n) x^H(n) w(n) - x(n) d^*(n)]
                    = -2 x(n)[d(n) - w^H(n) x(n)]^*
                    = -2 x(n) e^*(n)

we obtain

    \mu_0(n) = \frac{4 e(n) x^H(n) x(n) e^*(n)}{4 e(n) x^H(n) x(n) x^H(n) x(n) e^*(n)}
             = \frac{|e(n)|^2 \, x^H(n) x(n)}{|e(n)|^2 \, (x^H(n) x(n))^2}
             = \frac{1}{x^H(n) x(n)}
             = \frac{1}{\|x(n)\|^2}


Thus the NLMS update is

    w(n+1) = w(n) + \frac{\tilde{\mu}}{\|x(n)\|^2} x(n) e^*(n)

To avoid problems when $\|x(n)\|^2 \approx 0$, we add an offset:

    w(n+1) = w(n) + \frac{\tilde{\mu}}{a + \|x(n)\|^2} x(n) e^*(n)

where a > 0.

Consider now the convergence of the NLMS algorithm

    w(n+1) = w(n) + \frac{\tilde{\mu}}{\|x(n)\|^2} x(n) e^*(n)

Substituting $e(n) = d(n) - w^H(n) x(n)$,

    w(n+1) = w(n) + \frac{\tilde{\mu}}{\|x(n)\|^2} x(n) [d(n) - w^H(n) x(n)]^*
           = \left[I - \tilde{\mu}\frac{x(n) x^H(n)}{\|x(n)\|^2}\right] w(n)
             + \tilde{\mu}\frac{x(n) d^*(n)}{\|x(n)\|^2}


Compare NLMS and LMS.

NLMS:

    w(n+1) = \left[I - \tilde{\mu}\frac{x(n) x^H(n)}{\|x(n)\|^2}\right] w(n)
             + \tilde{\mu}\frac{x(n) d^*(n)}{\|x(n)\|^2}

LMS:

    w(n+1) = [I - \mu x(n) x^H(n)] w(n) + \mu x(n) d^*(n)

Comparing, we see the following corresponding terms:

    LMS              NLMS
    \mu              \tilde{\mu}
    x(n) x^H(n)      x(n) x^H(n) / \|x(n)\|^2
    x(n) d^*(n)      x(n) d^*(n) / \|x(n)\|^2


Since in the LMS case the bound is

    0 < \mu < \frac{2}{\mathrm{trace}[E\{x(n) x^H(n)\}]} = \frac{2}{\mathrm{trace}[R]}


and since

    \mathrm{trace}\left[E\left\{\frac{x(n) x^H(n)}{\|x(n)\|^2}\right\}\right]
    = E\left\{\frac{x^H(n) x(n)}{\|x(n)\|^2}\right\} = 1

the NLMS update

    w(n+1) = w(n) + \frac{\tilde{\mu}}{\|x(n)\|^2} x(n) e^*(n)

will converge if $0 < \tilde{\mu} < 2$.