EE290C - Spring 2004
Advanced Topics in Circuit Design
High-Speed Electrical Interfaces

Lecture #5
Adaptive Equalization in High-Speed Links
Vladimir Stojanovic
Stanford University and Rambus Inc.
2/3/2004

System overview
Transmit and receive equalization
Need to set the equalizer coefficients adaptively
Algorithms limited by hardware issues
  Very limited precision in the Rx
  Tx output swing constraint
The key is minimum hardware cost
[Figure: link block diagram. Labels: TxData, linear transmit equalizer, causal taps, anticausal taps, channel, decision-feedback equalizer, deadband, feedback taps, tap select logic, sampled data.]

Agenda
System overview/requirements
Standard (unconstrained) adaptive equalization
Hardware constraint driven adaptation
Alternative algorithms/cost functions

Adaptive filtering
Many algorithms to choose from
  Steepest-descent
  LMS, RLS
  Interior point method based (most recent, fastest, handle non-linear cost functions)
Typically ranked by convergence speed
Our channel has VERY slow changes so o.k. to use simple, slow algorithms
The key is hardware simplicity
Least Mean Squares (LMS) is one of the simplest algorithms (B. Widrow)

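Adaptive linear Rx EQ
Goal is to minimize the mean-square error $E[e^2(n)]$
Follow the negative gradient of the mean-square error (linear in w)
Actually, approximate the mean with the instantaneous value
[Figure: adaptive linear Rx EQ. x(n) passes through channel P to give u(n), then through equalizer w to give $\hat{x}(n)$; the error e(n) is formed against the delayed input x(n-∆), and u(n), e(n) drive the tap update.]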
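$$w_{n+1} = w_n - \frac{\mu}{2}\,\nabla_w E(e_n^2)$$
$$\hat{x}(n) = w_n^T u(n), \quad u(n) = P\,x(n), \quad e_n = x(n-\Delta) - \hat{x}(n)$$
$$\nabla_w e_n^2 = \frac{\partial e_n^2}{\partial w} = -2\,e_n u_n$$
$$w_{n+1} = w_n - \frac{\mu}{2}\,\nabla_w e_n^2$$
$$w_{n+1} = w_n + \mu\,e_n u_n$$
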
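The LMS update derived above can be tried on a toy link. The sketch below is only an illustration under stated assumptions: the channel taps `h = [1.0, 0.5]`, the 8-tap equalizer length, the step size `mu`, and a target delay ∆ = 0 are all hypothetical choices, not values from the lecture.

```python
import random

def lms_equalize(h, num_taps=8, mu=0.02, num_symbols=5000, seed=0):
    """Adapt a linear receive equalizer with LMS; return (taps, errors)."""
    rng = random.Random(seed)
    w = [0.0] * num_taps          # equalizer taps w_n
    x_hist = [0.0] * len(h)       # recent transmitted symbols, newest first
    u_hist = [0.0] * num_taps     # recent channel outputs u(n), newest first
    errors = []
    for _ in range(num_symbols):
        x = rng.choice((-1.0, 1.0))                        # 2PAM symbol x(n)
        x_hist = [x] + x_hist[:-1]
        u = sum(hk * xk for hk, xk in zip(h, x_hist))      # channel output u(n)
        u_hist = [u] + u_hist[:-1]
        x_hat = sum(wk * uk for wk, uk in zip(w, u_hist))  # equalizer output
        e = x - x_hat                                      # error, delay of 0 assumed
        w = [wk + mu * e * uk for wk, uk in zip(w, u_hist)]  # w += mu * e * u
        errors.append(e)
    return w, errors

taps, errors = lms_equalize(h=[1.0, 0.5])
```

For this assumed channel the taps should approach the truncated inverse of 1 + 0.5 z^-1, that is roughly 1, -0.5, 0.25, and so on.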
Equalizer adapts until received signal u(n) and error e(n) are orthogonal, and any further update is useless.
$$w_{n+1} = w_n + \mu_w e_n u_n$$
[Figure: adaptive linear Rx EQ. x(n) → channel → u(n) → equalizer → $\hat{x}(n)$, with e(n) formed against x(n-∆).]
Problem
  How to apply the algorithm to our architecture? (transmit equalizer)

Adaptive Tx equalizer
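How do we generate the error signal e(n)?
Where do we get u(n)?
[Figure: adaptive Tx equalizer. x(n) → equalizer → y(n) → channel → $\hat{x}(n)$; e(n) is formed against x(n-∆), and u(n), e(n) are shown on a back channel to the transmitter.]
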
Agenda
System overview/requirements
Standard (unconstrained) adaptive equalization
Hardware constraint driven adaptation
  Obtaining the error signal
Alternative algorithms/cost functions

Error signal
Hard to generate high-resolution error signal
Use the sign-sign variant of the algorithm
$$e_n = dLev - x_n$$
$$w_{n+1} = w_n + step_w\,\mathrm{sign}(e_n)\,\mathrm{sign}(u_n)$$
[Figure: initial eye showing the reference level (dLev), the slicer threshold and the initial peak-to-peak error; comparator outputs sign(x_n) and sign(e_n).]

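A single sign-sign step needs only two 1-bit quantities and a fixed step. The sketch below is a minimal illustration; the helper names `sign` and `sign_sign_update` and all numeric values are hypothetical.

```python
def sign(v):
    """Return +1 for v >= 0 and -1 otherwise (a 1-bit comparator output)."""
    return 1 if v >= 0 else -1

def sign_sign_update(w, e, u, step):
    """One sign-sign LMS step: w_k <- w_k + step * sign(e) * sign(u_k)."""
    return [wk + step * sign(e) * sign(uk) for wk, uk in zip(w, u)]

d_lev = 0.25                 # reference level dLev (assumed value)
x_n = 0.40                   # received sample for a positive symbol
e_n = d_lev - x_n            # e_n = dLev - x_n, negative here
w_next = sign_sign_update([0.0, 0.0], e_n, [0.5, -0.2], step=0.01)
```

Each tap moves by exactly one step per update, so the step size sets both the convergence speed and the residual dithering.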
Fully parallel error generation (2PAM)
Input data is 2PAM
Thresholds of samplers 0 and 2 (+/- dLevel) sit at the reference levels of the received 2PAM signal
Samplers 0 and 2 evaluate the signal error with respect to the reference level (dLevel) setting
Sampler 1 determines the sign of the received data (which is also the received data itself)
2 extra samplers (0, 2)
Error generation runs on “live” data
[Figure: samplers 0, 1 and 2 with thresholds at ±dlevel and 0; outputs sign(e_L), sign(e_H) and sign($\hat{x}$), combined into sign(error) and sign(data).]

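The three-sampler arrangement can be modeled as three comparators plus a selection on the data bit. This is a sketch under assumptions: the function name `samplers_2pam`, the assignment of sampler 0 to -dLevel and sampler 2 to +dLevel, and the error convention e = reference - x̂ (matching e_n = dLev - x_n) are choices made here for illustration.

```python
def samplers_2pam(x_hat, d_level):
    """Model the three comparators; return (data_sign, error_sign)."""
    s0 = x_hat >= -d_level    # sampler 0, threshold assumed at -dLevel
    s1 = x_hat >= 0.0         # sampler 1, threshold at 0 (the data slicer)
    s2 = x_hat >= d_level     # sampler 2, threshold assumed at +dLevel
    data_sign = 1 if s1 else -1
    # Use the error sampler whose reference matches the received data.
    above_ref = s2 if s1 else s0
    # e = reference - x_hat, so a sample above its reference has sign(e) = -1.
    error_sign = -1 if above_ref else 1
    return data_sign, error_sign

results = [samplers_2pam(x, 0.25) for x in (0.4, 0.1, -0.4, -0.1)]
```

Only one of the two error samplers is meaningful for any given symbol, which is the observation that motivates serializing the error measurement.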
Fully parallel error generation (4PAM)
4PAM data, need 4 extra samplers
In both cases, huge sampler overhead
[Figure: extra 4 samplers (4PAM example). Thresholds at -2dlevel/3, 0 and +2dlevel/3 give the thermometer-coded symbol value; thresholds at -dlevel, -dlevel/3, +dlevel/3 and +dlevel give the error signs.]

Serialize to minimize the overhead
[Figure: eye diagram (amplitude -0.5 to 0.5, horizontal axis 0 to 1.5 x 10^-10) with the ERR sampler at dlevel_00 and the MSB sampler at 0; outputs Sign($\hat{x}$) and Sign(e_n).]
Just use one reference sampler
Filter the updates based on the received data
Data filtering effectively creates an indicator function added to the LMS adaptive algorithm:
$$I_{LMS} = (\hat{x}_n \ge 0)$$
$$w_{n+1} = w_n + I_{LMS} \cdot step_w\,\mathrm{sign}(e_n)\,\mathrm{sign}(\hat{x}_n)$$
$I_{LMS}$ covers the positive half-plane

Agenda
System overview/requirements
Standard (unconstrained) adaptive equalization
Hardware constraint driven adaptation
  Obtaining the error signal
  Missing signal in Tx Equalization u(n)
Alternative algorithms/cost functions

Solution for u(n) – (1) Alternate equalized and unequalized transmission
Channel response is pre-computed by the channel itself
Phase 1
  Send the data sequence through unequalized channel to get u(n)
Phase 2
  Repeat the data sequence through equalized channel to get e(n), and update the equalizer.
Simple update:
$$w_{n+1} = w_n + step_w\,\mathrm{sign}(e_n)\,\mathrm{sign}(u_n)$$
Requires synchronization (of data pattern & sample point) when switching between Phase 1 and Phase 2.
Significant overhead for synchronization.
[Figure: Phase 1, x(n) → channel → u(n), sent over the back channel; Phase 2, x(n) → equalizer → y(n) → channel → $\hat{x}(n)$, with u(n), e(n) on the back channel.]

Solution for u(n) – (2) Give up MMSE and do ZF only
Simple update:
$$w_{n+1} = w_n + step_w\,\mathrm{sign}(e_n)\,\mathrm{sign}(\hat{x}_n)$$
Use $\hat{x}_n$ instead of $u_n$ for ZF adaptation from the start
sign($\hat{x}_n$) can have a high BER in the early part of adaptation
Need to use block averaging before the update
Decision directed equalization
[Figure: x(n) → equalizer → y(n) → channel → $\hat{x}(n)$; $\hat{x}_n$ and e(n) are sent over the back channel.]

Agenda
System overview/requirements
Standard (unconstrained) adaptive equalization
Hardware constraint driven adaptation
  Obtaining the error signal
  Missing signal in Tx Equalization u(n)
  Transmit output swing constraint
Alternative algorithms/cost functions

Transmitter output swing constraint
Output swing constraint
  Symbol is received attenuated
Need the second loop to
  Amplify the received symbol to restore the original value
  Adjust the expected symbol level
Need to normalize equalizer taps after each update
[Figure: Tx equalizer loop with the output swing limit on y(n) and a receiver gain g, so that $g \cdot \hat{x}(n)$ is compared with x(n-∆); u(n), e(n) go over the back channel.]
[Figure: attenuation [dB] (0 to -25) vs. frequency [GHz] (0 to 2.5) for the equalized and unequalized channel.]

Getting around limited gain
Hard to get ~30-50GHz Gain-Bandwidth product in CMOS
Use adaptive data levels
  For error generation
  For slicer thresholds
[Figure: Tx equalizer loop with output swing limit; $\hat{x}(n)$ is compared with $dataLevel \cdot x(n-\Delta)$, and u(n), e(n) go over the back channel.]

Data level reference loop
The dataLevel loop is similar to that of the gain update
$$g_{n+1} = g_n + step_{gain}\,\mathrm{sign}(e_n)\,\mathrm{sign}(\hat{x}_n)$$
$$dataLev_{n+1} = dataLev_n - step_{dataLev}\,\mathrm{sign}(e_n)\,\mathrm{sign}(\hat{x}_n)$$
[Figure: reference loop. $\hat{x}(n)$ is compared with $dataLev_n \cdot x(n-\Delta)$; the case shown is $\hat{x}(n) > 0$ with e(n) < 0.]

Reference data level
[Figure: eye diagrams at three stages: initial eye (dLev_init), mid-way equalized (dLev_mid), equalized (dLev_end).]
Tracks the mean value of received signal around chosen constellation point

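To see the reference level track the received amplitude, the dataLev update can be run on a made-up sample stream. This is a sketch only: the function name `track_data_level`, the step size, and the test samples (magnitudes alternating between 0.38 and 0.42) are assumptions, and the error is taken as e_n = dataLev · sign(x̂_n) - x̂_n.

```python
def track_data_level(samples, step=0.01, data_lev=0.0):
    """Run the dataLev loop over received samples; return the level history."""
    history = []
    for x_hat in samples:
        d = 1 if x_hat >= 0 else -1         # sign(x_hat_n)
        e = data_lev * d - x_hat            # error against the current reference
        s = 1 if e >= 0 else -1             # sign(e_n)
        data_lev = data_lev - step * s * d  # dataLev_{n+1}
        history.append(data_lev)
    return history

# Hypothetical received 2PAM samples with magnitude 0.38 or 0.42
samples = [0.38, -0.42, -0.38, 0.42] * 50
levels = track_data_level(samples)
```

The level climbs one step per sample until it reaches the received amplitude, then dithers by about one step around it.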
Dual-loop algorithm
Reference loop
$$dataLev_{n+1} = dataLev_n - step_{dataLev}\,\mathrm{sign}(e_n)\,\mathrm{sign}(\hat{x}_n)$$
Equalizer loop
$$w_{n+1} = w_n + step_w\,\mathrm{sign}(e_n)\,\mathrm{sign}(\hat{x}_n)$$
One more thing left to consider
  Make the algorithm obey the output Tx constraint

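One iteration of the two loops, sharing the same error sign, might look like the sketch below. The name `dual_loop_step` and the `data_signs` argument (the decided symbol signs aligned with each tap, standing in for the vector sign(x̂_n)) are hypothetical, and the error is assumed to be e_n = dataLev · sign(x̂_n) - x̂_n.

```python
def dual_loop_step(w, data_lev, x_hat, data_signs, step_w, step_lev):
    """One update of both loops; returns (new_taps, new_data_level)."""
    d = 1 if x_hat >= 0 else -1          # sign of the current decision
    e = data_lev * d - x_hat             # error against the adaptive reference
    s = 1 if e >= 0 else -1              # sign(e_n), shared by both loops
    new_lev = data_lev - step_lev * s * d                          # reference loop
    new_w = [wk + step_w * s * dk for wk, dk in zip(w, data_signs)]  # equalizer loop
    return new_w, new_lev

# Received sample 0.4 is above the reference 0.25, so the error is negative.
new_w, new_lev = dual_loop_step([0.5, 0.0], 0.25, 0.4, [1, -1], 0.01, 0.01)
```

In this example the reference level moves up toward the sample while the main tap moves down, so the two loops pull toward each other.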
Normalizing Tx EQ
Peak transmitter output constraint translates to
$$\|w_n\|_1 = \sum |w_n| \le W_{max}$$
Straightforward implementation
$$w_{n+1} = (w_n + update_n) \cdot \frac{W_{max}}{\|w_n + update_n\|_1}$$
$$update_n = step_w\,\mathrm{sign}(e_n)\,\mathrm{sign}(\hat{x}_n)$$
Perturbs the convergence a little, but not the optimal point
Optimal point for the Tx equalizer is at the maximum output swing (on the $\|w\|_1$ norm surface)

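The straightforward normalization is a sign-sign update followed by a rescale onto the l1 surface. The sketch below assumes hypothetical names (`normalized_update`, `data_signs`) and made-up tap codes with W_max = 128.

```python
def normalized_update(w, err_sign, data_signs, step, w_max):
    """Sign-sign update followed by scaling so that ||w||_1 equals w_max."""
    s = [wk + step * err_sign * dk for wk, dk in zip(w, data_signs)]
    norm = sum(abs(v) for v in s)        # ||w_n + update_n||_1
    if norm == 0:
        return s                         # nothing to scale
    return [v * w_max / norm for v in s]

# Taps start on the constraint surface: 10 + 100 + 18 = 128
w_next = normalized_update([-10, 100, -18], -1, [1, 1, -1], step=1, w_max=128)
```

The division by the norm is what makes this form expensive in hardware.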
Taylor Series Approximation Method
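Hardware efficient scaling approximation
let
$$W_{residual} = \|w(n) + update(n)\|_1 - W_{max}$$
then
$$w(n+1) = (w(n) + update(n)) \cdot \frac{W_{max}}{\|w(n) + update(n)\|_1}$$
$$= (w(n) + update(n)) \cdot \frac{W_{max}}{W_{max} + W_{residual}}$$
$$= (w(n) + update(n)) \cdot \frac{1}{1 + W_{residual}/W_{max}}$$
if $W_{residual} \ll W_{max}$ (Taylor series approximation)
$$w(n+1) \approx (w(n) + update(n)) \cdot \left(1 - \frac{W_{residual}}{W_{max}}\right)$$
$$= w(n) + update(n) - (w(n) + update(n)) \cdot \frac{W_{residual}}{W_{max}}$$
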
Hardware implementation
The derived formula can be implemented very efficiently in hardware:
$$w(n+1) \approx w(n) + update(n) - (w(n) + update(n)) \cdot \frac{W_{residual}}{W_{max}}$$
$w(n) + update(n)$ uses binary addition
$1/W_{max}$ is a right shift by $\log_2(W_{max})$ bits, as long as $W_{max}$ is a power of 2 (e.g. 128)
$W_{residual}/W_{max}$ is a right shift by $\log_2(W_{max}/|W_{residual}|)$ bits, if $|W_{residual}|$ is a power of 2
  $W_{residual}$ can be from -5 to +5
  Round $W_{residual}$ to 0, ±1, ±2 or ±4. When $W_{residual}$ is ±3, alternate the rounding between ±2 and ±4.

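A quick numeric check shows how close the first-order approximation stays to the exact scaling when the residual is small. This is a floating-point sketch with hypothetical function names and tap values; it does not model the shift-based rounding of the residual.

```python
def normalize_exact(w, update, w_max=128):
    """Exact scaling: (w + update) * W_max / ||w + update||_1."""
    s = [wk + uk for wk, uk in zip(w, update)]
    norm = sum(abs(v) for v in s)
    return [v * w_max / norm for v in s]

def normalize_taylor(w, update, w_max=128):
    """First-order form: (w + update) - (w + update) * W_residual / W_max."""
    s = [wk + uk for wk, uk in zip(w, update)]
    residual = sum(abs(v) for v in s) - w_max   # W_residual
    return [v - v * residual / w_max for v in s]

# ||w + update||_1 = 101 + 29 = 130, so W_residual = 2
exact = normalize_exact([100, -28], [1, -1])
approx = normalize_taylor([100, -28], [1, -1])
```

With W_max = 128 and a residual that is a power of two, the correction term reduces to a right shift of the updated tap, which is the point of the hardware form.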
Alternative hardware efficient scaling
Peak Power Main Tap Adjustment Method
All taps, except the main will be updated normally:
$$w_{n+1}(i) = w_n(i) + update_n(i), \quad i = 1, 3, 4, 5$$
$w_n(i)$ is the ith tap at time n
Main tap (or 2nd tap) adjustment will only depend on peak output swing.
$$\text{if } \|w_n\|_1 \le W_{lower}: \quad w_{n+1}(2) = w_n(2) + \Delta$$
$$\text{else if } \|w_n\|_1 \ge W_{upper}: \quad w_{n+1}(2) = w_n(2) - \Delta$$

Alternative scaling contd.
Choose $W_{upper}$ and $W_{lower}$ to minimize dithering of the main tap
  ($W_{upper} - W_{lower}$) should exceed the expected dithering of $\|w_n\|_1$
  (e.g. $W_{upper}$ is set to 128 and $W_{lower}$ to 120, for $W_{max} = 128$)
∆ sets the adjustment step on the main tap.
  ∆ can be a function of the updates (i.e. $\Delta_n = \|w_n\|_1 - W_{max}$).
No multiplication required.
Transmitter might not utilize peak swing
  Maximum gap is $W_{upper} - W_{lower}$
In cases where ∆ is hardcoded, $\|w_n\|_1$ can go above $W_{max}$ during equalizer adaptation

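The main-tap method replaces scaling with a compare and an add. The sketch below uses a hypothetical function name and made-up tap codes, a zero-based `main` index (index 1 is the slides' 2nd tap), a hardcoded ∆, and it assumes a positive main tap so that adding ∆ raises the l1 norm.

```python
def peak_power_update(w, update, main=1, w_lower=120, w_upper=128, delta=1):
    """Side taps take their normal update; the main tap follows only the l1 norm."""
    norm = sum(abs(v) for v in w)                 # ||w_n||_1 before the update
    new_w = [wk + uk for wk, uk in zip(w, update)]
    new_w[main] = w[main]                         # main tap ignores its own update
    if norm <= w_lower:
        new_w[main] = w[main] + delta             # swing too small: raise main tap
    elif norm >= w_upper:
        new_w[main] = w[main] - delta             # swing too large: lower main tap
    return new_w

too_big = peak_power_update([-10, 100, -15, -4, 1], [1, 1, -1, 0, 0])   # norm 130
too_small = peak_power_update([-5, 100, -10, 0, 0], [0, 0, -1, 0, 0])   # norm 115
in_window = peak_power_update([-10, 100, -14, 0, 0], [0, 0, 0, 0, 0])   # norm 124
```

Inside the window the main tap holds still, which is how the gap between W_lower and W_upper limits dithering.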
Agenda
System overview/requirements
Standard (unconstrained) adaptive equalization
Hardware constraint driven adaptation
  Obtaining the error signal
  Missing signal in Tx Equalization u(n)
  Transmit output swing constraint
  Convergence results
Alternative algorithms/cost functions

Adaptive ZF EQ – Results
Convergence example, 5-tap Tx Eq
Left plot shows the convergence of the Tx EQ taps after about 200 updates. Right plot shows the convergence of the data level.

Dual loop convergence – 3 tap example
Hard to estimate analytically
Sims and experimental results show
  Both loops are stable within a wide range (0.1 – 100x) of relative speeds
[Figure: dLev learning curve, code [lsb] (0 to 100) vs. # updates (0 to 100), for (a) dLev speed 1x, eq speed 1x; (b) dLev speed 10x, eq speed 1x; (c) dLev speed 1x, eq speed 10x.]
[Figure: equalizer tap learning curves, code [lsb] (-50 to 100) vs. # updates (0 to 100), panels (a), (b), (c).]

Tx Eq. tap + data level adaptation
[Figure: two columns, "Dispersion only" and "Dispersion + reflections". Received signal plots (raw, equalized-learning; reference, equalized-last) over 0 to 1000, and Eq tap learning curves over 0 to 1200.]
Reflections are treated as proportional noise – do not affect the eq. tap nor data level adaptation

Agenda
System overview/requirements
Standard (unconstrained) adaptive equalization
Hardware constraint driven adaptation
Alternative algorithms/cost functions
  Min. BER driven adaptation

Min-BER driven 2PAM Adaptation
Optimize equalizer directly for min BER
Adaptive version – AMBER – (Yeh & Barry)
Main idea – update only when in error
$$w_{n+1} = w_n + step_w\,I_{minBER}(n)\,\mathrm{sign}(x_n)$$
[Figure: received constellation around -1, 0 and +1, with the indicator $I_{minBER}(n)$ marked.]
Very slow for low BER targets
Need to narrow the indicator function

Implementing Min-BER 2PAM Adaptation
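[Figure: eye diagram (amplitude -0.3 to 0.3) with the adaptive sampler at the trap level and the MSB sampler at 0; outputs Sign($\hat{x}$) and Sign($\hat{x}$ - trap). The trap zone lies between 0 and +trap.]
For faster updates, add a trap zone
Similar to data-filtered LMS, the MIN-BER driven 2PAM adaptation uses an indicator function ($I_{MIN\text{-}BER}$).
(With the adaptive sampler sitting at trap) This indicator function can be created using a data filter of: (MSB = 1) && (ERR = 0)
$$I_{MIN\text{-}BER} = (0 \le \hat{x}_n \le trap)$$
$$w_{n+1} = w_n + I_{MIN\text{-}BER} \cdot step_w\,\mathrm{sign}(\hat{x}_n)$$
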
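The trap-zone rule can be sketched as a gated update: taps move only when a positive symbol lands between 0 and the trap level. The function name `min_ber_update` and the `data_signs` argument (decided symbol signs aligned with each tap, standing in for the vector sign(x̂_n)) are hypothetical, as are the numbers.

```python
def min_ber_update(w, x_hat, data_signs, trap, step):
    """Update taps only when 0 <= x_hat <= trap, i.e. (MSB = 1) and (ERR = 0)."""
    in_trap = 0.0 <= x_hat <= trap       # indicator I_MIN-BER
    if not in_trap:
        return list(w)                   # sample is outside the trap zone
    return [wk + step * dk for wk, dk in zip(w, data_signs)]

inside = min_ber_update([0.5, 0.0], 0.05, [1, -1], trap=0.1, step=0.01)
above = min_ber_update([0.5, 0.0], 0.30, [1, -1], trap=0.1, step=0.01)
negative = min_ber_update([0.5, 0.0], -0.05, [1, -1], trap=0.1, step=0.01)
```

A wider trap catches more samples and so produces more updates, which is the speed-up the trap zone is meant to provide.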
Min-BER vs LMS (MMSE)
[Figure: Log10 BER vs. margin [V] (0 to 0.1) for trap 0.1, trap 0.05, mmseEq high noise and mmseEq low noise. One panel: sigmaA = 50 mV, sigmaR = 5 mV; the other: sigmaA = 50 mV, sigmaR = 50 mV.]
sigmaA – noise used during adaptation
sigmaR – actual noise of the system at which the BER curves are used
Slow convergence for small BER targets, have to use trap zone or add extra noise
Simulations with two trap settings (100mV and 50mV), signal levels around 200mV

Residual ISI probability distribution
[Figure: pISI and Log10 pISI vs. voltage [V] (-0.1 to 0.1) for trap 0.1, trap 0.05 and mmseEq.]
minBER eq algorithm tries to shape the ISI distributions so as to minimize the BER

Agenda
System overview/requirements
Standard (unconstrained) adaptive equalization
Hardware constraint driven adaptation
Alternative algorithms/cost functions
  Min. BER driven adaptation
  Balancing data and edge errors

Voltage only equalization – (ZFE)
Clean voltage margins
Bad timing margins
ZFE causes multi-modal edge ISI jitter
Multiple edge histogram peaks => bad for CDR
With jitter on Tx and Rx clocks, voltage margin is not all that matters
[Figure: eye diagram, amplitude -0.5 to 0.5, horizontal axis 0 to 1.6 x 10^-10.]

ZFE pre-emphasis causes multi-modal edges
While sample points are “clean” of ISI, the ISI shifts almost to quadrature (right at the edges)
This impacts the timing of the crossing between the next symbol and the one after next symbol
[Figure: equalized (g) and equalized with refl. canc. (r) differential pulse response, with the ISI peaks that hit the symbol edges marked.]

Control both data and edge samples
ZFE “looks” only at data samples and minimizes the error
Need equalization algorithm that “looks” at both the data and edge samples and minimizes the error
Form a total cost function as:
  MSEtot = alpha*MSEdata + (1-alpha)*MSEedge
For equalizer tap calculation need pulse response samples at both data and edge samples

Error function formulation
Easy for data samples:
  ed(n) = data(n-delay) - data_received(n)
For edges, first formulate the target:
  edge_target(n) = 0.5*(data(n-delay) + data(n-delay+1))
Then form the error:
  ee(n) = edge_target(n) - data_received(n-1/2)
[Figure: waveform with the data error ed and the edge error ee marked.]

Cost function
MSEtot = alpha*mean(ed^2) + (1-alpha)*mean(ee^2)
Need to balance weight alpha to trade off voltage and timing margins
Ultimate answer is to tie alpha to the BER through Tx and Rx jitter numbers and other voltage noise

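The combined cost can be evaluated directly from aligned sample arrays. This sketch makes several assumptions: the function name `total_mse` is hypothetical, the link delay has already been removed so `data[k]` lines up with `data_rx[k]`, and `edge_rx[k]` is the sample taken halfway between symbols k and k+1.

```python
def total_mse(data, data_rx, edge_rx, alpha):
    """Return alpha * mean(ed^2) + (1 - alpha) * mean(ee^2)."""
    ed = [d - r for d, r in zip(data, data_rx)]                 # data errors
    targets = [0.5 * (a + b) for a, b in zip(data, data[1:])]   # edge targets
    ee = [t - r for t, r in zip(targets, edge_rx)]              # edge errors
    mse_data = sum(e * e for e in ed) / len(ed)
    mse_edge = sum(e * e for e in ee) / len(ee)
    return alpha * mse_data + (1 - alpha) * mse_edge

data = [1, -1, -1, 1]                  # transmitted 2PAM symbols
data_rx = [0.9, -1.1, -0.8, 1.0]       # samples at the data instants
edge_rx = [0.1, -0.9, -0.1]            # samples at the edge instants
cost = total_mse(data, data_rx, edge_rx, alpha=0.5)
```

The edge target is 0 across a transition and equal to the symbol value when the data does not change, so non-transition edges behave like extra data samples.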
Adaptive EQ data and edge
Simple update for MMSE (two phase algorithm)
  ud, ue from phase 1
  ed, ee from phase 2
$$w_{n+1} = w_n + step_{wd}\,\mathrm{sign}(e_{dn})\,\mathrm{sign}(u_{dn}) + step_{we}\,\mathrm{sign}(e_{en})\,\mathrm{sign}(u_{en})$$
$$dataLev_{n+1} = dataLev_n - step_d\,\mathrm{sign}(e_{dn})\,\mathrm{sign}(u_{dn})$$
For ZF (one phase algorithm)
  Instead of $u_{dn}$, use received data $x_n$
  Instead of $u_{en}$, use a filter for edge transitions, i.e. when $x_n + x_{n-1} = 0$

o.k. but why does this work?
[Figure: two panels titled "VS5 - H(f) [dB]: channel, goal, eq, eq+channel", |H| [dB] (0 to -15) vs. frequency (0 to 3.5 GHz).]
The amount of pre-emphasis at sample times that “cleans” the data samples is traded for the error in edge samples
Equalizer is “told” not to attenuate low frequencies as much

Oversampled frequency response
[Figure: two panels, "Data ZFE" and "Data-Edge ZFE": raw (b), equalized (g) and equalized with refl. canc. (r) frequency response, magnitude 0.1 to 0.9, horizontal axis 0 to 3 x 10^9.]

Equalized pulse responses
[Figure: two panels, "Data ZFE" and "Data-Edge ZFE": equalized (g) and equalized with refl. canc. (r) differential pulse response, horizontal axis 250 to 450.]

Eye diagrams
[Figure: two eye diagrams, amplitude -0.5 to 0.5, one labeled "bimodal edges" and the other "unimodal edges". Annotations in order of appearance: Eye/schmoo W = 131/128 ps, Eye/schmoo H = 161/165 mV, Eye/schmoo H = 157/157 mV, Eye/schmoo W = 140/136 ps.]

Raw pulse response
[Figure: raw pulse response, amplitude 0 to 0.45, horizontal axis 5.2 to 6.6 x 10^-9 sec.]

We’ve covered today
System overview/requirements
Standard (unconstrained) adaptive equalization
Hardware constraint driven adaptation
  Obtaining the error signal
  Missing signal in Tx Equalization u(n)
  Transmit output swing constraint
  Convergence results
Alternative algorithms/cost functions
  Min. BER driven adaptation
  Balancing data and edge errors
After we learn about clock and data recovery and other system issues, we will get back to this in the context of link system performance (BER) analysis
To probe further
B. Widrow et al., “Stationary and nonstationary learning characteristics of the LMS adaptive filter,” Proc. IEEE, vol. 64, no. 8, pp. 1151-1162, 1976.
V. Stojanović, G. Ginis and M. A. Horowitz, “Transmit Pre-emphasis for High-Speed Time-Division-Multiplexed Serial Link Transceiver,” IEEE International Conference on Communications, pp. 1934-1939, May 2002.
J. T. Stonick et al., “An adaptive PAM-4 5-Gb/s backplane transceiver in 0.25-µm CMOS,” IEEE J. Solid-State Circuits, vol. 38, no. 3, pp. 436-443, March 2003.
C-C. Yeh and J. R. Barry, “Adaptive Minimum Bit-Error Rate Equalization for Binary Signaling,” IEEE Transactions on Communications, vol. 48, no. 7, July 2000.
V. Stojanovic, A. Ho, B. Garlepp, F. Chen, J. Wei, E. Alon, C. Werner, J. Zerbe, and M. A. Horowitz, “Adaptive Equalization and Data Recovery in a Dual-Mode (PAM2/4) Serial Link Transceiver,” submitted to IEEE Symposium on VLSI Circuits, June 2004.
If you have any questions send me [email protected]