Dynamic Element Matching

1 Dynamic Element Matching

1.1 The mismatch issue

A DAC converts a digital number d[k] into an analog voltage v[k] or current i[k]. It usually also holds this voltage or current until the next sample, meaning that it converts the signal to continuous time with a hold function (strictly a zero-order hold).

[Figure 1 diagram: d[k] → D=>A block → v[k], i[k] → hold block → v̂(t), î(t)]

Figure 1: DAC basic principle

The hold function has, as is well known, a sinc-shaped frequency response. In the following we will, however, focus on the first D=>A block. The DAC is typically implemented using DAC elements: current sources (when the analog output is a current) or capacitors (when the analog output is a voltage). The simplest way to implement a binary DAC is to use binary weighted elements, as shown in fig.2.

[Figure 2 diagram: bits b0 ... bN-1 drive elements K0 ... KN-1 with weights 1 ... 2^(N-1); the element outputs are summed to give a(k)]

Figure 2: Binary weighted element DAC

Here, b_i is the i'th bit of the digital number d(k). The LSB b_0 is scaled by 1 and the MSB b_(N-1) by 2^(N-1). The DAC elements can, as mentioned, be capacitor elements (in which case one is used for b_0, two for b_1 and so forth) or scaled current sources. The nominal gain of the DAC, normalized to 1, will be given by:

$$ \hat{K} = \frac{\sum_{i=0}^{N-1} 2^i K_i}{\sum_{i=0}^{N-1} 2^i} \qquad (1) $$

The INL for any digital value $d(k) = b_0 + b_1 + \cdots + b_{N-1}$ will then be given by:

$$ INL(k) = \sum_{i=0}^{N-1} b_i 2^i K_i - \sum_{i=0}^{N-1} b_i 2^i \hat{K} = \sum_{i=0}^{N-1} b_i 2^i \left(K_i - \hat{K}\right) \qquad (2) $$

As can be seen, the MSB and near-MSB elements are very critical, and because of this the binary weighted DAC is not very useful for high resolution applications. An alternative way of looking at it is to view the scale factor 2^i as the size ratio between element i and element 0. Obviously, it is difficult to match two elements whose size ratio is large. For high resolution converters, the converter is therefore often implemented as a unit-element DAC, as shown in figure 3. The binary number is decoded to a thermometer code representation and fed to 2^N-1 equal unit elements.

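[Figure 3 diagram: bits b0 ... bN-1 enter a binary-to-thermometer decoder whose outputs t0 ... t_(2^N-1) drive unit elements K1 ... K_(2^N-1); the element outputs are summed to give a(k)]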

Figure 3: Unit element DAC.

In the unit-element DAC all elements are of equal (LSB) size. The nominal gain of the DAC will be given by:

$$ \hat{K} = \frac{\sum_{i=0}^{2^N-1} K_i}{2^N} \qquad (3) $$

The INL for any value d(k) will consequently be given by:

$$ INL(k) = \sum_{i=0}^{d(k)} K_i - d(k)\cdot\hat{K} = \sum_{i=0}^{2^N-1} t_i \left(K_i - \hat{K}\right) \qquad (4) $$

Now the relative mismatch is much less critical. If it is 1% for any given element, the DNL will only be 0.01 LSB.
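The two INL expressions (2) and (4) can be compared numerically. The following is only a sketch under assumed values: the function names, the 3-bit size and the 1% element errors are illustrative choices, not taken from the text, and the unit-element function simply switches on the first `code` elements.

```python
def inl_binary(code, gains):
    """INL of a binary-weighted DAC as in (2); gains[i] is K_i for bit i."""
    weights = [2 ** i for i in range(len(gains))]
    k_hat = sum(w * k for w, k in zip(weights, gains)) / sum(weights)
    bits = [(code >> i) & 1 for i in range(len(gains))]
    return sum(b * w * (k - k_hat) for b, w, k in zip(bits, weights, gains))


def inl_unit(code, gains):
    """INL of a unit-element DAC when the first `code` elements are on."""
    k_hat = sum(gains) / len(gains)
    return sum(k - k_hat for k in gains[:code])


# Hypothetical 3-bit example: one element is 1% too large in each DAC.
binary_gains = [1.0, 1.0, 1.01]      # the MSB element is off
unit_gains = [1.01] + [1.0] * 7      # one of eight unit elements is off

worst_binary = max(abs(inl_binary(c, binary_gains)) for c in range(8))
worst_unit = max(abs(inl_unit(c, unit_gains)) for c in range(9))
print(f"worst INL, binary weighted: {worst_binary:.4f} LSB")
print(f"worst INL, unit elements:   {worst_unit:.4f} LSB")
```

With these assumed values the worst-case INL of the binary weighted DAC occurs around the MSB transition and is roughly twice that of the unit-element DAC; the gap widens quickly with the number of bits.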

1.2 Dynamic element matching in oversampled unit element DACs

In an audio DAC, the number of bits is reduced from 20+ to far fewer, usually 3-6, by means of an oversampled delta-sigma modulator. The additional quantization noise is spectrally shaped by the modulator, and very high resolution can still be maintained in the baseband. However, errors in the D/A converter are not shaped by the loop, and its static performance must still be at the 20+ bit level. In a typical CMOS process, matching of individual capacitors or current sources is usually limited to about 0.1%, resulting in a static error far from this requirement. However, these errors can also be shaped digitally, an idea first proposed by Van De Plassche back in 1976 [1].

1.2.1 Randomisation

The simplest method of dynamic element matching is randomisation of the element selection, first implemented in 1989 [2]. The element usage then becomes input independent, and so does the INL, which is randomised (assuming a normal or Gaussian mismatch distribution with zero mean). In other words, the error is converted from distortion to noise. A random selector can be implemented with a butterfly network and a pseudorandom number generator, as shown in fig.4.

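[Figure 4 diagram: bits b0 ... bN-1 enter a binary-to-thermometer decoder; a PRNG-controlled scrambler routes the thermometer outputs t1 ... t_(2^N-1) to the unit elements K1 ... K_(2^N-1), whose outputs are summed to give a(k)]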

Figure 4: Element randomizer

Since the elements are selected randomly, the output will be a white noise source given by:

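$$
E\{\varepsilon^2\}
= E\left\{\left(\sum_{i=0}^{d} K_i - d\cdot\hat{K}\right)^2\right\}
= E\left\{\left(\sum_{i=0}^{d} K_i - d\cdot\frac{1}{2^N}\sum_{i=0}^{2^N-1} K_i\right)^2\right\}
= \left(\frac{d}{2^N}\right)\cdot\left(1-\frac{d}{2^N}\right) 2^N \sigma_K^2
= d\cdot\left(1-\frac{d}{2^N}\right)\sigma_K^2 \qquad (5)
$$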

If the noise is white and 1/(2·OSR) of the noise power falls within the baseband, a maximum signal swing of 2^N-1 peak-to-peak will result in a maximal Signal-to-Mismatch-Noise-Ratio of:

$$ SMNR = \log_2\left(\frac{2^N\cdot\sqrt{OSR}}{\sqrt{3}\,\sigma_\varepsilon}\right)\ \text{[bits]} \qquad (6) $$

It can easily be calculated that with a typical process mismatch of 0.1%, and a reasonable number of bits and a reasonable oversampling ratio (tens to hundreds), the resolution is still limited to the sub-16-bit range.
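The mismatch noise power of a random selection can be checked by brute force for a small array. The sketch below is illustrative only: the eight gain values and all names are assumptions. It enumerates every possible choice of d elements and compares the exact mean squared error with the expression d·(1 - d/2^N)·σ_K², which it matches up to the finite-population factor M/(M-1) for an array of M elements.

```python
from itertools import combinations


def selection_error_power(errors, d):
    """Exact mean squared error when d elements are picked at random.

    `errors` holds the element deviations K_i - K_hat (they sum to zero).
    Every d-element subset is enumerated, so this only suits small arrays.
    """
    subsets = list(combinations(errors, d))
    return sum(sum(s) ** 2 for s in subsets) / len(subsets)


# Hypothetical 8-element array with a mean gain of exactly 1.
gains = [1.002, 0.999, 1.001, 0.998, 1.000, 1.003, 0.997, 1.000]
k_hat = sum(gains) / len(gains)
errors = [k - k_hat for k in gains]
m = len(errors)
sigma2 = sum(e * e for e in errors) / m      # element error variance

for d in range(m + 1):
    exact = selection_error_power(errors, d)
    approx = d * (1 - d / m) * sigma2        # large-array expression
    print(d, f"{exact:.3e}", f"{approx:.3e}")
```

The error power is largest at mid-scale (d = M/2) and vanishes at d = 0 and d = M, where the selection is unique.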

1.2.2 Mismatch shaping and Data Weighted Averaging

An improvement over element randomization came with the introduction of mismatch-shaping techniques in the early 90s, with the Individual Level Averaging (ILA) algorithm from 1992 [3] and the Data Weighted Averaging (DWA) algorithm from 1995 [4] representing major breakthroughs. Both are based on the idea that if all elements contribute equally, the INL is cancelled, since the total error then always has the same expectation value. So, if all elements contribute equally over time, the error grows smaller for larger time windows, that is, for lower frequencies. The error is high-pass shaped, which is very desirable in oversampled systems. The DWA algorithm in particular became very popular and is widely used, and its success was instrumental in the resurrection of multi-bit DACs in audio applications. Following its introduction, more thorough analyses of DWA were published [5]-[6] that proved its first-order high-pass mismatch shaping property as well as its susceptibility to idle-tone behaviour. As a consequence of the latter, many randomized DWA algorithms have been published [7]-[9]. Extension of the DWA principle to arbitrary noise-shaping functions has also been shown [6], although none can be implemented in a practical way. However, a practically implementable restricted second-order DWA has been published with good results [10].

1.2.2.1 First order DWA

The principle of DWA is shown in fig. 5. If the first value is d(0)=2, elements t0..1 are selected; if d(1)=3, elements t2..4 are selected next. As can be seen from the 3-bit example to the right in the figure, each element is used twice over the period. Of course, it will normally converge much more slowly.

[Figure 5 diagram: unit elements K_(2^N-3) ... K_(2^N-1) driven by thermometer signals t_(2^N-3) ... t_(2^N-1)]

Figure 5: Data Weighted Averaging mismatch shaping

The shifting is controlled by a pointer that is incremented by the input value d(k) at each sample instant k. To be rotational it operates modulo N, where N here is the number of unit elements, i.e.:

$$ ptr(k) = \left(ptr(k-1) + d(k)\right) \bmod N \qquad (7) $$

The integral mismatch error as a function of the pointer is of course given by:

$$ IM(ptr) = \sum_{i=0}^{ptr} K_i - ptr\cdot\hat{K} \qquad (8) $$

It is then given from (7) that the mismatch as a function of the time instant k must be:

$$ \varepsilon(k) = IM[ptr(k)] - IM[ptr(k-1)] \;\Rightarrow\; \mathrm{E}(z) = \left(1 - z^{-1}\right) IM[PTR(z)] \qquad (9) $$

The resulting noise is in other words a first order high-pass filtered version of the integral mismatch error generated as a function of the pointer. This is again dependent on the input signal, as seen from (7). The integral mismatch error IM(ptr) is often considered white, but this is a simplification that could lead to erroneous performance estimation. Especially in the presence of certain DC-level input signals, IM(ptr) will be periodic (due to the modulo term) and produce significant tonal behaviour. This tonal behaviour is the biggest drawback of first order error shaping, just as tones are a major problem in first order quantization noise shaping (delta-sigma modulation). If the white noise assumption holds, the baseband resolution can be calculated very similarly to the randomization case. For maximum amplitude white input signals it is found to be:

$$ SMNR = \log_2\left(\frac{\left(2^N-1\right)\cdot OSR^{3/2}}{\pi\,\sigma_\varepsilon}\right)\ \text{[bits]} \qquad (10) $$
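The pointer recursion (7) and the first-order relation (9) can be illustrated with a few lines of code. This is a minimal sketch, not the implementation of [4]: the function names, the integer element errors and the input codes are assumptions, and the pointer is taken to index the next unused element.

```python
def dwa_select(codes, n_elements):
    """Yield the element indices used for each input code (first-order DWA)."""
    ptr = 0
    for d in codes:
        yield [(ptr + j) % n_elements for j in range(d)]
        ptr = (ptr + d) % n_elements         # pointer update, as in (7)


def dwa_errors(codes, errors):
    """Mismatch error of each sample; errors[i] is K_i - K_hat."""
    return [sum(errors[i] for i in sel)
            for sel in dwa_select(codes, len(errors))]


# Hypothetical 8-element array; the integer errors sum to zero.
errs = [3, -1, -2, 4, -4, 1, 0, -1]
codes = [2, 3, 5, 1, 7, 4, 6, 2, 3]

print(list(dwa_select([2, 3], 8)))   # elements 0..1, then elements 2..4
print(dwa_errors(codes, errs))
```

Because the element errors sum to zero over a full rotation, the running sum of the sample errors always equals the partial sum of the element errors below the current pointer. The accumulated error therefore stays bounded, which is the time-domain view of first-order high-pass shaping.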

The tones can be estimated for DC inputs. If the input signal is a DC value X, the errors are rotated at a fixed speed, with one rotation being completed every 2^N/r samples, where r is the greatest common divisor of X and 2^N. The output error will then be a periodic sequence:

$$ \sum_{i=0}^{X-1}\varepsilon_i,\ \sum_{i=X}^{2X-1}\varepsilon_i,\ \ldots,\ \sum_{i=2^N-X}^{2^N-1}\varepsilon_i,\ \sum_{i=0}^{X-1}\varepsilon_i,\ \ldots \qquad (11) $$

The tonal behaviour is addressed similarly to tones in a delta-sigma modulator, by ways of dithering the selection process to break up the periodic pattern [7]-[9].
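The repetition period for a DC input can be checked directly from the pointer sequence. The sketch below assumes an array of 8 elements and hypothetical function names; it only illustrates the period, not the dithering schemes of [7]-[9].

```python
from math import gcd


def dwa_dc_period(x, n_elements):
    """Samples before the DWA pointer pattern repeats for a DC input x."""
    return n_elements // gcd(x, n_elements)


def pointer_sequence(x, n_elements, n_samples):
    """Pointer values produced by a constant input x."""
    ptr, seq = 0, []
    for _ in range(n_samples):
        ptr = (ptr + x) % n_elements
        seq.append(ptr)
    return seq


for x in (3, 4, 6):
    print(x, dwa_dc_period(x, 8), pointer_sequence(x, 8, 8))
```

An input that shares a large factor with the array size (here X=4) gives a very short period and therefore a strong, low-order tone pattern.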

1.2.2.2 Second order DWA

The approach in section 1.2.2.1 can be generalized to encompass second order or arbitrary noise shaping. We want to achieve:

$$ \mathrm{E}(z) = H(z)\cdot IM[PTR(z)] \qquad (12) $$

where H(z) is a general noise shaping function:

$$ H(z) = \sum_{j=0}^{m} a_j z^{-j} \qquad (13) $$

In the time domain, (12) simply becomes:

$$ \varepsilon(k) = \sum_{j=0}^{m} a_j\cdot IM[ptr(k-j)] \qquad (14) $$

For H(z) = 1 - z^-1, we have already shown that this can be achieved through the single step ptr(k) = (ptr(k-1) + d(k)) mod N. For H(z) = (1 - z^-1)^2 we get:

$$ \varepsilon(k) = IM[ptr(k)] - 2\cdot IM[ptr(k-1)] + IM[ptr(k-2)] \qquad (15) $$

To achieve this, the elements up to the current pointer value must contribute with weight "1", the elements up to the previous pointer value must contribute with weight "-2", and the elements up to the pointer value two instants ago must contribute with weight "1". Since these are the same elements, each element's weight at any time instant is given by the sum of the three terms given by the pointer recursion. In the first order case the weight will always be "-1" or "1" (one and zero binary), and it can be updated through a single assignment of each element. In the second order case the sum can grow to larger integers. Hence, to enable an element to contribute proportionally, each element must be assigned several times in each sample instant, making practical implementation impossible unless the DAC is run at several times its input sampling speed. However, a modified version of the second order DWA, called the Restricted Second-order DWA or R2DWA, has been implemented successfully. R2DWA updates the pointer like second order DWA, but puts a restriction on the element selection that ensures that no element is assigned more than once. This principle is shown in fig.6.

Figure 6: Left: Second order DWA. Right: Restricted second order DWA.

R2DWA can be implemented by updating each row in a swapper array with a second-order delta-sigma modulator. The implementation is detailed in [10].
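The per-element weights implied by (15) can be tabulated to see why unrestricted second order DWA needs multiple assignments per sample. This is only a sketch under simplifying assumptions: pointer wrap-around is ignored, an element i counts as "up to" a pointer p when i < p, and the function name and pointer values are hypothetical.

```python
def second_order_weights(ptr_k, ptr_k1, ptr_k2, n_elements):
    """Weight of each element implied by (15), ignoring wrap-around.

    ptr_k, ptr_k1 and ptr_k2 are the pointer values at instants
    k, k-1 and k-2.
    """
    def below(p):
        return [1 if i < p else 0 for i in range(n_elements)]

    now, prev, prev2 = below(ptr_k), below(ptr_k1), below(ptr_k2)
    return [a - 2 * b + c for a, b, c in zip(now, prev, prev2)]


print(second_order_weights(6, 3, 1, 8))   # weights stay within -1..1
print(second_order_weights(2, 5, 0, 8))   # some elements need weight -2
```

In the second call, elements 2 to 4 would have to contribute twice with negative sign within one sample, which a single on/off assignment per element cannot provide.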

1.2.3 The Galton Tree Structure Dynamic Element Matching

In addition to the very intuitive DWA approach, several other algorithms for dynamic element matching have been introduced in recent years. These algorithms are often more difficult to understand and appear more complex, but some have proved to be much simpler to implement in hardware.

1.2.3.1 The tree structure and mode of operation

A very hardware-efficient realization of dynamic element matching is the tree structure proposed by Galton [11]. Here, the decoder is partitioned into layers, ending with the 1-bit two-level data necessary to control a 1-bit DAC. This is shown in fig.7. The control units denoted S are called switching blocks.

[Figure 7 diagram: tree of switching blocks S feeding the unit elements K_(2^N-4) ... K_(2^N-1)]

Figure 7: The Galton tree structure

To preserve the signal integrity, the structure must be number conserving, i.e. the sum of the two outputs from a switching block must always equal its input. This is called the number conservation rule. In addition, the input to any switching block must be equal to or smaller than the total number of DAC elements it connects a path to. E.g. the input to the blocks in layer two can never be larger than four. As such, the number of bits can be reduced from layer to layer, as shown in fig.7. Since an N-bit binary number has the range [0,2^N-1], the inputs are encoded with an extra LSB to have the range [0,2^N]. This is called extra-LSB encoding. The implementation of the switching block, in conformance with these rules, is shown in fig.8.

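[Figure 8 diagram: input X and switching sequence s; two adders form X+s and X-s (s through a -1 gain), each scaled by 0.5 to give Y1 and Y2]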

Figure 8: The basic switching block.

From fig.8, it is clear that:

$$ Y_1 = \tfrac{1}{2}\left(X + S\right), \qquad Y_2 = \tfrac{1}{2}\left(X - S\right) \qquad (16) $$

It is easy to see that X = Y1 + Y2, so the number conservation rule holds, and also that Y1 - Y2 = S. The signal S is called the switching sequence and controls the partition of the input into the two output signals; hence the dynamic element matching is encoded in S. It can be shown using recursion for the whole tree (see [11]) that:

$$ A_{out}[n] = \alpha\cdot D_{in}[n] + \beta + e[n] \qquad (17) $$

where β and α are constant offset and gain errors, respectively. The term e[n] is signal dependent and denotes the nonlinear error resulting from mismatch. Furthermore, it can be shown [11] that:

$$ e[n] = \sum_{k=1}^{b}\sum_{r=1}^{2^{b-k}} \Delta_{k,r}\cdot s_{k,r}[n] \qquad (18) $$

where k denotes the layer (b layers in total), r denotes the position of the switching block in the layer, Δ_k,r is the total mismatch error associated with switching block r in layer k and s_k,r is the switching sequence produced by switching block r in layer k. This means that if each switching block produces an L'th order shaped sequence s_k,r, uncorrelated with the other switching sequences, then the total error e will be shaped the same way. Consequently, one should be able to produce the switching sequence using any free-running delta-sigma modulator and the error will be shaped in the same way. However, there are some other requirements on the switching sequence that complicate this. For the outputs to be integers, it is an obvious requirement that both (X+S) and (X-S) must always be even numbers. This puts a restriction on S, namely that:

$$ S[n] = \begin{cases} \text{even} & \text{if } x[n] \text{ is even} \\ \text{odd} & \text{if } x[n] \text{ is odd} \end{cases} \qquad (19) $$

Furthermore, to ensure that Y1 and Y2 are both non-negative and no larger than the total number of elements they are connected to (Y1,2 ≤ 2^(k-1) for any switching block in layer k), it is another requirement that:

$$ \left|S[n]\right| \le \min\left\{x[n],\ 2^k - x[n]\right\} \qquad (20) $$

The challenge in the tree structure is consequently to generate a sequence that is both shaped and at the same time satisfies these requirements. As will be shown, this is indeed feasible, although higher order shaped sequences will saturate significantly.
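The switching block relation (16) and the two constraints (19) and (20) can be written as a small checking function. This is a sketch with a hypothetical function name; the argument `layer` corresponds to k in the text, so a block in layer k accepts inputs in the range 0 to 2^k.

```python
def switching_block(x, s, layer):
    """Split x into (y1, y2) as in (16), enforcing (19) and (20)."""
    if not 0 <= x <= 2 ** layer:
        raise ValueError("input out of range for this layer")
    if (x - s) % 2 != 0:
        raise ValueError("S must have the same parity as x, see (19)")
    if abs(s) > min(x, 2 ** layer - x):
        raise ValueError("S violates the range limit, see (20)")
    return (x + s) // 2, (x - s) // 2


print(switching_block(5, 1, 3))    # (3, 2)
print(switching_block(5, -1, 3))   # (2, 3)
print(switching_block(4, 0, 3))    # (2, 2)
```

Any S that passes both checks gives two non-negative integer outputs that sum to x and fit the next layer down.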

1.2.3.2 The switching sequence generator

To efficiently generate a shaped sequence that satisfies the requirements above, the sequence generator is implemented as a free-running, ternary output delta-sigma modulator. The fact that the switching sequence is ternary means that the requirement in (20) is automatically satisfied. To satisfy (19), the ternary output is generated by a binary quantizer (±1 output) and a multiplier, as shown in fig.9. Since the LSB of X is 0b when X is even:

$$ S[n] = \begin{cases} 0 & \text{if } x[n] \text{ is even} \\ \pm 1 & \text{if } x[n] \text{ is odd} \end{cases} \qquad (21) $$

As a consequence, both (19) and (20) are satisfied. The filter H(z) decides the shaping of the output sequence and is in the first order case a simple discrete-time integrator.

[Figure 9 diagram: loop containing the integrator 1/(1 - z^-1)]

Figure 9: Conceptual schematic of the first order switching sequence generator.

In this first order structure, it can easily be verified that the integrator output will always be 0 or ±1, so the quantizer will never overload. This means it will always work as an ideal first-order modulator and provide ideal first-order shaping. It can also be implemented in a very simple manner. For higher-order switching generators, implementation is somewhat more complex, which will be explained shortly.
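A behavioural model of the tree with first-order switching sequence generators can be sketched as follows. The class and function names are hypothetical, no dither is used, and the `state` variable is the running sum of S (the sign convention of the integrator in fig.9 may differ); it is a sketch of the principle, not the circuit of [11].

```python
class FirstOrderSwitch:
    """One switching block with a first-order switching-sequence generator."""

    def __init__(self):
        self.state = 0    # running sum of S, kept within {-1, 0, +1}
        self.last = 1     # direction used when the state is 0 (no dither)

    def split(self, x):
        if x % 2 == 0:
            s = 0                      # (21): even input, S = 0
        elif self.state == 0:
            s = self.last              # tie: continue in the last direction
        else:
            s = -self.state            # pull the running sum back to 0
        self.state += s
        if s != 0:
            self.last = s
        return (x + s) // 2, (x - s) // 2    # (16)


def build_tree(n_layers):
    """Blocks from the top layer (one block) down to the element layer."""
    return [[FirstOrderSwitch() for _ in range(2 ** depth)]
            for depth in range(n_layers)]


def tree_encode(x, tree):
    """Return the 1-bit controls of the 2**n_layers unit elements."""
    values = [x]
    for layer in tree:
        values = [y for block, v in zip(layer, values)
                  for y in block.split(v)]
    return values


tree = build_tree(3)
for x in (3, 5, 8, 0, 3):
    print(x, tree_encode(x, tree))
```

Each output vector contains exactly x ones, and because every running sum of S stays within ±1, each switching sequence is the first difference of a bounded sequence, i.e. first-order shaped.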

1.2.3.3 Hardware efficient 1st order switching network

A simple implementation of the described switching block structure is achieved through "extra-LSB encoding" of the data sequence. This means the binary code is modified to have two LSBs, both with unity weight. If the number is odd, the two LSBs are necessarily different; if the number is even, they are equal. This simplifies the logic, as the switching sequence can be added without any adder. If the number is even, the switching sequence output must be zero and the switching block should only perform a right-shift division of X for both Y1 and Y2. One of the LSBs from X must be added as the extra LSB of Y1,2, since the two LSBs being 11b means they constitute a value of 2 and 1 should be added to each output (see (16)). If the number is odd, the switching sequence is non-zero and must be added to Y1 and subtracted from Y2. When using the extra-LSB encoding, S can be added to or subtracted from the outputs (see fig.8) without an adder. By discarding the two LSBs of X, which are now 01b or 10b, the output value is reduced by 0.5. Setting the extra LSB of the output to 0b thus gives Y = ⌊X/2⌋ and setting the extra LSB to 1b gives Y = ⌈X/2⌉. Thus, setting the extra LSB of Y1 to 1b if S=1 and to 0b if S=-1 is equivalent to combining (16) and (21). The inversion of S for Y2 (see fig.8) is simply done through a binary inverter.

Figure 10: Switching network logic with extra LSB encoding.
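The equivalence between the floor/ceiling view and (16) for odd inputs can be verified exhaustively for small values. This sketch uses a hypothetical function name and works on plain integers; it does not model the bit-level extra-LSB format itself.

```python
def split_odd(x, s):
    """Outputs for odd x and s = +1 or -1, using only floor and ceiling."""
    low, high = x // 2, x // 2 + 1     # floor(x/2) and ceil(x/2) for odd x
    return (high, low) if s == 1 else (low, high)


for x in (1, 3, 5, 7):
    for s in (1, -1):
        assert split_odd(x, s) == ((x + s) // 2, (x - s) // 2)
print(split_odd(5, 1), split_odd(5, -1))   # (3, 2) (2, 3)
```

Since only the choice between rounding down and rounding up depends on S, a single extra output bit (and its inverse for Y2) replaces the adders of fig.8.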

It is easy to confirm that fig.10 is equivalent to fig.8 with number conservation and a ternary switching sequence, and that the arithmetic has been realized using very simple digital logic. It will also be shown that the sequence generator for low order can be realized very elegantly. We first consider the first-order modulator in fig.9. The integrator can be seen as a ternary state machine. If X is even, the modulator output S is zero and the integrator state will not change (see fig.9). This means the state machine can be stopped. If X is odd, the integrator will follow the state sequence [1,0,-1,0,1,0,-1...]. Whenever the integrator state is -1, the output must be 0b; when the integrator state is 1, the output must be 1b. The integrator must change state from 1 to -1 in two cycles. This leads to the simple implementation shown in fig.11.

Figure 11: Simple implementation of sequence generator.

If the flip-flop outputs Q1Q2 are 11b, the integrator state is 1. If they are 00b, the state is -1, and if they are either 01b or 10b, the state is 0. This way, the binary quantizer from fig.9 has no offset (whether it rounds zero up or down depends on the last value). To avoid tonal behaviour, which will translate to tones in the DAC error (18), the filter can be dithered. If the integrator output is 0, one can randomize which way the binary quantizer flips (1 or -1) by using a binary random input, as shown in fig.12. Here, if the integrator state is 0, the output is determined from the random sequence.

Figure 12: Dithered first order sequence generator.

1.2.3.4 The second order switching generator.

The second order switching generator is described in [11] and detailed in [12]-[13]. Since the integrator output is no longer limited to the state values 0 or ±1, implementation is significantly more complex. The basic structure is a two-integrator expansion of the first order structure shown previously in fig.9. However, in the second order structure the output from the second integrator can grow to any value, and if it saturates the circuit can become unstable. To avoid the danger of instability even with a limited size second integrator, a feed-forward coefficient α<1 is introduced. The conceptual block schematic is shown in fig.13.

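[Figure 13 diagram: two cascaded integrators 1/(1 - z^-1), a feed-forward coefficient α, a summing node, a -1 feedback gain, the input "LSB of X" and the output S]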

Figure 13: Conceptual schematic, second order switching generator.

The design can be simplified by recognising that, if the second integrator is designed to saturate at some value M, a gain of α=1/(M+1) can be achieved by allowing the first integrator to override the second one whenever its output is nonzero. In digital logic, the dithered version of the second order generator in fig.13 can then be implemented as shown in fig.14.

Figure 14: Efficient hardware implementation of second order switching generator

To explain the way this circuit operates, let us first look at the two integrators. The state of the first one will never grow beyond ±1, by the same reasoning as for the first order integrator, and it can be implemented as a 2-bit state machine. The second integrator output can in theory grow to any integer value. As is seen, if the first integrator is not zero (the >0 output is 1b), then this integrator overrides the second one and Sb is equal to its sign. It then also disables the second integrator. In this case the circuit works exactly like the first order structure in fig.11 or fig.12. If the first integrator is zero, the second integrator takes over and its sign is connected to the output Sb. If this is also at the zero state, the output is determined from a dither sequence as for the first order version. Since the second integrator saturates at some value M, and the first order integrator in this case takes over operation, the gain α and the resulting noise shaping of the switching sequence S are determined by the size of the second integrator. If its saturating range is infinite, the gain α is zero and the circuit provides the optimal second order (1 - z^-1)^2 spectral shaping of S. If the saturating range is 0, the gain α is infinite and the circuit falls back to first-order (1 - z^-1) spectral shaping. Any value in between gives a reduced, or more conservative, second order shaping. It has been shown in [12]-[13] that if the second integrator is only 2-bit (M=1), the baseband noise floor is about 10dB lower than in the first-order dithered case. If the second integrator is 6-bit (M=32) or more, the shaping is close to second order. Of course a larger up/down-counter means more hardware, and there are 2^N-1 switching generators for an N-bit tree-structure DEM, so the trade-off between hardware complexity and required performance should be carefully evaluated. A 2-bit second integrator gives a quite cheap performance boost, while more than 6 bits is rarely needed unless perfect (1 - z^-1)^2 shaping of mismatch errors is necessary to provide sufficient baseband SNDR. Suggestions for improvements to the Galton tree structure to increase stability for higher order shaping have also been shown in recent publications [14]-[15].

1.2.4 Other structures.

In addition to the DWA structure and the tree structure, there are to the author's knowledge two other fundamental published approaches to dynamic element matching. One is the vector feedback approach [16]-[18], which has very good performance even for second order mismatch shaping, but suffers from high hardware complexity and is not feasible to implement for a large number of quantization levels. The butterfly shuffler method [19]-[20], on the other hand, is computationally simple but limited to first order mismatch shaping. These methods will not be treated in detail in this document, and interested readers are referred to the references.

1.3 Dynamic element matching for large quantizers

As we have seen in the previous sections, the number of elements increases exponentially with the number of bits in the output signal. This means that both routing and DEM complexity also increase exponentially with the number of bits. Generally, most DEM DACs are inefficient at more than 4-5 bits. Of course, DEM cannot be applied to the binary weighted DAC, as it has only one unique element selection per output code. However, the complexity can be reduced through segmentation. This is shown in its basic form in fig.15.

Figure 15: Segmented DEM scrambler

Here, the lowest M bits are subjected to normal dynamic element matching (e.g. DWA) and run through a 2^M-element unit DAC with weight 1. The remaining N-M bits are also run through a DEM scrambler before being converted by a separate equal-element DAC. The MSB DAC is of course weighted by 2^M to make the digital sum correspond to the modulator output code. If N=8, this would normally require 256 elements in a unit element DAC. By setting M=4, the LSB DAC can be made with 16 elements and the MSB DAC also with 16 elements, each of these being 16 times bigger than the elements of the LSB DAC. The algorithm complexity has thus been reduced from 256 elements to 32. However, the weighting is still realized in analog, i.e. by matching of the LSB and MSB DACs. This will result in a gain error between the DACs that, in the configuration shown above, is not cancelled in any way. This will in turn result in significant distortion. Another way to view this is to look at the segmentation in the signal domain. Picking out the MSBs is equivalent to truncation, and it is the truncation error that is fed to the lower DAC. If the coarse and fine DACs are ideally matched, the output will sum perfectly and the truncation error will not leak through; no additional error is made.

Figure 16: Segmented DEM scrambler with segmentation in the signal domain.

If the matching is not ideal, however, the output will not sum correctly and the truncation error e will leak through. We can assume that the fine DAC is weighted with a non-ideal factor α instead of the ideal factor 1, in which case:

$$ A_{out} = X + \left(\alpha - 1\right)\cdot e \qquad (22) $$

If there is a 1% gain error, e.g. α=0.99, then 1% of the truncation noise from an N-M bit truncation leaks to the output. If, in the above figure, N=8 and M=4, then 1% of a 4-bit truncation error will be visible at the output, which will clearly lead to much too high distortion for 20 or so ENOBs. This problem can however be minimized by introducing an additional delta-sigma modulator, as shown in fig.17. This method was first proposed by Adams [20].
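The leakage described by (22) follows directly from splitting the code into a truncated part and a remainder. A minimal sketch, with a hypothetical function name and an assumed 8-bit code split at M=4:

```python
def segmented_output(x, m_bits, alpha):
    """Output of a two-segment DAC whose fine path has relative gain alpha.

    Returns the analog output and the truncation error e fed to the fine DAC.
    """
    e = x % (2 ** m_bits)      # truncation error, the M LSBs
    coarse = x - e             # MSB part, a multiple of 2**m_bits
    return coarse + alpha * e, e


x = 0xB7                       # 183, an arbitrary 8-bit code
out, e = segmented_output(x, 4, 0.99)
print(out - x, (0.99 - 1) * e)   # both are the leaked error (alpha - 1) * e
```

With α=1 the two parts sum back to x exactly; any other α leaves a residue proportional to the (signal dependent) truncation error.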

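[Figure 17 diagram: the N-bit input feeds a delta-sigma modulator that outputs the N-M MSBs (x+eDSM); the difference -eDSM forms the M+K LSBs. The MSBs pass through an N-M bit DEM and a 2^(N-M)-element DAC with weight W=2^M, the LSBs through an M+K bit DEM and a 2^(M+K)-element DAC with weight W=1; the two DAC outputs are summed to give Aout]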

Figure 17: Segmented DEM scrambler with shaping of gain error

In the case seen in fig.17, a fine DAC weighting of α instead of 1 will now lead to a combined output given by:

$$ A_{out} = X + \left(\alpha - 1\right)\cdot e_{DSM} \qquad (23) $$

Now, the error that leaks through is the shaped quantization error from a delta-sigma modulator, which contains very little energy in the baseband. The gain-error distortion has been shaped out of the band of interest. The form of the shaping is of course determined completely by the second modulator, which divides the output from the main delta-sigma quantizer into a coarse and a fine part. If the mismatch is 1%, 1% of the additional error generated by the second modulator will leak through. Thus, the leaked error is at worst 100 times smaller than the error of the second modulator.

If the error is signal-dependent, for instance if the modulator is first-order and undithered, the modulator will produce some baseband distortion and idle tones that leak through, and a higher-order modulator appears more desirable. However, in a higher order modulator the peak DSM error is significantly larger than the truncation error and the fine DAC would need many additional bits (in fig.17, K is large). Then the analog overhead will be greater and the digital complexity reduction will be lost.

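[Figure 18 diagram: the N-bit input feeds a first-order delta-sigma modulator that outputs the N-M MSBs (x+eDSM); the difference -eDSM forms the M+1 LSBs. The MSBs pass through an N-M bit DEM and a 2^(N-M)-element DAC with weight W=2^M, the LSBs through an M+1 bit DEM and a 2^(M+1)-element DAC with weight W=1; the two DAC outputs are summed to give Aout]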

Figure 18: Segmented DEM scrambler with first order shaping of gain error

If the STF is one and the NTF equals (1 - z^-1), the peak DSM error is on the other hand only twice the peak quantization error and the lower DAC needs only one extra bit, as seen in fig.18. Using a first-order modulator is thus the preferred choice in published literature [20]-[21], but matching must be good enough that the leakage does not significantly compromise performance. With 1% scale error, the output error caused by the undithered first-order modulator will be 100 times smaller than its quantization error, so the expected output distortion can easily be estimated. Now, as the observant reader might have figured out, there is no reason you can't do this again with the error signal. Figure 19 shows a two-step segmentation using three separate first-order modulators. Here a 13-bit word is reduced to four 4-bit terms and the DEM complexity is reduced from 2^13 to 4·2^4.

Figure 19: Extension to two-step DAC segmentation.
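A first-order segmenting modulator with STF = 1 and NTF = 1 - z^-1 can be sketched as an error-feedback requantizer. The function name, the truncating quantizer and the sample values below are assumptions made for illustration; this is not the circuit of [20] or [21].

```python
def first_order_split(samples, m_bits):
    """Split each sample into a coarse and a fine part.

    The coarse part is a multiple of 2**m_bits produced by a first-order
    error-feedback modulator; the fine part is the difference x - coarse.
    """
    step = 2 ** m_bits
    e_prev = 0
    parts = []
    for x in samples:
        v = x - e_prev                 # feed back the previous error
        coarse = (v // step) * step    # truncating quantizer
        e_prev = coarse - v            # quantizer error, in (-step, 0]
        parts.append((coarse, x - coarse))
    return parts


samples = [183, 77, 5, 250, 128, 31, 64, 199]
for coarse, fine in first_order_split(samples, 4):
    print(coarse, fine)
```

The fine part equals the negated first difference of the quantizer error, so it spans just under ±2^M (one bit more than a plain truncation error) while its running sum stays bounded, which is the first-order shaping that moves the leaked gain error out of the baseband.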

As shown in the thesis of Steensgaard [21], this can be extended until you have a row of three-level DACs, each with a very simple DEM cell and a weight twice that of the next one. This is conceptually very similar to the pipeline ADC with 1.5-bit stages, but with delta-sigma shaping of the gain error between stages. The DEM complexity now increases proportionally to N instead of 2^N as for the regular DEM, which makes this very suitable for large quantizers, at the cost of some analog overhead. The latter can be seen by noting that in fig.19 the analog area in terms of unit elements has increased from 8192 to 9360.

Not surprisingly, since the switching sequence generator in the tree-structure DEM is very similar to a delta-sigma modulator, the same approach can be applied to the Galton tree structure which can be transformed to a segmenting encoder by a slight change in the switching block networks [22]-[23]. The general structure is shown in fig.20. Of course, the degree of segmentation is adjustable like in the Steensgaard approach. In fig.20, two bits are not segmented, while the rest are.

Figure 20: The general segmented Galton Tree structure.

1.4 Element mismatch cancellation for low sampling rate material

As an alternative to the mismatch-shaping approach, one could imagine a scheme where all the elements contribute equally regardless of the input value. Then the mismatch, as a combination of each element's individual mismatch, would only result in a constant gain error. The mismatch-induced noise would not be shaped, but rather completely cancelled from sample to sample. Of course, such a concept requires that each element is modulated, so that the combined output is still proportional to the input signal. Such a scheme has indeed been suggested, based on PWM (Pulse Width Modulation) of the unit elements [25], with extremely good results for audio bandwidths. The elements are PWM modulated in a special way to give a mismatch-cancelled, ISI and jitter insensitive, non-PWM output stream. The main drawback is, as we will soon see, that this is not possible to implement for high sampling rate material. A more traditional PWM approach with predistortion and DEM is also shown in [26] with good results, though its implementation specifics will not be covered in this paper.

1.4.1 Shifted and rotated PWM modulation of unit elements.

The algorithm suggested by Reefman [25] is a very elegant and simple-to-implement solution based on a rotational PWM scheme. First, we will look at a straightforward PWM modulation of all elements, shown for an example 3-bit continuous time DAC in fig.21.

[Figure 21 diagram: PWM waveforms of the unit elements D and their sum A = ΣD]

Figure 21: Straightforward PWM modulation of each unit element.

In the case of straightforward PWM modulation of all unit elements, the total combined output is also PWM modulated. This is fully equivalent to direct Uniform PWM (UPWM) modulation of the input signal, which is well known to introduce distortion [24]. Also, although the output is ISI-free due to equal switching in each sample regardless of input value (the PWM can be seen as a sort of RTZ code), the output transitions are very large and the output is very susceptible to dynamic errors from clock jitter. These shortcomings can however be overcome by skewing and rotating each unit element, as shown in fig. 22.

[Figure 22 diagram: PWM waveforms of elements D0 to D7 over time slots 1 to 8 for inputs x[n]=3, x[n+1]=4, x[n+2]=5, and their sum A = ΣD]

Figure 22: Shifted and rotated PWM scheme [22]

Now, we see that the combined output is no longer PWM-modulated, it’s recombined to a linear PCM representation of the digital input amplitude. This means that the UPWM distortion is eliminated and the mismatch is completely cancelled by every element contributing equally regardless of input value. In addition, it is seen that each element switches on and off exactly once per sample, so none have any ISI. This means that the recombined output is also completely ISI free. The output is also not more susceptible to jitter distortion than any straightforward mismatch-shaping encoder. To summarize, the benefits of this algorithm include:

• Complete elimination of distortion due to static element mismatch.¹
• Complete elimination of ISI without RTZ or similar dedicated schemes.
• No increase in jitter sensitivity.

These are major advantages, and indeed the converter reported in [23] has exceptional performance, achieving 115dB SNDR in a 0.18µm process with very low power consumption. However, there are some disadvantages. The first is obviously the usable range. Due to the time resolution of the PWM modulation, the maximum clock frequency used to align the element PWM pulses will be given by:

$$f_{PWM} = f_s \cdot OSR \cdot 2^{N_{bits}} \qquad (24)$$

This means that for high-resolution devices, relying on either high oversampling or a high number of bits, it will not be feasible to implement for bandwidths in the MHz-range. However, for audio sample rates (typically 48kHz or 96kHz), you can still achieve high inband SQNR. The second major disadvantage is illustrated in fig.23. We can see that if the input jumps more than 1 LSB from one sample to the next, the elements it skips (in this case D4) are switched on and off more often than the others. This means the ISI-eliminating property is lost.

1 Actually, mismatch in an element will produce a very small UPWM-type distortion contribution as shown in [22]. However this is a very low order effect that is not significant for any reasonable mismatch levels.

Figure 23: Illustration of ISI elimination being lost if |x[n]-x[n-1]|>1

Since ISI is a dynamic error source that can create severe distortion, the converter published in [25] is designed so that the output of the delta-sigma modulator is not allowed to change by more than 1 LSB per sample. This mandates a very conservative NTF, which reduces inband SQNR. To guarantee the property $x[n] = x[n-1] \pm 1$, the converter in [25] additionally uses a hard slew-limiter inside the delta-sigma loop. Simulations suggest that for the delta-sigma loop to remain stable with this slew-limiter, it is necessary that $\|NTF\|_\infty \ll 1.5$, perhaps as low as 1.2.
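The idea of a hard slew-limiter inside the loop can be illustrated with a minimal sketch. It is not the modulator of [25]: the first-order error-feedback structure, the 9-level quantizer (codes 0 to 8 for eight unit elements), the test input and the function names are all assumptions chosen for brevity.

```python
import math

def slew_limited_quantizer(v, prev, levels):
    """Round v to the nearest code, then clamp it to within 1 LSB of prev."""
    q = min(max(round(v), 0), levels - 1)
    return min(max(q, prev - 1), prev + 1)

def modulate(samples, levels=9, start=4):
    """First-order error-feedback modulator (NTF = 1 - z^-1) with a slew-limited quantizer."""
    out, prev, err = [], start, 0.0
    for u in samples:
        v = u - err                               # subtract the previous total error
        y = slew_limited_quantizer(v, prev, levels)
        err = y - v                               # includes the slew-limiting error
        out.append(y)
        prev = y
    return out

# Slow sine around mid-scale, in units of LSB (illustrative input only).
u = [4 + 3 * math.sin(2 * math.pi * n / 256) for n in range(1024)]
y = modulate(u)
steps = [abs(b - a) for a, b in zip(y, y[1:])]
print(max(steps))  # never exceeds 1, by construction of the limiter
```

Because the limiter sits inside the loop, the error it introduces is fed back and shaped together with the quantization error. The price is that the loop must tolerate this extra error, which is why a conservative NTF is needed.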

With such an NTF, an oversampling ratio of 128, a 48kHz input sampling rate and 6-bit quantization give around 130dB inband SQNR, which is sufficient. However, the maximum clock frequency dictated by (24) is then almost 400MHz. In [25], 128x oversampling and a 5-bit ∆Σ modulator are used and around 127dB SQNR is reported. To achieve SQNR in the 140dB range, a clock speed of several GHz would be necessary, so the attainable performance and bandwidth are limited by this requirement. For audio applications, however, the scheme has provided the best performance reported to date [25].
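The clock figure quoted above follows directly from (24). The helper below is a hypothetical convenience function, and the 5-bit line assumes the same 48kHz input rate, which the text does not state for [25].

```python
def pwm_clock_hz(fs_hz, osr, n_bits):
    """Equation (24): f_PWM = f_s * OSR * 2**N_bits."""
    return fs_hz * osr * 2 ** n_bits

print(pwm_clock_hz(48_000, 128, 6) / 1e6)  # 393.216 (MHz), the "almost 400MHz" case
print(pwm_clock_hz(48_000, 128, 5) / 1e6)  # 196.608 (MHz), assuming the same 48kHz rate
```

Each additional quantizer bit, or each doubling of the oversampling ratio, doubles the required PWM clock, which is why the scheme does not scale to bandwidths in the MHz range.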

[1]: R.J.Van De Plassche, "Dynamic element matching for high accuracy monolithic DA converters", IEEE Trans. Circuits and Systems, SC-11, pp. 795-800, 12/76.

[2]: L.R.Carley, "A noise shaping coder topology for 15+ bit converters", IEEE J. Solid State Circuits, SC-24, pp 267-273, 04/89.

[3]: B.H. Leung and S. Sutarja: "Multibit Sigma-Delta A/D Converter Incorporating A Novel Class of Dynamic Element Matching Techniques," IEEE Trans. Circuits and Systems-II: Analog and Digital Signal Processing, vol. 39, No. 1 (1992), pp. 35-51.

[4]: R.T.Baird, T.S.Fiez, "Linearity enhancement of multi-bit ∆-Σ A/D and D/A converters using data weighted averaging", IEEE Trans. Circuits & Systems II, CASII-42, pp. 753-762. 12/95.

[5]: O.J.A.P. Nys, R.K. Henderson: “An analysis of dynamic element matching techniques in sigma-delta modulation”, Proc. IEEE Int. Symp. Circuits and Systems, ISCAS '96., 'Connecting the World', vol. 1 , 12-15 May 1996

[6]: R. K. Henderson and O. Nys, "Dynamic element matching techniques with arbitrary noise shaping function”, Proc. IEEE Int. Symp. Circuits and Systems, ISCAS’96, May 1996, pp. 293--296.

[7]: K.D. Chen and T.H. Kuo: “An Improved Technique for Reducing Baseband Tones in Sigma-Delta Employing Data Weighted Averaging Algorithms Without Adding Dither”, IEEE Trans. Circuits and Systems II, vol.46, no.1, pp 53-68, 1999.

[8]: M. Vadipour: “Techniques for Preventing Tonal Behaviour of Data Weighted Averaging Algorithm in ∆Σ-Modulators”, IEEE Trans. Circuits and Systems II, vol.47, no.11, pp 1137-1144, Nov.2000.

[9]: A.A. Hamoui and K. Martin, "Linearity enhancement of multibit ∆Σ modulators using pseudo data-weighted averaging," Proc. IEEE International Symp. Circuits and Systems ISCAS’02, May 2002, pp. III 285-288.

[10]: Xue-Mei Gong; Gaalaas, E.; Alexander, M.; Hester, D.; Walburger, E.; Bian, J.: “A 120 dB multi-bit SC audio DAC with second-order noise shaping”, ISSCC Digest of Technical Papers. ISSCC. 2000, 7-9 Feb. 2000, Page(s):344 - 345, 469

[11]: I. Galton, "Spectral shaping of circuit errors in digital-to-analog converters," IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing, vol. 44, no. 10, pp. 808-817, Nov., 1997.

[12]: J. Welz, I. Galton, E. Fogleman, "Simplified logic for first-order and second-order mismatch-shaping digital-to-analog converters," IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing, vol. 48, no. 11, pp. 1014-1028, Nov. 2001.

[13]: E. Fogleman, J. Welz, I. Galton, "An audio ADC Delta-Sigma modulator with 100-dB peak SINAD and 102-dB DR using a second-order mismatch-shaping DAC," IEEE J. Solid-State Circuits, vol. 36, no. 3, p.339-348, March 2001.

[14]: E.N. Aghdam, P. Benabes: “A New Mixed Stable DEM Algorithm for Bandpass Multibit Delta-Sigma ADC”, Proc. ICECS2003, pp. 962-5.

[15]: E.N. Aghdam, P. Benabes: “Higher Order Dynamic Element Matching by Shortened Tree-Structure in Delta-Sigma Modulators”, Circuit Theory and Design, 2005. Proceedings of the 2005 European Conference, Vol. 1, pp: I/201- I/204, Sept. 2005

[16]: R. Schreier, B. Zhang: “Noise-Shaped Multibit D/A Converter Employing Unit Elements”, Electronic Letters, vol.31, no.20, pp 1712-1713, Sept.28, 1995.

[17]: T. Shui et.al: “Mismatch-Shaping DAC for Lowpass and Bandpass Multi-bit Delta-Sigma Modulators”, Proc. IEEE Int. Symp. Circuits and Systems, ISCAS’98, 05-98.

[18]: A. Yasuda, H. Tanimoto, T. Lida: “A 100kHz, 9.6mW Multi-bit ∆Σ DAC and ADC Using Noise Shaping Dynamic Element Matching With Tree Structure”, IEEE ISSCC Dig. of Tech. Papers, vol.41, pp.64-65, Feb.1998.

[19]: T.W. Kwan, R.W. Adams, R.Libert: “A Stereo Multibit Sigma Delta DAC with Asynchronous Master-Clock Interface”, IEEE J. Solid State Circuits, vol.31, no.12, pp. 1881-1887, Dec.1996.

[20]: R.Adams, K.Nguyen, K.Sweetland: “A 113dB SNR Oversampling DAC with Segmented Noise-Shaped Scrambling”, IEEE ISSCC Dig. of Tech Papers, vol.41, pp.62-63, Feb.1998.

[21]: J. Steensgaard-Madsen: “High Performance Data Converters”, Ph.D. thesis, Technical University of Denmark, Department of Information Technology, March 1999.

[22]: A. Fishov, E. Siragusa, J. Welz, E. Fogleman, I. Galton, "Segmented mismatch-shaping D/A conversion," Proc. of the IEEE International Symp. Circuits and Systems, May 2002.

[23]: K.L. Chan, and I. Galton, “A 14b 100MS/s DAC with fully segmented dynamic element matching,” ISSCC Dig. Tech. Papers, pp.258-259, Feb.2006.

[24]: K. Nielsen: “Audio Power Amplifier Techniques With Energy Efficient Power Conversion”, Ph.D. Thesis, Technical University of Denmark, Department of Applied Electronics, April 1998.

[25]: D. Reefman, J.vd Homberg, E. v Tuilj, C. Bastiaansen, L.vd Dussen: ”A New Digital-to-Analogue Converter Design Technique for HiFi Applications”, Presented at the AES 114th Convention, Convention Paper 5846, March 2003.

[26]: T. Rueger et.al.: “A 110dB Ternary PWM Current-Mode Audio DAC with Monolithic 2Vrms Driver,” ISSCC Dig. Tech. Papers, Feb.2006.
