12
JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY15, 2014 1841 Adaptive Frequency-Domain Equalization in Mode-Division Multiplexing Systems Sercan ¨ O. Arık, Daulet Askarov, and Joseph M. Kahn, Fellow, IEEE Abstract—Long-haul mode-division multiplexing (MDM) em- ploys adaptive multi-input multi-output (MIMO) equalization to compensate for modal crosstalk and modal dispersion. MDM systems must typically use MIMO frequency-domain equaliza- tion (FDE) to minimize computational complexity, in contrast to polarization-division-multiplexed systems in single-mode fiber, where time-domain equalization (TDE) has low complexity and is often employed to compensate for polarization effects. We study two adaptive algorithms for MIMO FDE: least mean squares (LMS) and recursive least squares (RLS). We analyze tradeoffs between computational complexity, cyclic prefix efficiency, adap- tation time and output symbol-error ratio (SER), and the impact of channel group delay spread and fast Fourier transform (FFT) block length on these. Using FDE, computational complexity in- creases sublinearly with the number of modes, in contrast to TDE. Adaptation to an initially unknown fiber can be achieved in 3– 5 μs using RLS or 15–25 μs using LMS in fibers supporting 6–30 modes. As compared to LMS, RLS achieves faster adapta- tion, higher cyclic prefix efficiency, lower SER, and greater toler- ance to mode-dependent loss, but at the cost of higher complexity per FFT block. To ensure low computational complexity and fast adaptation in an MDM system, a low overall group delay spread is required. This is achieved here by a family of graded-index graded depressed-cladding fibers in which the uncoupled group delay spread decreases with an increasing number of modes, in concert with strong mode coupling. Index Terms—DSP complexity, MIMO, equalization, few-mode fiber, modal dispersion, mode coupling, mode-division multiplex- ing, multi-mode coherent receiver, multi-mode fiber, receiver signal processing. I. INTRODUCTION T HE continued exponential growth of data traffic has mo- tivated research on increasing capacity in long-haul trans- mission systems. Information-theoretic limits of single-mode fiber (SMF) transmission imposed by noise, fiber nonlinear- ity and dispersion are being approached [1], [2] and compen- sation of these combined effects requires very high computa- tional complexity [3], [4]. Increasing per-fiber capacity can be achieved more readily by increasing spatial dimensionality us- ing multi-core fiber [5], [6] or multi-mode fiber (MMF) [5], [7] with multi-input multi-output (MIMO) transmission [8], [9]. Manuscript received July 21, 2013; revised November 14, 2013 and January 15, 2014; accepted January 23, 2014. Date of publication January 27, 2014; date of current version April 27, 2014. This work was supported by National Science Foundation under Grant ECCS-1101905, Corning, Inc., and a Stanford Graduate Fellowship. The authors are with the Edward L. Ginzton Laboratory, Department of Electrical Engineering, Stanford University, Stanford, CA 94305 USA (e-mail: [email protected]; [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JLT.2014.2303079 Throughout this paper, D denotes the total number of dimen- sions available for multiplexing, including spatial and polariza- tion degrees of freedom. Long-haul transmission systems already employ multiplex- ing in the two polarization modes of SMF (D = 2). This is enabled by coherent detection and digital equalization [10], [11], which can compensate for chromatic dispersion (CD) and polarization-mode dispersion (PMD). Most such systems em- ploy a hybrid equalization approach. CD, which is essentially fixed but has a long impulse response duration, is compen- sated by programmable (but not adaptive) frequency-domain equalization (FDE). By contrast, PMD, which can vary on a microsecond time scale [12], [13] but has a very short im- pulse response duration, is often compensated by adaptive 2 × 2 MIMO time-domain equalization (TDE) [14]–[18]. Filter taps are updated using either blind or data-aided methods such as the least mean squares (LMS) algorithm, typically using training sequences for initial adaptation and decision-directed updates afterwards [10], [17]. In systems using mode-division multiplexing (MDM) in MMFs (D> 2), receiver computational complexity increases because of an increase in D and because of the large group delay (GD) spread from modal dispersion (MD). Two approaches for minimizing GD spread and controlling receiver complexity are optimization of the fiber index profile and introduction of strong mode coupling [19]. For two mode groups (D = 6), fibers with very low uncoupled GD spread can be realized by choosing a core radius at which the GD-versus-radius curves for the two mode groups intersect [20]. For more than two mode groups (D> 6), this approach is not viable, since the curves for dif- ferent pairs of modes intersect at different radii, and step-index fibers have prohibitively high GD spreads [21]. On the other hand, graded-index fibers with large cores (D →∞) have very low GD spreads [22], but for small D, graded-index fibers have prohibitively high GD spreads [21]. With any fiber design, in the absence of mode coupling, the GD spread accumulates linearly with fiber length, while strong mode coupling causes the GD spread to accumulate with the square-root of fiber length [19]. Nevertheless, even assuming low uncoupled GD spread and strong mode coupling, the GD spread caused by MD in MMF is far larger than that caused by PMD in SMF [23]. Most MDM demonstrations to date have used signal pro- cessing approaches generalized from SMF systems. Experi- ments [7], [24], [25] with two mode groups have employed 6 × 6 MIMO TDE adapted using data-aided methods, such as LMS [26], and implemented in non-real-time computation. In MDM, however, the computational complexity of MIMO TDE is known to be prohibitively high [19], [27], [28]. Computational 0733-8724 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications standards/publications/rights/index.html for more information.

JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY 15, 2014 1841

Adaptive Frequency-Domain Equalization inMode-Division Multiplexing Systems

Sercan O. Arık, Daulet Askarov, and Joseph M. Kahn, Fellow, IEEE

Abstract—Long-haul mode-division multiplexing (MDM) em-ploys adaptive multi-input multi-output (MIMO) equalization tocompensate for modal crosstalk and modal dispersion. MDMsystems must typically use MIMO frequency-domain equaliza-tion (FDE) to minimize computational complexity, in contrastto polarization-division-multiplexed systems in single-mode fiber,where time-domain equalization (TDE) has low complexity and isoften employed to compensate for polarization effects. We studytwo adaptive algorithms for MIMO FDE: least mean squares(LMS) and recursive least squares (RLS). We analyze tradeoffsbetween computational complexity, cyclic prefix efficiency, adap-tation time and output symbol-error ratio (SER), and the impactof channel group delay spread and fast Fourier transform (FFT)block length on these. Using FDE, computational complexity in-creases sublinearly with the number of modes, in contrast to TDE.Adaptation to an initially unknown fiber can be achieved in ∼3–5 μs using RLS or ∼15–25 μs using LMS in fibers supporting6–30 modes. As compared to LMS, RLS achieves faster adapta-tion, higher cyclic prefix efficiency, lower SER, and greater toler-ance to mode-dependent loss, but at the cost of higher complexityper FFT block. To ensure low computational complexity and fastadaptation in an MDM system, a low overall group delay spreadis required. This is achieved here by a family of graded-indexgraded depressed-cladding fibers in which the uncoupled groupdelay spread decreases with an increasing number of modes, inconcert with strong mode coupling.

Index Terms—DSP complexity, MIMO, equalization, few-modefiber, modal dispersion, mode coupling, mode-division multiplex-ing, multi-mode coherent receiver, multi-mode fiber, receiver signalprocessing.

I. INTRODUCTION

THE continued exponential growth of data traffic has mo-tivated research on increasing capacity in long-haul trans-

mission systems. Information-theoretic limits of single-modefiber (SMF) transmission imposed by noise, fiber nonlinear-ity and dispersion are being approached [1], [2] and compen-sation of these combined effects requires very high computa-tional complexity [3], [4]. Increasing per-fiber capacity can beachieved more readily by increasing spatial dimensionality us-ing multi-core fiber [5], [6] or multi-mode fiber (MMF) [5], [7]with multi-input multi-output (MIMO) transmission [8], [9].

Manuscript received July 21, 2013; revised November 14, 2013 and January15, 2014; accepted January 23, 2014. Date of publication January 27, 2014;date of current version April 27, 2014. This work was supported by NationalScience Foundation under Grant ECCS-1101905, Corning, Inc., and a StanfordGraduate Fellowship.

The authors are with the Edward L. Ginzton Laboratory, Department ofElectrical Engineering, Stanford University, Stanford, CA 94305 USA (e-mail:[email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available onlineat http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JLT.2014.2303079

Throughout this paper, D denotes the total number of dimen-sions available for multiplexing, including spatial and polariza-tion degrees of freedom.

Long-haul transmission systems already employ multiplex-ing in the two polarization modes of SMF (D = 2). This isenabled by coherent detection and digital equalization [10],[11], which can compensate for chromatic dispersion (CD) andpolarization-mode dispersion (PMD). Most such systems em-ploy a hybrid equalization approach. CD, which is essentiallyfixed but has a long impulse response duration, is compen-sated by programmable (but not adaptive) frequency-domainequalization (FDE). By contrast, PMD, which can vary on amicrosecond time scale [12], [13] but has a very short im-pulse response duration, is often compensated by adaptive 2 × 2MIMO time-domain equalization (TDE) [14]–[18]. Filter tapsare updated using either blind or data-aided methods such as theleast mean squares (LMS) algorithm, typically using trainingsequences for initial adaptation and decision-directed updatesafterwards [10], [17].

In systems using mode-division multiplexing (MDM) inMMFs (D > 2), receiver computational complexity increasesbecause of an increase in D and because of the large group delay(GD) spread from modal dispersion (MD). Two approaches forminimizing GD spread and controlling receiver complexity areoptimization of the fiber index profile and introduction of strongmode coupling [19]. For two mode groups (D = 6), fibers withvery low uncoupled GD spread can be realized by choosing acore radius at which the GD-versus-radius curves for the twomode groups intersect [20]. For more than two mode groups(D > 6), this approach is not viable, since the curves for dif-ferent pairs of modes intersect at different radii, and step-indexfibers have prohibitively high GD spreads [21]. On the otherhand, graded-index fibers with large cores (D → ∞) have verylow GD spreads [22], but for small D, graded-index fibers haveprohibitively high GD spreads [21]. With any fiber design, in theabsence of mode coupling, the GD spread accumulates linearlywith fiber length, while strong mode coupling causes the GDspread to accumulate with the square-root of fiber length [19].Nevertheless, even assuming low uncoupled GD spread andstrong mode coupling, the GD spread caused by MD in MMFis far larger than that caused by PMD in SMF [23].

Most MDM demonstrations to date have used signal pro-cessing approaches generalized from SMF systems. Experi-ments [7], [24], [25] with two mode groups have employed6 × 6 MIMO TDE adapted using data-aided methods, such asLMS [26], and implemented in non-real-time computation. InMDM, however, the computational complexity of MIMO TDEis known to be prohibitively high [19], [27], [28]. Computational

0733-8724 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications standards/publications/rights/index.html for more information.

Page 2: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

1842 JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY 15, 2014

complexity per symbol scales linearly in D and in the GD spreadfrom MD, while total hardware complexity scales quadraticallyin those parameters [19]. Moreover, blind equalization methodscommonly used in SMF systems can perform poorly in MDMsystems with MD and mode coupling [29].

Because of large values of D and large GD spreads, MDMsystems ultimately must use different signal processing ap-proaches than SMF systems to ensure tolerable complexity. Asshown in [19], [27], and [30], MIMO FDE can reduce com-plexity, enabling the computational complexity per symbol toscale sublinearly in D and in the GD spread from MD (assum-ing a known channel), while hardware complexity still scalessuperlinearly in those parameters [19]. In long-haul MDM sys-tems, mechanical or acoustic perturbations of installed MMFsare expected to cause mode coupling variations on time scalesas short as tens of μs [31], similar to polarization coupling vari-ations in SMF [12], [13]. These temporal variations demandadaptive MIMO FDEs that achieve fast adaptation, low out-put symbol-error ratio (SER), high throughput efficiency, andtolerable computational complexity.

Adaptive FDE techniques have been studied for wide-band single-carrier MIMO wireless communications [32]–[35].While those techniques may be relevant for MDM systems,MDM systems exhibit several important differences affectingthe performance and feasibility of candidate techniques [36].MDM systems have far higher GD spreads than typical wirelessMIMO systems, when measured in units of symbol intervals.MDM MIMO channels may be near-unitary if mode-dependentloss and gain (MDL) is well-controlled [37], in contrast to wire-less systems, where multipath fading typically causes MIMOchannels to be far from unitary. MDM channels have sym-bol rates of tens of Gbaud, favoring simple techniques suchas linear equalization, in contrast to wireless MIMO, wheremore complex nonlinear equalization techniques can be consid-ered [38], [39].

In this paper, we study adaptive MIMO FDE for long-haulMDM systems using two candidate algorithms: LMS and recur-sive least squares (RLS). We study the tradeoffs between com-putational complexity, throughput efficiency, adaptation speedand output SER, and the impact of the fast Fourier transform(FFT) block length on them. For D ranging from 6 to 30, whileeither RLS or LMS can adapt to an unknown fiber within mi-croseconds; RLS achieves faster adaptation, higher throughputefficiency, lower output SER, and greater tolerance to MDL,at the cost of moderately higher computational complexity perFFT block. Apart from adaptive algorithm design, low complex-ity and fast adaptation require the MDM system to have low GDspread. To this end, we use a family of graded-index MMFs inwhich uncoupled GD spread decreases with an increasing num-ber of modes. Some results from this paper are summarized in atutorial [36] on MDM systems, channel models and signal pro-cessing architectures. This paper presents a much more detailedperformance and complexity analysis than [36].

The remainder of this paper is as follows. Section II presentsa linear MIMO channel model for MDM systems and describesMIMO FDE. Section III reviews the LMS and RLS algorithmsfor FDE and their important properties. Section IV describes anexemplary long-haul MDM system, including a new MMF de-

Fig. 1. Multi-section model of long-haul MDM system.

sign optimized for low GD spread, and presents numerical sim-ulations evaluating the complexity and performance of adaptiveMIMO FDE methods. Sections V and VI provide discussionand conclusions, respectively.

II. MDM SYSTEM MODEL

A. Multi-Section Propagation Model

In order to study the effect of mode coupling on system per-formance and complexity, we model a long-haul MDM systemas the concatenation of numerous short sections, each slightlylonger than the length over which complex baseband modalfields remain correlated [28]. By decreasing the section length,thus increasing the number of sections, we can increase thestrength of mode coupling. Throughout this paper, we assumethe strong-coupling regime, so the number of independent sec-tions is large compared to unity.

As shown in Fig. 1, the system is composed of Kamp spans,each comprising a fiber of length Lamp , followed by an amplifierto compensate for the mode-averaged loss of the fiber. Eachspan is subdivided into Ksec sections, each of length Lsec . Theoverall system has Ktot = KampKsec sections and total lengthLtot = KampLamp = KampKsecLsec .

Assuming a MMF supporting D orthogonal propagatingmodes, excluding nonlinearity and noise, we represent the prop-agation operator as a D × D matrix multiplying complex base-band modal envelopes at frequency Ω [28]:

Mtot(Ω) = exp(− j

2Ω2 β2Ltot

)· MMD(Ω). (1)

The exponential factor represents mode-averaged propaga-tion, where β2 is the mode-averaged CD per unit length (forsimplicity, we neglect the mode-averaged GD). The D × Dmatrix MMD(Ω) represents mode-dependent effects, includingMDL, MD and mode coupling (for simplicity, we neglect mode-dependent CD). It is written as a product of factors for each ofthe Kamp spans:

MMD(Ω) =K a m p∏k=1

M(k)MD(Ω), (2)

where M(k)MD(Ω) represents mode-dependent effects in the kth

span. This is given by

M(k)MD(Ω) = diag

[exp

(g

(k)1

2

). . . exp

(g

(k)D

2

) ]

·K s e c∏l=1

V(k,l)Λ(Ω)U(k,l)H

, (3)

Page 3: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

ARIK et al.: ADAPTIVE FREQUENCY-DOMAIN EQUALIZATION IN MODE-DIVISION MULTIPLEXING SYSTEMS 1843

Fig. 2. Complex baseband model of MDM transmission with MIMOequalizer.

where H denotes Hermitian conjugate. The V(k,l) and U(k,l) arefrequency-independent random unitary matrices representingmode coupling in the lth section of the kth span.

We assume MDL from the transmission fibers is negligiblecompared to that from the optical amplifiers [40]–[42]. Each am-plifier has uncoupled modal gains g

(k)i , i = 1, . . . , D, measured

in dB or log-power-gain units. They satisfy g(k)1 + · · · + g

(k)D =

0 and have root-mean-square (rms) value σg . In the strong-coupling regime with Kamp >> 1 independent MDL sources,the statistics of coupled MDL are determined by the rms accu-mulated MDL ξ =

√Kampσg [43].

In (3), Λ(Ω) represents the uncoupled MD in one section,which is assumed to be the same in all sections. It is given by

Λ(Ω) = diag[exp (−jΩτ1) . . . exp (−jΩτD )

], (4)

where τi, i = 1, . . . , D are the uncoupled modal GDs,which satisfy τ1 + · · · + τD = 0. They have rms value στ =Δβ1,rmsLsec , where Δβ1,rms is the rms uncoupled MD perunit length of the MMF. In the strong-coupling regime withKtot >> 1 independent sections, the statistics of the coupledGDs are determined by the rms coupled GD σgd =

√Ktotστ =

Δβ1,rms√

LsecLtot [44].

B. Discrete-Time System Model

Neglecting nonlinearity and assuming digital coherent detec-tion with perfect carrier recovery, an MDM transmission sys-tem can be described by the complex baseband system shownin Fig. 2. The input in the jth mode in the nth symbol inter-val, xj [n], is a complex-valued symbol from a particular con-stellation. Transmitter pulse shaping and electrical-to-opticalconversion is represented by b (t). Linear propagation is de-scribed by a D × D channel impulse response matrix Mtot (t),which is the inverse Fourier transform (FT) of (1), in whichmdj (t) is the impulse response between the dth input modeand the jth output mode. The dominant noise source is am-plified spontaneous emission from inline amplifiers [1]. Theadditive noises nj (t) , j = 1, . . . , D are modeled as spectrallywhite with power spectral density N0/2 over the signal band-width [1], and can be modeled accurately as spatially white,provided Kamp is sufficiently large [43]. After the addition ofnoise, optical-to-electrical conversion and electrical filtering areperformed, represented by p (t). We define an overall impulseresponse between the dth input mode and the jth output mode

qdj (t) = b (t) ∗ mdj (t) ∗ p (t) (5)

and a filtered noise n′j (t) = p (t) ∗ nj (t).

The output of each filter is sampled at rate ros/Ts , whereTs is the symbol duration and ros is the receiver oversampling

ratio [15]. The output in the jth mode at the mth samplinginstant t = mTs/ros is given by

yj [m] =∑

n

D∑d=1

xj [n] qdj

(mTs

ros− nTs

)+ n′

j

(mTs

ros

).

(6)The discrete-time Fourier transforms (DTFTs) of the output

sample sequences yj [m] , j = 1, . . . , D, can be expressed as aD × 1 vector⎡⎢⎢⎢⎢⎢⎢⎣

∑m

y1 [m] e−j ω m

...∑m

yD [m] e−j ω m

⎤⎥⎥⎥⎥⎥⎥⎦

=

⎡⎢⎢⎢⎢⎢⎢⎢⎣

∑l

Y1

((ω − 2πl) ro s

Ts

)

...∑l

YD

((ω − 2πl) ro s

Ts

)

⎤⎥⎥⎥⎥⎥⎥⎥⎦≈

⎡⎢⎢⎢⎢⎢⎣

Y1

(ωro s

Ts

)

...

YD

(ωro s

Ts

)

⎤⎥⎥⎥⎥⎥⎦

.

(7)

The components in the rightmost term of (7) are given by

⎡⎢⎢⎣

Y1 (Ω)

...

YD (Ω)

⎤⎥⎥⎦ = B(Ω)Mtot (Ω)P (Ω)

⎡⎢⎢⎢⎣

X1(ejT s Ω

)...

XD

(ejT s Ω

)

⎤⎥⎥⎥⎦ +

⎡⎢⎢⎣

N ′1 (Ω)

...

N ′D (Ω)

⎤⎥⎥⎦ ,

(8)

where B(Ω),Mtot(Ω), P (Ω), N ′j (Ω) are the FTs of

b (t) ,Mtot (t) , p (t) , n′j (t), respectively, and Xj

(ejω

)is

the DTFT of xj [n]. In computing the FT of n′j (t) and the

DTFT of xj [n], we consider any particular finite-length real-izations of these random processes to be finite-energy signals.The approximation in (7) assumes P (Ω) blocks all componentsabove the Nyquist frequency ros/2Ts , preventing aliasing.

C. Frequency-Domain Equalization

The optimization and complexity of non-adaptive FDEs forMDM, assuming the channel is known a priori, were ad-dressed in [19]. An equalizer for CD [14], [19] must ac-

commodate a delay spread of NCD =⌈2π

∣∣β2∣∣ Ltot (rosRs)

2⌉

samples, where Rs = 1/Ts is the symbol rate and �xdenotes the ceiling function. Assuming strong modecoupling, an equalizer for MD [19] must accommo-date a delay spread of NMD =

⌈√Ktotστ uD (p) rosRs

⌉=⌈

Δβ1,rms√

LsecLtotuD (p) rosRs

⌉samples, where uD (p) is

defined such that σgduD (p) is no shorter than the coupledGD spread with probability 1 − p. For typical values of D andp ∼ 10−4 to 10−6 , uD (p) ∼ 4 to 5 [19], [45]. A combinedequalizer for CD and MD must accommodate a delay spread ofNCD + NMD samples.

The MDM channel corresponds to linear convolution ofarbitrary-length sequences in the time domain (6) or multi-plication of DTFTs in the continuous frequency domain (7)and (8). Efficient realization of an FDE relies on using the dis-crete Fourier transform (DFT), implemented by an FFT of blocklength NFFT , for conversion between time and frequency, whilebeing able to represent the MDM channel as circular convolutionof finite-length sequences in the time domain, corresponding tomultiplication in the discrete frequency domain.

Page 4: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

1844 JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY 15, 2014

There are two well-known approaches for enabling FFT-basedFDEs.

One approach is to use block convolution, e.g., overlap-saveconvolution, as in [46], [47]. This avoids any overhead asso-ciated with a cyclic prefix, but complicates realization of anadaptive FDE. A constrained adaptive FDE requires additionalFFTs to enforce time-domain gradient constraints [46]. An un-constrained adaptive FDE avoids these additional FFTs, butexhibits slower adaptation and higher excess error [46].

A second approach, adopted here, is to prepend a cyclic prefixof length NCP to each block of NFFT/ros symbols beforetransmission, as in systems using orthogonal frequency-divisionmultiplexing [48]. At the receiver, the first NCP samples ofeach received block of NFFT + NCP samples are discarded,and the remaining NFFT samples can be processed independentof other blocks. The cyclic prefix length NCP must be no shorterthan the channel delay spread (NCD , NMD or NCD + NMD).A drawback of this approach is a reduction of throughput andaverage-power efficiency, which is quantified by a cyclic prefixefficiency parameter

ηCP =NFFT

NFFT + NCP. (9)

Given a channel delay spread defining NCP , efficiency is max-imized by choosing NFFT >> NCP . Since the cyclic prefixsimplifies realization of an adaptive FDE [46], this approachhas become popular in wireless systems [32]–[34].

The MDM channel frequency-domain relationship when us-ing a cyclic prefix is obtained by sampling (8) at NFFT equallyspaced frequencies Ω = 2πros (k − NFFT/2) /NFFTTs, k =0, ..., NFFT − 1⎡

⎢⎣Y1 [k]

...

YD [k]

⎤⎥⎦ = B

(2πros (k − NFFT/2)

NFFTTs

)

· Mtot

(2πros (k − NFFT/2)

NFFTTs

)

· P(

2πros (k − NFFT/2)NFFTTs

)

·

⎡⎢⎣

X1 [k]...

XD [k]

⎤⎥⎦ +

⎡⎢⎣

N ′1 [k]...

N ′D [k]

⎤⎥⎦

= Q [k] ·

⎡⎢⎣

X1 [k]...

XD [k]

⎤⎥⎦ +

⎡⎢⎣

N ′1 [k]...

N ′D [k]

⎤⎥⎦ . (10)

The components of (Y1 [k] , . . . , YD [k])T , (X1 [k], . . . , XD

[k])T and (N ′1 [k] , . . . , N ′

D [k])T are NFFT -point DFTs ofblocks of the time-domain signals in (6).

Considering a non-adaptive FDE [19], computational com-plexity is minimized by using a single D × D matrix FDEWtot [k] to compensate for the CD, MD and other mode-dependent effects described byMtot(Ω), which requires a prefixlength NCD + NMD . However, to facilitate realization of a fast-

adapting FDE for MD, we compensate CD by a bank of D staticscalar equalizers, each denoted by WCD [k]. The notation im-plies they are FDEs, but they could be realized using any of themethods in [17], [49]–[51]. They are followed by an adaptiveD × D matrix FDE WMD [k] to compensate for the MD (andother mode-dependent effects) described by MMD(Ω), whichrequires cyclic prefix length NMD . As shown in Fig. 2, the cas-cade of the CD and MD equalizers is equivalent to a combinedequalizer Wtot [k] = WCD [k]WMD . The equalizer output isgiven by

⎡⎢⎢⎣

X1 [k]

...

XD [k]

⎤⎥⎥⎦=Wtot [k]

⎡⎢⎢⎣

Y1 [k]

...

YD [k]

⎤⎥⎥⎦=WMD [k]

⎡⎢⎢⎣

Y ′1 [k]

...

Y ′D [k]

⎤⎥⎥⎦, (11)

where [Y ′1 [k] , . . . , Y ′

D [k]]T represents the outputs of the CDequalizers.

III. ADAPTIVE FREQUENCY-DOMAIN EQUALIZATION

At each discrete frequency k = 0, ..., NFFT − 1, the errorsignal is given by

e [k] = x [k] − WMD [k] y [k]

= x [k] − WMD [k] (Q [k] x [k] + n′ [k]) (12)

where x [k] = [X1 [k] . . . XD [k]]T represents a block of knownor estimated data symbols, y [k] = [Y ′

1 [k] . . . Y ′D [k]]T rep-

resents a block of samples at the CD equalizer outputs,n′ [k] = [N ′

1 [k] . . . N ′D [k]]T represents the filtered noise and

Q [k] is the overall channel transfer function matrix from(10). Minimization of the total frequency-domain signal error∑NF F T −1

k=0 E{e [k]H e [k]} is equivalent to separately minimiz-ing the error terms at each frequency, as the equalizer matricesWMD [k] can be optimized independently at each k.

For a known channel, E{e [k]H e [k]

}is minimized by the

linear minimum mean square error (MMSE) filter [52]:

WMD [k]=(Q[k]H Rn ′ [k]−1Q[k]+

IPx

)−1

Q [k]H Rn ′ [k]−1

(13)where Px = E{|xj [n]|2} is the average transmitted powerin each mode, I is a D × D identity matrix and Rn ′ [k] =E{n′ [k] n′ [k]H } is the autocorrelation matrix of filtered noiseat discrete frequency k.

For an unknown channel, since Q [k] are not known a prioriat the equalizer, the WMD [k] must be computed iteratively.Here, we consider equalizer adaptation using the LMS and RLSalgorithms.

LMS is a stochastic gradient descent minimization using in-stantaneous estimates of the error e [k] [46], [53] described byan update equation

WMD [k] ← WMD [k] + (x [k] − WMD [k] y [k]) y [k]H μ.(14)

The convergence rate and performance of LMS depends onthe scalar step size μ, which for convergence must satisfy

Page 5: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

ARIK et al.: ADAPTIVE FREQUENCY-DOMAIN EQUALIZATION IN MODE-DIVISION MULTIPLEXING SYSTEMS 1845

TABLE ICOMPUTATIONAL COMPLEXITY PER BLOCK OF LENGTH NFFT , WHICH

CONVEYS D · NFFT /ros SYMBOLS

0 < μ < 2/λmax , where λmax is the largest eigenvalue of the au-

tocorrelation matrix of y [k] , Ry [k] = E{y [k] y [k]H

}[46].

RLS involves iterative minimization of an exponentiallyweighted cost function, treating the minimization problem asdeterministic [46], [53]. It is described by update equations

WMD [k] ← WMD [k]

+ (x [k] − WMD [k] y [k]) y [k]H(R [k] κ−1) (15)

R [k] ←(R [k] κ−1) −

(R [k] κ−1

)y [k] y [k]H

(R [k] κ−1

)1 + y [k]H (R [k] κ−1) y [k]

.

(16)

Here, R [k] is a D × D tracked inverse time-averaged weightedcorrelation matrix at frequency k [32], [34], which is initializedwith the identity matrix times a large positive number, and κis a forgetting factor satisfying 0 << κ < 1. The version of theRLS algorithm given above uses the matrix inversion lemma in(16) for lower-complexity computation of R [k], which becomesmore important for higher values of D.

Assuming ntr blocks of known or estimated symbols arerequired for training until convergence, and including the cyclicprefix, the total time required to adapt an FDE is

Tadapt =ntr (NFFT + NCP) Ts

ros. (17)

Typically Ts, ros and NCP = NMD are given system parame-ters. The parameter ntr is determined mainly by the algorithmand convergence criteria. Hence, NFFT is the major parameterthat may be decreased to minimize the training time Tadapt .However, given a prefix length NCP , decreasing NFFT reducesthe cyclic prefix efficiency (9). In Section IV, we will see thatit can be necessary to reduce NMD in order to minimize Tadaptwhile maintaining high efficiency.

As rough measures of implementation complexities, we eval-uate the number of complex multiplications and additions re-quired for the LMS and RLS algorithms. Table I gives the com-putational complexities of the algorithms for a training blockof length NFFT , considering the steps (13)–(16) (see AppendixA for the details of the derivations). To evaluate the total com-putational complexity to adapt an FDE, all the complexities inTable I should be multiplied by the number of training blocksntr .

Fig. 3. Optimized fiber index profile for D = 12 modes.

The computational complexity per data-bearing symbol is ob-tained by dividing the complexities in Table I by the number ofsymbols per block, D · NFFT/ros . The per-symbol complexi-ties scale at most linearly with D. They depend on NFFT onlythrough the DFT and inverse DFT operations, which favor thechoice of small NFFT to minimize complexity. For values ofD large relative to log2 (NFFT), the complexity of adaptationand equalization dominates over the complexity of DFT/inverseDFT operations, and the dependence of complexity on NFFT isreduced.

IV. FDE COMPLEXITY AND PERFORMANCE EXAMPLES

In this section, we study the complexity, adaptation time,and SER performance of MIMO FDEs using LMS and RLSalgorithms. The system end-to-end GD spread has a major effecton complexity and performance, so we propose a novel fiberdesign having very low uncoupled MD.

A. Multimode Fiber Design

Managing an MDM system’s end-to-end GD spread is impor-tant in controlling the computational and hardware complexityof the MIMO FDE [19], [28] and, as shown here, in achievingfast adaptation of the MIMO FDE. Here, we propose to usefibers with low uncoupled GD spread and rely on strong modecoupling, induced by splices or other perturbations, to furtherreduce the GD spread [19], [20].

We consider a family of “graded-index graded depressed-cladding” (GIGDC) index profiles, inspired by [54]–[56], inwhich a parabolic core index profile is extended smoothly into adepressed cladding. Fig. 3 shows a profile optimized for D = 12modes, and Table II gives parameters for fibers supporting D =6, 12, 20 and 30 modes. These have been computed by numericalsolution of the vector wave equation, without assumption ofweak guidance [21]. For each value of D, the numerical apertureis NA = 0.150, and the core radius is chosen so the number ofpropagating modes is exactly D over the C band, and increasesat wavelengths just below 1530 nm. This approach optimizesconfinement of propagating modes, minimizing bending losses,mode-dependent losses and mode-dependent CD. As includedin Table II, the mode-averaged effective areas scale as D0.78 andminimum modal effective areas scale as D0.43 . Most relevanthere, the rms MD Δβ1,rms tends to decrease as D increasesfrom 6 to 30. This behavior, while perhaps counterintuitive,may be justified by observing that as the core radius is increased

Page 6: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

1846 JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY 15, 2014

TABLE IIUNCOUPLED PARAMETERS OF GRADED-INDEX

GRADED-DEPRESSED-CLADDING FIBERS FOR DIFFERENT NUMBERS OF

MODES, ASSUMING NA = 0.150 AND λ = 1550 nm

to support more modes, the index profile “seen” by the modesincreasingly resembles an infinite parabola, which is ideallyfree of MD to first-order [22]. For GIGDC fibers, the mode-dependent CD has an rms value of only 3% of the mode-averagedCD β2 , justifying our neglect of mode-dependent CD in (1).Low mode-dependent CD ensures that a low GD spread can beachieved over a wide range of wavelengths.

B. Transmission System

We consider a long-haul fiber system described by the multi-section model of Section II-A (see Fig. 1). The system hasKamp = 20 spans, each of length Lamp = 100 km, for a totallength Ltot = 2000 km. The family of GIGDC fibers with un-coupled parameters given in Table II is assumed. To vary thestrength of mode coupling, the number of sections per spanKsec is varied from 1 to 100, corresponding to a section lengthLsec from 100 to 1 km, and an rms end-to-end GD spreadσgd = Δβ1,rms

√LsecLtot . When MDL is present, the uncou-

pled modal gains in each span are uniformly distributed with rmsvalue σg , resulting in an rms accumulated MDL ξ =

√Kampσg ,

measured in decibal.MDM transmission is described by the system model of Sec-

tion II-B. A symbol rate Rs = 1/Ts = 32 Gbaud and a receiveroversampling rate ros = 2 are assumed. The transmitter pulseshape b (t) is a rectangular pulse of duration Ts filtered by a fifth-order Bessel lowpass filter with−3-dB cutoff frequency 0.8/Ts ,while the receiver filter p (t) is a fifth-order Butterworth low-pass filter with −3-dB cutoff frequency 0.4ros/Ts [15]. Trans-mitted symbols are drawn from a quadrature phase-shift keyingconstellation with average power Px . The SNR as the trans-mitted signal power over received noise per mode is definedas SNR = Px/σ2

N , where σ2N = N0Rsros . We initially neglect

phase noise and discuss its effects in Section V.

C. Computational Complexity and Cyclic Prefix Efficiency

We first study the computational complexity and cyclic prefixefficiency of the adaptive MIMO FDE. These depend stronglyon the MD delay spread NMD , which we compute as describedin Section II-C, assuming p = 10−5 and neglecting MDL (ξ =

TABLE IIIIMPULSE RESPONSE DURATIONS (IN SAMPLES) FOR DIFFERENT SECTION

LENGTHS, ASSUMING Ltot = 2000 km, Rs = 32 Gbaud, ros = 2, p = 10−5

AND ξ = 0 dB

TABLE IVMIMO EQUALIZER: CYCLIC PREFIX LENGTHS, BLOCK LENGTHS, CYCLIC

PREFIX EFFICIENCIES AND COMPUTATIONAL COMPLEXITIES, ASSUMING THE

SAME PARAMETERS AS TABLE III AND Lsec = 1 km

0 dB). The presence of significant MDL might change NMDslightly, as studied for D = 2 [57].

Table III gives values of NMD for D = 6, 12, 20 and 30modes for section lengths Lsec = 100, 10 and 1 km, comparingthese to the CD delay spread NCD . In order for the computa-tional complexity of an adaptive FDE for MD to be roughlycomparable to that for CD, a system should be designed so thatNMD is several times smaller than NCD . This requirement canbe traced to the adaptivity and increased dimensionality of theD × D matrix FDE for compensating MD, as compared to theD static scalar FDEs for compensating CD. Although an arbi-trarily small section length is possible in principle, we believe arealistic target is Lsec = 1 km, and this value is assumed here-after. For this choice, NMD decreases from 477 to 131 (2.8 to10.5 times smaller than respective values of NCD) as D rangesfrom 6 to 30.

Given these values of NMD , MIMO FDEs are designed usingthe parameters given in Table IV. Prefix lengths are chosen tosatisfy NCP = NMD . Block lengths NFFT varying from 211 to29 as D ranges from 6 to 30 are chosen to ensure fast adapta-tion (see below). Corresponding values of the prefix efficiencyηCP , given by (9), are roughly 80% for all values of D. As ex-plained later, RLS adaptation may be sufficiently fast to permitan increase in NFFT , which increases ηCP .

Page 7: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

ARIK et al.: ADAPTIVE FREQUENCY-DOMAIN EQUALIZATION IN MODE-DIVISION MULTIPLEXING SYSTEMS 1847

Fig. 4. Average symbol-error ratio versus number of training blocks for dif-ferent values of μ for LMS-adapted FDE, assuming D = 12 modes, SNR =10.5 dB, ξ = 0 dB and D = 12 modes.

Table IV gives values of computational complexity per datasymbol for MIMO FDE. “Known channel” is the complexity ofequalization only, corresponding to the twice the first line plusthe second line of Table I. “LMS-adapted” or “RLS-adapted”is the complexity of both equalization and adaptation, and alsoincludes the third or fourth line of Table I. In Table IV, eachof the above has been divided by D · NFFT/ros , the number ofsymbols per block.

As compared to equalizing a known channel, LMS adaptationincreases the required complex multiplications per symbol by1.4 to 1.8 and the required complex additions per symbol by1.3 to 1.7 as D ranges from 6 to 30. RLS adaptation increasesthe required complex multiplications per symbol by 2.9 to 4.9and the required complex additions per symbol by 1.9 to 3.6over the same range of D. The complexities of LMS and RLSadaptation are compared further in Section V.

For comparison, static overlap-save based FDE of CD [19]requires about 33 complex multiplications per symbol and about61 complex additions per symbol for all values of D, assuminga block length NFFT = 214 .

D. Adaptation Time and SER Performance

We now study the adaptation time and SER performance ofthe MIMO FDE and their dependence on SNR, MDL and thenumber of modes. We present values of the SER as a function ofthe number of training blocks ntr obtained by averaging Monte-Carlo simulations over random channel and symbol sequencerealizations. We continue to assume a section length Lsec =1 km. For the RLS algorithm, we choose a forgetting factorκ = 0.999, whose optimization has negligible effect on boththe converged SER value and the adaptation time. For the LMSalgorithm, the choice of the step size μ is critical. As shown inFig. 4, small values of μ result in impractically slow adaptationtimes whereas large values of μ yield higher asymptotic SERsand can cause the LMS algorithm to violate the convergencecondition discussed in Section III. Considering this trade-off,for MDL-free channels, we choose an optimal value of μ = 1.5× 10−5 . In the presence of MDL, since the condition number

Fig. 5. Average symbol-error ratio versus number of training blocks for SNR= 6.5, 8.5, 10.5 and 12.5 dB with MMSE filtering (dotted lines), RLS (solidlines) and LMS (dashed lines) algorithms assuming κ = 0.999, μ = 1.5× 10−5 ,ξ = 0 dB and D = 12 modes.

Fig. 6. Average symbol-error ratio versus number of training blocks for ac-cumulated MDL ξ = 0, 1.36, 2.72, 4.08, 5.43 and 6.79 dB with RLS algorithm(solid lines) with κ = 0.999 and LMS algorithm (dashed lines) with μ = 1.5× 10−5 , 1.3 × 10−5 , 1.1 × 10−5 , 9.3 × 10−6 , 5.1 × 10−6 and 3.1 × 10−6

respectively, assuming SNR = 10.5 dB and D = 12 modes.

of the MDM channel matrix MMD(Ω) is increased [43], μ isreduced to optimize the convergence rate.

First we fix the number of modes to D = 12 and studyadaptation in different regimes of SNR and MDL.

Fig. 5 shows SER versus ntr achieved by RLS or LMS at dif-ferent values of SNR per mode, assuming no MDL (ξ = 0 dB).As expected, RLS converges faster, and to a lower asymptoticSER, than LMS. The logarithm of the asymptotic SER scalesroughly as SNR−1.5 for RLS and as SNR−1.3 for LMS, lead-ing to an increasing disparity in their asymptotic SERs withincreasing SNR. As the SNR increases, the time required forconvergence to the asymptotic SER increases slowly for RLS,and more rapidly for LMS. As a comparison case, performancebound of the MMSE filter given by (13) is also shown in Fig. 5.The asymptotic SER of RLS approaches that of the MMSEequalizer much faster than that of LMS as the SNR increases.Over the range of SNRs considered, the SERs for using RLSand LMS are, respectively, ∼2–6 times and ∼6–10 times higherthan those for the MMSE equalizer.

Page 8: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

1848 JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY 15, 2014

Fig. 7. Average symbol-error ratio versus number of training blocks for D =6, 12, 20 and 30 modes with RLS (solid line) and LMS (dashed line) algorithmsassuming κ = 0.999, μ = 1.5 × 10−5 , ξ = 0 dB and SNR = 10.5 dB.

Fig. 6 shows the SER versus ntr achieved by RLS or LMS atdifferent values of the accumulated MDL ξ, assuming an SNR of10.5 dB. As expected, RLS is more robust than LMS, convergingfaster to a lower asymptotic SER. The convergence rate of RLSis not adversely affected for the values of MDL considered,and the logarithm of the asymptotic SER increases roughlyproportional to ξ. On the other hand, increasing MDL causesthe convergence of LMS to slow down and causes the logarithmof the asymptotic SER to increase roughly proportional to ξ, butat a higher rate than for RLS. Note that for LMS, the step sizeμ is optimized for each value of ξ. If μ is held constant at thevalue optimal for the MDL-free case, even low-to-intermediatevalues of ξ can prevent convergence and cause outage.

Now we study adaptation as a function of the number ofmodes D. Fig. 7 shows SER versus ntr achieved by RLS orLMS for different values of D, assuming an SNR of 10.5 dBand no MDL (ξ = 0 dB). As D increases, although the blocklength NFFT decreases slightly, the MIMO equalizer dimensionscales as D2 , so the number of training blocks required forconvergence increases. For RLS, the number of training blocksneeded for convergence is roughly proportional to D (the curvesnearly overlap when scaled by D−1), and the knee and flat-SERregions of the adaptation curves occur at about 6D and 15D,respectively. For LMS, the adaptation curves do not overlapwith any power-of-D scaling. The knees occur at about 20D to30D, while the flat-SER regime starts at about 50D to 70D. Theasymptotic SERs for RLS are around 2 × 10−4 , while those forLMS are a factor of ∼3 higher.

To quantify equalizer adaptation times, we extract from Fig. 7values of ntr such that target SER values are achieved, and com-pute Tadapt using (17). Table V presents these adaptation timesfor RLS and LMS for various numbers of modes D. Adaptationtimes for RLS range from roughly 1 to 7 μs, depending on Dand the target SER, while adaptation times for LMS are roughlyten times longer for the same target SER for a given D. Theseimplications of results are discussed in Section V.

TABLE VVALUES OF ADAPTATION TIME Tadapt TO REACH VARIOUS SYMBOL-ERROR

RATIOS USING RLS AND LMS ALGORITHMS FOR D = 6, 12, 20 AND 30MODES, USING THE RESULTS OF FIG. 6

V. DISCUSSION

As noted above, adaptive MIMO FDEs for MDM have sev-eral important design objectives, including scalability to manymodes, reliable convergence and low asymptotic SERs in thepresence of noise and MDL, fast initial adaptation and subse-quent tracking, high cyclic prefix efficiency, and low computa-tional complexity. We have found that RLS outperforms LMSin most key respects.

As the number of modes D increases, RLS maintains fasteradaptation and lower asymptotic SERs than LMS (see Fig. 7).In the presence of noise, RLS achieves lower asymptotic SERsthan LMS (see Fig. 5). In the presence of MDL, RLS convergesmore reliably and achieves lower asymptotic SERs than LMS(see Fig. 6), which is critical, because experiments [58] havedemonstrated that MDL can readily cause the channel matrix tobecome ill-conditioned.

Reliable tracking of dynamic channel behavior is a criticalrequirement in optical transport networks. It has been estimatedthat MIMO channels in long-haul MDM systems will changeon a time scale of 25 μs [31], similar to long-haul polarization-division-multiplexed systems in SMF. There do not yet existmodels for the dynamic evolution of MDM channels, whichwould enable study of the tracking behavior of adaptive MIMOFDEs. For sake of discussion, we conservatively assume that inadapting to an unknown channel, an FDE must nearly reach itsasymptotic SER in no more than 25 μs, which should ensure itcan track dynamics in a long-haul MDM system.

Using the parameter values in Table V, values of the adapta-tion time Tadapt for the LMS algorithm are close to 25 μs, i.e.,LMS appears fast enough to track long-haul MDM channels.Assuming these parameters, as noted earlier, the cyclic prefixefficiency ηCP is∼80% for all numbers of modes D considered.In Table V, values of Tadapt required for RLS are all at least afactor of ten below 25 μs for all values of D. Recalling (9), therelationship between the block length NFFT and Tadapt , it ispossible to increase NFFT up to a factor of eight. For example,increasing NFFT fourfold (to 213 , 213 , 211 , 211 for D = 6, 12,20, 30) increases the cyclic prefix efficiency ηCP to ∼95% for

Page 9: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

ARIK et al.: ADAPTIVE FREQUENCY-DOMAIN EQUALIZATION IN MODE-DIVISION MULTIPLEXING SYSTEMS 1849

all values of D. In summary, the fast convergence of RLS canbe exploited to improve cyclic prefix efficiency.

Computational complexity is closely related to transceiverpower consumption, making it a key metric for comparing adap-tive algorithms. Referring to Table IV, compared to LMS, RLSrequires 2.0 to 2.7 times more complex multiplications/symboland 1.5 to 2.1 times more complex additions/symbol. As shownin Table V, adaptation times can be up to ten times shorter forRLS than for LMS, so the total computational complexity toadapt to an unknown channel is lower for RLS than for LMS. IfRLS uses a fourfold larger block length NFFT to improve cyclicprefix efficiency ηCP , then the complexity slightly increases butstill remains lower than for LMS. In summary, the fast con-vergence of RLS can be exploited to minimize the complexityrequired to adapt to an unknown channel.

In continuous tracking of a dynamic channel, the higher com-putational complexity of RLS becomes more of a concern.Complexity and power consumption might be minimized byperforming the RLS updates (15) and (16) less frequently (sincethe duration of a block of NFFT symbols is far less than 25 μs),or by employing LMS updates (14). For the first approach, thereis also a possibility to avoid cyclic prefix overhead betweenadaptation intervals by employing overlap-save-based FDE us-ing filter coefficients approximated from the FDE adapted usinga cyclic prefix [47]. Evaluation of such channel tracking meth-ods requires models for channel dynamics. A model for fiberswith two mode groups (D = 6) has been proposed [30]. It is de-sirable to extend such models to larger D and to experimentallyvalidate them.

We have chosen to employ a cyclic prefix instead of us-ing block convolution for reasons stated in Sections II-C andIII. As noted there, while this reduces throughput and transmitpower efficiency by the factor ηCP , it significantly reduces thecomplexity of an adaptive FDE. This approach can be lever-aged by inserting additional pilot symbols to aid in frame syn-chronization [59] or synchronization of carrier frequency [60]or phase [61], improving overall system performance and effi-ciency. The combination of cyclic prefix and pilot symbols isoften called a “unique word”.

In this study, we have neglected the impact of phase noise.There exist several candidate methods for phase noise compen-sation. Their implementation and integration with adaptive FDEare important subjects for further research. Since FDE can bemore prone to degradation from phase error than TDE, it maybe desirable to partially or fully compensate phase noise beforethe adaptive FDE, even if it is necessary to use phase estimatesobtained after the adaptive FDE. MD equalization changes thephase noise statistics, an effect called “equalization-enhancedphase noise” [62], and commonly used symbol-by-symbol time-domain phase recovery methods, such as feedforward carrier re-covery [63], will yield suboptimal performance unless modifiedto account for the modified phase statistics.

A technique popular in wireless systems with FDE is block-by-block phase estimation. With the aid of pilot symbols withina unique word [61], accurate phase estimation at the beginningof each block can be achieved [64]. The estimated phase is usedto compensate the phase error of all symbols within the block

prior to FDE. This approach can be implemented with low com-putational complexity without modifying the FDE adaptationalgorithm. We have simulated this method assuming zero phaseerror at the beginning of each block and assuming the phasenoise is a Wiener process [62] inside the block. We have ob-served negligible degradation in the asymptotic SER of adaptiveFDE, assuming the parameters in Fig. 7 with D = 12 modes,provided the transmitter and local oscillator lasers have a beatlinewidth not exceeding ∼65 kHz, which can be achieved usingcommercial tunable laser modules [65].

As [19], [28] and this paper demonstrate, the GD spread is akey factor in determining the computational complexity, adap-tation speed, and cyclic prefix efficiency of an MDM system.In order to manage the GD spread, as explained in SectionIV-A, we propose to use a GIGDC fiber with low uncoupledGD spread in conjunction with strong mode coupling describedby a section length Lsec = 1 km. This approach is consideredin [20], where it is pointed out that manufacturing process vari-ations may increase the uncoupled GD spread beyond its idealvalue, and that splices between fiber sections may lead to sec-tion lengths Lsec ∼ 5 km. Thus, obtaining the low coupled GDspread assumed here may require intentionally perturbing thefiber in some way analogous to the “spinning” used to reducePMD in SMF [66]. Fiber designs for reduced GD spread, in-cluding methods to enhance mode coupling without increasingloss and MDL, are important topics for future research.

Given the present uncertainty about mode coupling dynam-ics and coupled GD spreads achievable in long-haul fibers, therequired adaptation time Tadapt and achievable impulse re-sponse durations NMD assumed here, and the resulting FDEdesign parameters and performance and efficiency metrics,should be considered as illustrative examples more than precisedeterminations.

An alternate approach to GD management, which has beentermed “GD compensation” reduces end-to-end GD spread byinterconnecting two or more fiber types in which lower- andhigher-order modes exhibit an opposite ordering of GDs [67].This approach may be difficult to scale to several mode groups,where effective reduction of GD spread would require severalfiber types with specific modal GD properties, as pointed outin [68]. Also, in certain cases, mode coupling may limit thereduction of GD spread, because in the strong-coupling regime,end-to-end rms GD spread depends only on the rms GD spreadsof the individual segments [28], and not on the ordering of modalGDs.

VI. CONCLUSION

Long-haul MDM systems must typically use MIMO FDE toachieve sufficiently low computational complexity, in contrastto polarization multiplexing in SMF, where MIMO TDE haslow complexity and is usually employed.

We have studied the LMS and RLS algorithms for adap-tive MIMO FDE. Instead of using block convolution, we in-sert a cyclic prefix. While reducing throughput efficiency, thislowers the computational complexity of adaptation and mayprovide additional functionality, e.g., by facilitating carrier

Page 10: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

1850 JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY 15, 2014

TABLE VICOMPUTATIONAL COMPLEXITIES OF ALGORITHM STEPS, WHICH ARE SUMMED TO OBTAIN RESULTS IN TABLE I

synchronization. We have investigated tradeoffs between com-putational complexity, cyclic prefix efficiency, adaptation speedand SER, and the impact of the system GD spread and FFTblock length on these.

We show that using optimized FDE architectures, compu-tational complexities increase sublinearly with the number ofmodes, in contrast to those using TDE. As compared to LMS,RLS achieves faster convergence, higher throughput efficiency,lower output SER, and greater tolerance to mode-dependentloss, but at the cost of somewhat higher complexity per FFTblock. These attributes make RLS preferable for adapting to anunknown channel. For continuous tracking of a dynamic chan-nel, either using RLS at periodic intervals or using LMS con-tinuously might be preferable, depending on channel dynamicsand system requirements.

Our paper illustrates that design of an MDM system to min-imize GD spread is required to enable low computational com-plexity and fast adaptation. This is enabled here by strong modecoupling and by a new family of GIGDC fibers in which uncou-pled GD spread decreases with an increasing number of modes.

APPENDIX A

COMPUTATIONAL COMPLEXITY

In this section, we derive the computational complexities perblock for the update steps (14)–(16) given in Table I. We con-sider the optimal ordering of algebraic operations to minimizecomplexity. For example, multiplication of a matrix, vector and

scalar can be done with lower complexity by first multiplying thematrix and the vector and then the scalar, or by first multiplyingthe vector and the scalar and then the matrix, as opposed to firstmultiplying the scalar and the matrix and then the vector. Wealso assume that any computed result can be reused at multiplesteps without being recomputed. We assume that multiplicationof an M × N matrix by an N × 1 vector requires MN complexmultiplications and M (N − 1) complex additions. We assumethe DFT and inverse DFT operations are performed with radix-2FFT algorithms, so each requires NFFT log2 (NFFT) /2 com-plex multiplications and NFFT log2 (NFFT) complex additionsfor a vector of length NFFT . The number of complex multi-plications and additions for each computed term is shown inTable VI. The results in Table I are obtained by adding them.

REFERENCES

[1] R. J. Essiambre, G. Kramer, P. J. Winzer, G. J. Foschini, and B. Goebel,“Capacity limits of optical fiber networks,” J. Lightw. Technol., vol. 28,no. 4, pp. 662–701, Feb. 2010.

[2] P. J. Winzer and G. J. Foschini, “MIMO capacities and outage probabilitiesin spatially multiplexed optical transport systems,” Opt. Expr., vol. 19,no. 17, pp. 16680–16696, Aug. 2011.

[3] E. Ip, E. Mateo, and W. Ting, “Reduced-complexity nonlinear compensa-tion based on equivalent-span digital backpropagation,” presented at theInt. Conf. Opt. Internet, Yokohoma, Japan, May 2012, Paper TuE.1.

[4] E. Ip, “Nonlinear compensation using backpropagation for polarization-multiplexed transmission,” J. Lightw. Technol., vol. 28, no. 6, pp. 939–951,Mar. 2010.

Page 11: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

ARIK et al.: ADAPTIVE FREQUENCY-DOMAIN EQUALIZATION IN MODE-DIVISION MULTIPLEXING SYSTEMS 1851

[5] T. Morioka, Y. Awaji, R. Ryf, P. J. Winzer, D. Richardson, and F. Poletti,“Enhancing optical communications with brand new fibers,” IEEE Com-mun. Mag., vol. 50, no. 2, pp. s31–s42, Feb. 2012.

[6] S. O. Arik and J. M. Kahn, “Coupled-core multi-core fiber for spatialmultiplexing,” IEEE Photon. Technol. Lett., vol. 25, no. 21, pp. 2054–2057, Nov. 2013.

[7] S. Randel et al., “Mode-multiplexed 6× 20-GBd QPSK transmission over1200-km DGD-compensated few mode fiber,” presented at the Opt. FiberCommun. Conf., Los Angeles, CA, USA, Mar. 2012, Paper PDP5 C.5.

[8] E. Ip, P. Ji, E. Mateo, Y.-K. Huang, L. Xu, D. Qian, N. Bai, and T. Wang,“100G and beyond transmission technologies for evolving optical net-works and relevant physical-layer issues,” Proc. IEEE, vol. 100, no. 5,pp. 1065–1078, May 2012.

[9] R. J. Essiambre and R. W. Tkach, “Capacity trends and limits of opticalcommunication networks,” Proc. IEEE, vol. 100, no. 5, pp. 1035–1055,May 2012.

[10] E. Ip and J. M. Kahn, “Fiber impairment compensation using coherentdetection and digital signal processing,” J. Lightw. Technol., vol. 28, no. 4,pp. 502–519, Feb. 2010.

[11] E. Ip, A. P. T. Lau, D. J. F. Barros, and J. M. Kahn, “Coherent detectionin optical fiber systems,” Opt. Expr., vol. 16, no. 2, pp. 753–791, Jan.2008.

[12] P. Krummrich, E.-D. Schmidt, W. Weiershausen, and A. Mattheus, “Fieldtrial results on statistics of fast polarization changes in long haul WDMtransmission systems,” presented at the Optical Fiber Commun. Conf.,Anaheim, CA, USA, 2005, Paper OThT6.

[13] P. M. Krummrich and K. Kotten, “Extremely fast (microsecond scale)polarization changes in high speed long haul WDM transmission systems,”presented at the Opt. Fiber Commun. Conf., Los Angeles, CA, USA, 2004,session FI3.

[14] S. J. Savory, “Digital coherent optical receivers: Algorithms and subsys-tems,” IEEE J. Sel. Topics Quantum Electron., vol. 16, no. 5, pp. 1164–1179, Oct. 2010.

[15] E. Ip and J. M. Kahn, “Digital equalization of chromatic dispersion andpolarization mode dispersion,” J. Lightw. Technol., vol. 25, no. 8, pp. 2033–2043, Aug. 2007.

[16] J. Renaudier et al., “Linear fiber impairments mitigation of 40-Gbit/spolarization-multiplexed QPSK by digital processing in a coherent re-ceiver,” J. Lightw. Technol., vol. 26, no. 1, pp. 36–42, Jan. 2008.

[17] M. Kuschnerov et al., “DSP for coherent single-carrier receivers,” J.Lightw. Technol., vol. 27, no. 16, pp. 3614–3622, Aug. 2009.

[18] B. Spinnler, “Equalizer design and complexity for digital coherent re-ceivers,” IEEE J. Quantum Electron., vol. 16, no. 5, pp. 1180–1192, Oct.2010.

[19] S. O. Arik, D. Askarov, and J. M. Kahn, “Effect of mode coupling onsignal processing complexity in mode-division multiplexing,” J. Lightw.Technol., vol. 31, no. 3, pp. 423–431, Feb. 2013.

[20] D. Peckham, Y. Sun, A. McCurdy, and R. Lingle, “Few-mode fibertechnology for spatial multiplexing,” in Optical Fiber TelecommunicationsVI, I. P. Kaminow, T. Li, and A. E. Willner, Eds. Amsterdam, TheNetherlands: Elsevier, 2013.

[21] D. Askarov and J. M. Kahn, “Design of transmission fibers and dopedfiber amplifiers for mode-division multiplexing,” IEEE Photon. Technol.Lett., vol. 24, no. 21, pp. 1945–1948, Nov. 2012.

[22] D. Marcuse, “The impulse response of an optical fiber with parabolicindex profile,” Bell Syst. Tech. J., vol. 52, pp. 1169–1174, 1973.

[23] C. R. Fludger et al., “Coherent equalization and POLMUX-RZ-DQPSKfor robust 100-GE transmission,” J. Lightw. Technol., vol. 26, no. 1, pp. 64–72, Jan. 2008.

[24] N. Bai et al., “Mode-division multiplexed transmission with inline fewmode fiber amplifier,” Opt. Expr., vol. 20, no. 3, pp. 2668–2680, Jan.2012.

[25] R. Ryf et al., “Mode-division multiplexing over 96 km of few-mode fiberusing coherent 6 × 6 MIMO processing,” J. Lightw. Technol., vol. 30,no. 4, pp. 521–531, Feb. 2012.

[26] S. Randel et al., “6 × 56-Gb/s mode-division multiplexed transmissionover 33-km few-mode fiber enabled by 6 × 6 MIMO equalization,” Opt.Exp., vol. 19, no. 17, pp. 16697–16707, Aug. 2011.

[27] B. Inan et al., “DSP complexity of mode-division multiplexed receivers,”Opt. Expr., vol. 20, no. 9, pp. 10859–10869, Apr. 2011.

[28] K.-P. Ho and J. M. Kahn, “Mode coupling and its impact on spa-tially multiplexed systems,” in Optical Fiber Telecommunications VI,I. P. Kaminow, T. Li, and A. E. Willner, Eds. Amsterdam: Elsevier,2013.

[29] D. Rafique, S. Stylianos, and A. D. Ellis, “Impact of power allocationstrategies in long-haul few-mode fiber transmission systems,” Opt. Exp.,vol. 21, no. 9, pp. 10801–10809, May 2013.

[30] N. Bai and G. Li, “Adaptive frequency-domain equalization for mode-division multiplexed transmission,” IEEE Photon. Technol. Lett., vol. 24,no. 21, pp. 1918–1921, Nov. 2012.

[31] X. Chen et al., “Characterization of dynamic evolution of channel matrix intwo-mode fibers,” presented at the Opt. Fiber Commun. Conf., Anaheim,CA, USA, Mar. 2013, Paper OM2 C.3.

[32] J. M. Wang and B. Daneshrad, “A comparative study of MIMO detectionalgorithms for wideband spatial multiplexing systems,” in Proc. IEEEConf. Wireless Commun. Netw., Mar. 2005, vol. 1, pp. 408–413.

[33] D. Falconer, S. L. Ariyavisitakul, A. Benyamin-Seeyar, and B. Eidson,“Frequency domain equalization for single-carrier broadband wirelesssystems,” IEEE Commun. Mag., vol. 40, no. 4, pp. 58–66, Apr. 2002.

[34] J. Coon, S. Armour, M. Beach, and J. McGeehan, “Adaptive frequency do-main equalization for single-carrier multiple-input multiple-output wire-less transmissions,” IEEE Trans. Signal Process., vol. 53, no. 8, pp. 3247–3256, Aug. 2005.

[35] M. V. Clark, “Adaptive frequency-domain equalization and diversity com-bining for broadband wireless communications,” IEEE J. Sel. Areas Com-mun., vol. 16, no. 8, pp. 1385–95, Oct. 1998.

[36] S. O. Arik, J. M. Kahn, and K.-P. Ho, “MIMO signal processing in mode-division multiplexing,” IEEE Signal Process. Mag, vol. 31, no. 2, pp. 25–34, Mar. 2014.

[37] R. N. Mahalati, D. Askarov, and J. M. Kahn, “Adaptive control of mode-dependent gain in multi-mode erbium-doped fiber amplifiers,” presentedat the IEEE Summer Topical Space Division Multiplexing Opt. Commun.,Waikoloa, HI, USA, Jul. 2013.

[38] N. Benvenuto, R. Dinis, D. Falconer, and S. Tomasin, “Single-carriermodulation with nonlinear frequency-domain equalization: An idea whosetime has come again,” Proc. IEEE, vol. 98, no. 1, pp. 60–96, Jan. 2010.

[39] F. Pancaldi, G. Vitetta, R. Kalbasi, N. Al Dhahir, M. Uysal, and H. Mheidat,“Single-carrier frequency domain equalization,” IEEE Signal Process.Mag., vol. 25, no. 5, pp. 37–56, Sep. 2008.

[40] S. Randel et al., “MIMO processing for space-division multiplexed trans-mission,” presented at the Adv. Photon. Congr., Colorado Springs, Col-orado, USA, 2012, Paper SpW3B.4.

[41] Y. Jung et al., “First demonstration and detailed characterization of a mul-timode amplifier for space division multiplexed transmission systems,”Opt. Expr., vol. 19, no. 26, pp. B952–B957, Dec. 2011.

[42] D. Askarov and J. M. Kahn, “Design of multi-mode Erbium-doped fiberamplifiers for low mode-dependent gain,” presented at the IEEE SummerTopical Spatial Multiplexing, Seattle, WA, USA, 2012, Paper WC2.2.

[43] K.-P. Ho and J. M. Kahn, “Mode-dependent loss and gain: statisticsand effect on mode-division multiplexing,” Opt. Expr., vol. 19, no. 17,pp. 16612–16635, Aug. 2011.

[44] K.-P. Ho and J. M. Kahn, “Statistics of group delays in multimode fiberwith strong mode coupling,” J. Lightw. Technol., vol. 29, no. 21, pp. 3119–3128, Nov. 2011.

[45] K.-P. Ho and J. M. Kahn, “Delay-spread distribution for multimode fiberwith strong mode coupling,” IEEE Photon. Technol. Lett., vol. 24, no. 21,pp. 1906–1909, Nov. 2012.

[46] S. Haykin, Adaptive Filter Theory, 4th ed. Upper Saddle River, NJ,USA: Prentice-Hall, 2002.

[47] D. Falconer and S. L. Ariyavisitakul, “Broadband wireless using singlecarrier and frequency domain equalization,” in Int. Conf. Wireless PersonalMultimedia Commun., Honolulu, HI, USA, Oct. 2002, pp. 27–36.

[48] D. J. F. Barros and J. M. Kahn, “Optimized dispersion compensation usingorthogonal frequency-division multiplexing,” J. Lightw. Technol., vol. 26,no. 16, pp. 2889–2898, Aug. 2008.

[49] J. C. Geyer, C. Fludger, T. Duthel, C. Schulien, and B. Schaauss, “Effi-cient frequency domain chromatic dispersion compensation in a coherentPolmux QPSK-receiver,” presented at the Opt. Fiber Commun. Conf., SanDiego, CA, USA, 2010, Paper OWV5.

[50] G. Goldfarb and G. Li, “Chromatic dispersion compensation using digitalIIR filtering with coherent detection,” IEEE Photon. Technol. Lett., vol. 19,no. 13, pp. 969–971, Jul. 2007.

[51] K. P. Ho, “Subband equaliser for chromatic dispersion of optical fibre,”Electron. Lett., vol. 45, no. 24, pp. 1224–1226, Nov. 2009.

[52] B. Li, Q. Wang, G. Lu, Y. Chang, and D. Yang, “Linear MMSE frequencydomain equalization with colored noise,” in Proc. IEEE VTC, Baltimore,USA, Sep. 2007, pp. 1152–1156.

[53] J. G. Proakis, Digital Communications, 3rd ed. New York, NY, USA:McGrawHill, 1995.

Page 12: JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY …

1852 JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 32, NO. 10, MAY 15, 2014

[54] D. Donlagic, “Opportunities to enhance multimode fiber links by applica-tion of overfilled launch,” J. Lightw. Technol., vol. 23, no. 11, pp. 3526–3540, Nov. 2005.

[55] K. Okamoto and T. Okoshi, “Computer-aided synthesis of the optimumrefractive-index profile for a multimode fiber,” IEEE Trans. Microw. The-ory Tech., vol. MTT-25, no. 3, pp. 213–221, Mar. 1977.

[56] T. Ishigure, H. Endo, K. Ohdoko, and Y. Koike, “High-bandwidth plasticoptical fiber with W-refractive index profile,” IEEE Photon. Technol. Lett.,vol. 16, no. 9, pp. 2081–2083, Sep. 2004.

[57] R. Feced, S. J. Savory, and A. Hadjifotiou, “Interaction between polariza-tion mode dispersion and polarization dependent losses in optical commu-nication links,” J. Opt. Soc. Amer. B, Opt. Phys., vol. 20, no. 3, pp. 424–433,Mar. 2003.

[58] N. Cvijetic, E. Ip, N. Prasad, and M. Li, “Experimental frequency-domainchannel matrix characterization for SDM-MIMO-OFDM systems,” pre-sented at the IEEE Summer Topical Space Division Multiplexing Opt.Commun., Waikoloa, HI, USA, Jul. 2013, Paper WC4.3.

[59] L. Deneire, B. Gyselinckx, and M. Engels, “Training sequence versuscyclic prefix-a new look on single carrier communication,” IEEE Commun.Lett., vol. 5, no. 7, pp. 292–294, Jul. 2001.

[60] M. Huemer, H. Witschnig, and J. Hausner, “Unique word based phasetracking algorithms for SC/FDE-systems,” in Proc. IEEE Global Telecom-mun. Conf., Dec. 2003, vol. 1, pp. 70–74.

[61] M. Asim, M. Ghogho, and D. McLemon, “Mitigation of phase noise insingle carrier frequency domain equalization systems,” in Proc. WirelessCommun. Netw. Conf., Apr. 2012, pp. 920–924.

[62] K.-P. Ho and W. Shieh, “Equalization-enhanced phase noise in mode-division multiplexed Systems,” J. Lightw. Technol., vol. 31, no. 13,pp. 2237–2243, Jul. 2013.

[63] E. Ip and J. M. Kahn, “Feedforward carrier recovery for coherent opticalcommunications,” J. Lightw. Technol., vol. 25, no. 9, pp. 2675–2692, Sep.2007.

[64] M. Sabbaghian and D. Falconer, “Joint turbo frequency domain equaliza-tion and carrier synchronization,” IEEE Trans. Wireless Commun., vol. 7,no. 1, pp. 204–212, Jan. 2008.

[65] E. Ip, J. M. Kahn, D. Anthon, and J. Hutchins, “Linewidth measurementsof MEMS-based tunable lasers for phase-locking applications,” IEEEPhoton. Technol. Lett., vol. 17, no. 10, pp. 2029–2031, Oct. 2005.

[66] D. A. Nolan and M. J. Li, “Fiber spin-profile designs for producingfibers with low polarization mode dispersion,” Opt. Lett., vol. 23, no. 21,pp. 1659–1661, Nov. 1998.

[67] R. Ryf et al., “Low-loss mode coupler for mode-multiplexed transmissionin few-mode fiber,” presented at the Proc. Opt. Fiber Commun. Conf., LosAngeles, CA, USA, Mar. 2012, Paper PDP5B.5.

[68] R. Ryf et al., “12 × 12 MIMO transmission over 130-km few-modefiber,” presented at the Proc. Frontiers Opt. Conf., Rochester, NY, USA,Oct. 2012, Paper FW6 C.4.

Sercan O. Arık received the B.S. degree from Bilkent University, Ankara,Turkey, in 2011, and the M.S. degree from Stanford University in electrical en-gineering in 2013. He is currently working towards the Ph.D. degree at StanfordUniversity. He worked at EPFL Signal Processing Labs (Lausanne, Switzer-land) in summer 2010, at Google (Mountain View, CA, USA) in summer 2012and at Mitsubishi Electric Research Labs (Cambridge, MA, USA) in summer2013. His main current research interests include multi-mode optical communi-cations; advanced modulation, coding and digital signal processing techniquesfor fiber-optic systems; and optical signal processing.

Daulet Askarov received the B.S. degree in applied mathematics and physicsfrom Moscow Institute of Physics and Technology in 2009, and the M.S. de-gree in electrical engineering from Stanford University in 2011, where he iscurrently working toward the Ph.D. degree in electrical engineering. His currentresearch interests include various topics in multi-mode fiber communications.

Joseph M. Kahn (M’90–SM’98–F’00) received the A.B., M.A., and Ph.D.degrees in physics from U.C. Berkeley in 1981, 1983, and 1986, respectively.From 1987–1990, he was at AT&T Bell Laboratories, Crawford Hill Labo-ratory, in Holmdel, NJ, USA. He demonstrated multi-Gbit/s coherent opticalfiber transmission systems, setting world records for receiver sensitivity. From1990–2003, he was with the Faculty of the Department of Electrical Engineeringand Computer Sciences at U.C. Berkeley, performing research on optical andwireless communications. Since 2003, he has been a Professor of Electrical En-gineering at Stanford University, where he heads the Optical CommunicationsGroup. His current research interests include fiber-based imaging, spatial multi-plexing, rate-adaptive and spectrally efficient modulation and coding methods,coherent detection and associated digital signal processing algorithms, digitalcompensation of fiber nonlinearity, and free-space systems. He received the Na-tional Science Foundation Presidential Young Investigator Award in 1991. From1993–2000, he served as a technical editor of IEEE Personal CommunicationsMagazine. Since 2009, he has been an Associate Editor of IEEE/OSA Journalof Optical Communications and Networking. In 2000, he helped found Strata-Light Communications, where he served as Chief Scientist from 2000–2003.StrataLight was acquired by Opnext, Inc., in 2009.