Mutual Information LLR


  • 8/19/2019 Mutual Information LLR

    1/239

    EURASIP Journal on Applied Signal Processing

    Turbo Processing

Guest Editors: Luc Vandendorpe, Alex M. Haimovich, and Ramesh Pyndiah



    Copyright © 2005 Hindawi Publishing Corporation. All rights reserved.

This is a special issue published in volume 2005 of “EURASIP Journal on Applied Signal Processing.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Editor-in-Chief: Marc Moonen, Belgium

Senior Advisory Editor: K. J. Ray Liu, College Park, USA

Associate Editors: Gonzalo Arce, USA; Jaakko Astola, Finland; Kenneth Barner, USA; Mauro Barni, Italy; Jacob Benesty, Canada; Kostas Berberidis, Greece; Helmut Bölcskei, Switzerland; Joe Chen, USA; Chong-Yung Chi, Taiwan; Satya Dharanipragada, USA; Petar M. Djurić, USA; Jean-Luc Dugelay, France; Frank Ehlers, Germany; Moncef Gabbouj, Finland; Sharon Gannot, Israel; Fulvio Gini, Italy; A. Gorokhov, The Netherlands; Peter Handel, Sweden; Ulrich Heute, Germany; John Homer, Australia; Arden Huang, USA; Jiri Jan, Czech Republic; Søren Holdt Jensen, Denmark; Mark Kahrs, USA; Thomas Kaiser, Germany; Moon Gi Kang, Korea; Aggelos Katsaggelos, USA; Walter Kellermann, Germany; Alex Kot, Singapore; C.-C. Jay Kuo, USA; Geert Leus, The Netherlands; Bernard C. Levy, USA; Mark Liao, Taiwan; Yuan-Pei Lin, Taiwan; Shoji Makino, Japan; Stephen Marshall, UK; C. Mecklenbräuker, Austria; Gloria Menegaz, Italy; Bernie Mulgrew, UK; King N. Ngan, Hong Kong; Douglas O’Shaughnessy, Canada; Antonio Ortega, USA; Montse Pardas, Spain; Wilfried Philips, Belgium; Vincent Poor, USA; Phillip Regalia, France; Markus Rupp, Austria; Hideaki Sakai, Japan; Bill Sandham, UK; Dirk Slock, France; Piet Sommen, The Netherlands; Dimitrios Tzovaras, Greece; Hugo Van hamme, Belgium; Jacques Verly, Belgium; Xiaodong Wang, USA; Douglas Williams, USA; Roger Woods, UK; Jar-Ferr Yang, Taiwan


    Contents

Tribute for Professor Alain Glavieux, Ramesh Pyndiah, Alex M. Haimovich, and Luc Vandendorpe, Volume 2005 (2005), Issue 6, Pages 757-757

Editorial, Luc Vandendorpe, Alex M. Haimovich, and Ramesh Pyndiah, Volume 2005 (2005), Issue 6, Pages 759-761

Iterative Decoding of Concatenated Codes: A Tutorial, Phillip A. Regalia, Volume 2005 (2005), Issue 6, Pages 762-774

Parallel and Serial Concatenated Single Parity Check Product Codes, David M. Rankin, T. Aaron Gulliver, and Desmond P. Taylor, Volume 2005 (2005), Issue 6, Pages 775-783

On Rate-Compatible Punctured Turbo Codes Design, Fulvio Babich, Guido Montorsi, and Francesca Vatta, Volume 2005 (2005), Issue 6, Pages 784-794

Convergence Analysis of Turbo Decoding of Serially Concatenated Block Codes and Product Codes, Amir Krause, Assaf Sella, and Yair Be'ery, Volume 2005 (2005), Issue 6, Pages 795-807

Design of Three-Dimensional Multiple Slice Turbo Codes, David Gnaedig, Emmanuel Boutillon, and Michel Jézéquel, Volume 2005 (2005), Issue 6, Pages 808-819

Improved Max-Log-MAP Turbo Decoding by Maximization of Mutual Information Transfer, Holger Claussen, Hamid Reza Karimi, and Bernard Mulgrew, Volume 2005 (2005), Issue 6, Pages 820-827

Trellis-Based Iterative Adaptive Blind Sequence Estimation for Uncoded/Coded Systems with Differential Precoding, Xiao-Ming Chen and Peter A. Hoeher, Volume 2005 (2005), Issue 6, Pages 828-843

System Performance of Concatenated STBC and Block Turbo Codes in Dispersive Fading Channels, Yinggang Du and Kam Tai Chan, Volume 2005 (2005), Issue 6, Pages 844-851

Turbo-per-Tone Equalization for ADSL Systems, Hilde Vanhaute and Marc Moonen, Volume 2005 (2005), Issue 6, Pages 852-860

Super-Orthogonal Space-Time Turbo Transmit Diversity for CDMA, Daniël J. van Wyk, Louis P. Linde, and Pieter G. W. van Rooyen, Volume 2005 (2005), Issue 6, Pages 861-871

Iterative PDF Estimation-Based Multiuser Diversity Detection and Channel Estimation with Unknown Interference, Nenad Veselinovic, Tad Matsumoto, and Markku Juntti, Volume 2005 (2005), Issue 6, Pages 872-882


An Iterative Multiuser Detector for Turbo-Coded DS-CDMA Systems, Emmanuel Oluremi Bejide and Fambirai Takawira, Volume 2005 (2005), Issue 6, Pages 883-891

Performance Evaluation of Linear Turbo-Receivers Using Analytical Extrinsic Information Transfer Functions, César Hermosilla and Leszek Szczeciński, Volume 2005 (2005), Issue 6, Pages 892-905

Joint Source-Channel Decoding of Variable-Length Codes with Soft Information: A Survey, Christine Guillemot and Pierre Siohan, Volume 2005 (2005), Issue 6, Pages 906-927

Iterative Source-Channel Decoding: Improved System Design Using EXIT Charts, Marc Adrat and Peter Vary, Volume 2005 (2005), Issue 6, Pages 928-941

LDGM Codes for Channel Coding and Joint Source-Channel Coding of Correlated Sources, Wei Zhong and Javier Garcia-Frias, Volume 2005 (2005), Issue 6, Pages 942-953

Iterative List Decoding of Concatenated Source-Channel Codes, Ahmadreza Hedayat and Aria Nosratinia, Volume 2005 (2005), Issue 6, Pages 954-960

An Efficient SF-ISF Approach for the Slepian-Wolf Source Coding Problem, Zhenyu Tu, Jing Li (Tiffany), and Rick S. Blum, Volume 2005 (2005), Issue 6, Pages 961-971

Carrier and Clock Recovery in (Turbo-)Coded Systems: Cramér-Rao Bound and Synchronizer Performance, N. Noels, H. Steendam, and M. Moeneclaey, Volume 2005 (2005), Issue 6, Pages 972-980

Iterative Code-Aided ML Phase Estimation and Phase Ambiguity Resolution, Henk Wymeersch and Marc Moeneclaey, Volume 2005 (2005), Issue 6, Pages 981-988


    EURASIP Journal on Applied Signal Processing 2005:6, 757–757

© 2005 Hindawi Publishing Corporation

    Tribute for Professor Alain Glavieux

We dedicate this special issue on “Turbo Processing” of the EURASIP Journal on Applied Signal Processing to Professor Alain Glavieux, who passed away on September 25th, 2004, at the age of 55. After graduating from ENST Paris, he joined ENST Bretagne in 1978, where he set up from scratch the teaching program in digital communications. In the mid 80s, he set up the Signal & Communication Research Laboratory at ENST Bretagne, before being promoted to Director of Industrial Relations in 1998 and Deputy Director in 2003. He created the TAMCIC Laboratory, affiliated to the CNRS (UMR 2872), in 2002 and was its Director. He chaired the First International Symposium on Turbo Codes in Brest in 1997 and was involved in the organization of the International Conference on Communications in Paris in 2004, where he served on the executive committee.

Among his numerous achievements, the most famous one will certainly be the invention of “Turbo Codes” with his colleague C. Berrou in the early 90s. This sparked enormous research activity worldwide, and this special issue is a typical illustration of the results of these activities. He and his colleague received many distinctions, among which the prestigious IEEE Hamming Medal in 2003. Alain Glavieux was also an exceptional teacher, and those who attended his lectures keep a very pleasant impression engraved in their memory. Beyond his excellent scientific capabilities, his pleasant personality, patience, and generosity contributed a lot to his excellent image within the community. He will always be remembered for his kindness and dedication to the well-being of all those around him.

We express our deep sympathy to his mother, his wife Marie-Louise, his daughter Christelle, his grandchildren, and his relatives. Good-bye to you, Alain; we all miss you a lot.

    Ramesh Pyndiah Alex M. Haimovich

    Luc Vandendorpe


    EURASIP Journal on Applied Signal Processing 2005:6, 759–761

© 2005 Hindawi Publishing Corporation

    Editorial

Luc Vandendorpe

Laboratoire de Télécommunications et Télédétection, Faculté des Sciences Appliquées, Université catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
Email: [email protected]

Alex M. Haimovich

New Jersey Center for Wireless Communications, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA
Email: [email protected]

Ramesh Pyndiah

Département Signal et Communications, École Nationale Supérieure des Télécommunications de Bretagne, Technopôle de Brest Iroise, CS 83818, 29238 Brest Cedex, France
Email: [email protected]

Turbo codes appeared in the early 90s. While the idea of iterative/turbo processing was first applied to decoding, it quite rapidly “gained” other blocks of the communication chain, leading to the nowadays well-known “turbo principle.”

When coded information is interleaved and gets transmitted over a channel with interference (intersymbol, interantenna, interuser, and combinations thereof), joint detection/decoding can be achieved, named turbo (joint) detection.

Yet another application of this principle is the exploitation of the residual information available at the output of a source coder. The exploitation of this redundancy, together with decoding, leads to joint source/channel decoding.

Finally, there have also been attempts to make the synchronization units benefit from the soft information delivered by the decoder. These approaches are called turbo synchronization.

The first group of papers deals with turbo codes and ways to improve their performance.

In “Iterative decoding of concatenated codes: A tutorial,” P. A. Regalia gives a tutorial on iterative decoding, presented as a tractable method to approach ML decoding and viewed as an alternating projection algorithm.

D. M. Rankin et al. in “Parallel and serial concatenated single parity-check product codes” provide bounds and simulation results on the performance of parallel and serially concatenated codes with single parity-check product codes as component codes. These codes provide a good tradeoff between complexity and performance.

In “On rate-compatible punctured turbo codes design,” F. Babich et al. give a low-complexity method to optimize the puncturing pattern for rate-compatible punctured turbo codes. BER simulation results are provided for puncturing patterns designed with this method and compared to the corresponding transfer function bound results.

In “Convergence analysis of turbo decoding of serially concatenated block codes and product codes,” authored by A. Krause et al., the stability of iterative decoding of serially concatenated codes, where the extrinsic information on parity-check bits is passed on from one decoder to the other, is analyzed. The authors show that in some cases, a restraining factor on the extrinsic information is vital to guarantee the stability of the iterative decoding process. Results of the stability analysis are confirmed by simulation results.

In “Design of three-dimensional multiple slice turbo codes,” D. Gnaedig et al. extend an idea they suggested in an earlier publication of introducing parallelism in the turbo decoding. They apply this parallel implementation to a turbo code architecture with three component encoders. They show that this approach leads to lower hardware complexity and better performance in terms of a lower error floor.

In “Improved Max-Log-MAP turbo decoding by maximization of mutual information transfer,” H. Claussen et al. suggest improving the performance of a turbo decoder by maximizing the transfer of mutual information between the component decoders. The improvement in performance is achieved by using optimized iteration-dependent correction weights to scale the a priori information at the input of each component decoder.


A different approach to reducing the complexity of turbo decoding is taken by X.-M. Chen and P. A. Hoeher in “Trellis-based iterative adaptive blind sequence estimation for uncoded/coded systems with differential precoding,” where the authors develop iterative, adaptive trellis-based blind sequence estimators based on joint maximum-likelihood (ML) data/channel estimation. The number of states in the trellis serves as a design parameter, providing a tradeoff between performance and complexity.

The application of turbo codes to space-time coding is investigated in “System performance of concatenated STBC and block turbo codes in dispersive fading channels” by Y. Du and K. T. Chan. The authors demonstrate that the concatenation of a block turbo code and a space-time turbo code confers on the combined code both high coding gain and diversity gain.

The second group of papers is related to the general topic of turbo detection.

The application of turbo coding to equalization is studied by H. Vanhaute and M. Moonen in “Turbo-per-tone equalization for ADSL systems.” Here, the authors propose and demonstrate the benefits of a frequency-domain turbo equalizer.

D. J. van Wyk et al. in “Super-orthogonal space-time turbo transmit diversity for CDMA” investigate the concept of layered super-orthogonal turbo-transmit diversity (SOTTD) for downlink DS-CDMA systems using multiple transmit and single receive antennas. Theoretical and simulation results show that this scheme outperforms classical code-division transmit diversity using turbo codes.

In “Iterative PDF estimation-based multiuser diversity detection and channel estimation with unknown interference,” N. Veselinovic et al. propose a kernel-smoothing PDF estimation of unknown cochannel interference to improve multiuser MMSE detectors with multiple receive antennas. This estimation can be performed using training symbols and can also be improved using feedback from the channel decoder. Simulation results are provided on frequency-selective channels.

The paper “An iterative multiuser detector for turbo-coded DS-CDMA systems,” by E. O. Bejide and F. Takawira, proposes an iterative multiuser detector for turbo-coded synchronous and asynchronous DS-CDMA systems. The approach proposed here is to estimate the multiple-access interference, but instead of performing (soft) interference cancellation, the estimated interference is used as added information in the MAP estimation of the bit of interest.

C. Hermosilla and L. Szczeciński in “Performance evaluation of linear turbo receivers using analytical extrinsic information transfer functions” investigate the performance analysis of turbo receivers with a linear front end. The method is based on EXIT charts obtained using only available channel state information and is hence called analytical. At each iteration, the BER can be obtained.

The third group of papers is devoted to the use of the turbo principle to perform source decoding.

The paper “Joint source-channel decoding of variable-length codes with soft information: A survey,” written by C. Guillemot and P. Siohan, is an overview paper about the joint source-channel decoding of variable-length codes with soft information. Recent theoretical and practical advances in this area are reported.

Turbo joint source-channel decoding is considered in “Iterative source-channel decoding: Improved system design using EXIT charts” by M. Adrat and P. Vary. The EXIT chart representation is used to improve the error correcting/concealing capabilities of iterative source-channel decoding schemes. New design guidelines are proposed to select appropriate bit mappings and to design the channel coding component.

In “LDGM codes for channel coding and joint source-channel coding of correlated sources,” W. Zhong and J. Garcia-Frias propose to use low-density generator matrix (LDGM) codes. These codes offer a complexity advantage thanks to the sparseness of the encoding matrix. They are considered for the purpose of coding over a variety of channels, and for joint source-channel coding of correlated sources.

The paper “Iterative list decoding of concatenated source-channel codes” by A. Hedayat and A. Nosratinia focuses on the use of residual redundancy of variable-length codes for joint source-channel decoding. Improvement is obtained by using iterative list decoding, made possible thanks to a nonbinary outer CRC code.

Z. Tu et al. describe an efficient method to build the syndrome former and inverse syndrome former for parallel and serially concatenated convolutional codes in “An efficient SF-ISF approach for the Slepian-Wolf source coding problem.” This opens the way to the use of powerful turbo codes designed for forward-error correction for solving the Slepian-Wolf source coding problem. Simulation results show compression rates very close to the theoretical limit.

The final group of papers is related to the topic of soft-information-driven parameter estimation.

As many coded systems operate at very low signal-to-noise ratios, synchronization is a difficult task. The theoretical aspects of the synchronization problem are studied in “Carrier and clock recovery in (turbo-) coded systems: Cramér-Rao bound and synchronizer performance” by N. Noels et al., where the Cramér-Rao bound (CRB) for joint carrier phase, carrier frequency, and timing estimation is derived from a noisy linearly modulated signal with encoded data symbols. On the practical side, H. Wymeersch and M. Moeneclaey in “Iterative code-aided ML phase estimation and phase ambiguity resolution” propose several iterative ML algorithms for joint carrier phase estimation and ambiguity resolution.

We wish all the readers a very exciting “special issue” that we believe is highly representative of the different trends currently observed in this research area.

    Luc Vandendorpe Alex M. Haimovich

    Ramesh Pyndiah


Luc Vandendorpe was born in Mouscron, Belgium, in 1962. He received the Electrical Engineering degree (summa cum laude) and the Ph.D. degree from the Université catholique de Louvain (UCL), Louvain-la-Neuve, Belgium, in 1985 and 1991, respectively. Since 1985, L. Vandendorpe has been with the Communications and Remote Sensing Laboratory of UCL. In 1992, he was a Research Fellow at the Delft Technical University. From 1992 to 1997, L. Vandendorpe was a Senior Research Associate of the Belgian NSF at UCL. Presently, he is a Professor. He is mainly interested in digital communication systems: equalization, joint detection/synchronization for CDMA, OFDM (multicarrier), MIMO and turbo-based communications systems, and joint source/channel (de)coding. In 1990, he was corecipient of the Biennial Alcatel-Bell Award. In 2000, he was corecipient of the Biennial Siemens Award. L. Vandendorpe is or has been a TPC Member for IEEE VTC Fall 1999, IEEE Globecom 2003 Communications Theory Symposium, the 2003 Turbo Symposium, IEEE VTC Fall 2003, and IEEE SPAWC 2005. He is Cotechnical Chair (with P. Duhamel) for IEEE ICASSP 2006. He is an Associate Editor of the IEEE Transactions on Wireless Communications, Associate Editor of the IEEE Transactions on Signal Processing, and a Member of the Signal Processing Committee for Communications.

Alex M. Haimovich is a Professor of electrical and computer engineering at the New Jersey Institute of Technology (NJIT). He recently served as the Director of the New Jersey Center for Wireless Telecommunications, a state-funded consortium consisting of NJIT, Princeton University, Rutgers University, and Stevens Institute of Technology. He has been at NJIT since 1992. Prior to that, he served as the Chief Scientist of JJM Systems from 1990 until 1992. From 1983 till 1990, he worked in a variety of capacities, up to Senior Staff Consultant, for AEL Industries. He received the Ph.D. degree in systems from the University of Pennsylvania in 1989, the M.S. degree in electrical engineering from Drexel University in 1983, and the B.S. degree in electrical engineering from the Technion, Israel, in 1977. His research interests include MIMO systems, array processing for wireless, turbo coding, space-time coding, ultra-wideband systems, and radar. He recently served as a Chair of the Communication Theory Symposium at Globecom 2003. He is currently an Associate Editor for the IEEE Communications Letters.

Ramesh Pyndiah qualified as an Electronics Engineer from “ENST Bretagne” in 1985. In 1994, he received his Ph.D. degree in electronics engineering from “l’Université de Bretagne Occidentale” and, in 1999, his HDR (Habilitation à Diriger des Recherches) from “Université de Rennes I.” From 1985 to 1990, he was a Senior Research Engineer at the Philips Research Laboratory (LEP) in France, where he was involved in the design of monolithic microwave integrated circuits (MMIC) for digital radio links. In October 1991, he joined the Signal & Communications Department of “ENST Bretagne,” where he developed the concept of block turbo codes. Since 1998, he has been the Head of the Signal & Communications Department. He has published more than fifty papers and holds more than ten patents. His current research interests are modulation, channel coding (turbo codes), joint source-channel coding, space-division multiplexing, and space-time coding. He received the Blondel Medal from SEE, France, in 2001. He is a Senior Member of IEEE and has been the IEEE ComSoc France Chapter Chair since 2001. He has been involved in several conference TPCs (Globecom, ICC, ISTC, ECWT, etc.) and was on the executive organization committee of ICC 2004 in Paris.


    EURASIP Journal on Applied Signal Processing 2005:6, 762–774

© 2005 Phillip A. Regalia

    Iterative Decoding of Concatenated Codes: A Tutorial

    Phillip A. Regalia

Département Communications, Images et Traitement de l’Information, Institut National des Télécommunications, 91011 Evry Cedex, France

Department of Electrical Engineering and Computer Science, Catholic University of America, Washington, DC 20064, USA
Email: [email protected]

    Received 29 September 2003; Revised 1 June 2004

The turbo decoding algorithm of a decade ago constituted a milestone in error-correction coding for digital communications, and has inspired extensions to generalized receiver topologies, including turbo equalization, turbo synchronization, and turbo CDMA, among others. Despite an accrued understanding of iterative decoding over the years, the “turbo principle” remains elusive to master analytically, thereby inciting interest from researchers outside the communications domain. In this spirit, we develop a tutorial presentation of iterative decoding for parallel and serial concatenated codes, in terms hopefully accessible to a broader audience. We motivate iterative decoding as a computationally tractable attempt to approach maximum-likelihood decoding, and characterize fixed points in terms of a “consensus” property between constituent decoders. We review how the decoding algorithm for both parallel and serial concatenated codes coincides with an alternating projection algorithm, which allows one to identify conditions under which the algorithm indeed converges to a maximum-likelihood solution, in terms of particular likelihood functions factoring into the product of their marginals. The presentation emphasizes a common framework applicable to both parallel and serial concatenated codes.

    Keywords and phrases: iterative decoding, maximum-likelihood decoding, information geometry, belief propagation.

    1. INTRODUCTION

The advent of the turbo decoding algorithm for parallel concatenated codes a decade ago [1] ranks among the most significant breakthroughs in modern communications in the past half century: a coding and decoding procedure of reasonable computational complexity was finally at hand, offering performance approaching the previously elusive Shannon limit, which predicts reliable communications for all channel capacity rates slightly in excess of the source entropy rate. The practical success of the iterative turbo decoding algorithm has inspired its adaptation to other code classes, notably serially concatenated codes [2, 3], and has rekindled interest [4, 5] in low-density parity-check codes [6], which give the definitive historical precedent in iterative decoding.

The serial concatenated configuration holds particular interest for communication systems, since the “inner encoder” of such a configuration can be given more general interpretations, such as a “parasitic” encoder induced by a convolutional channel or by the spreading codes used in CDMA. The corresponding iterative decoding algorithm can then be extended into new arenas, giving rise to turbo equalization [7, 8, 9] or turbo CDMA [10, 11], among doubtless other possibilities. Such applications demonstrate the power of iterative techniques which aim to jointly optimize receiver components, compared to the traditional approach of adapting such components independently of one another.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The turbo decoding algorithm for error-correction codes is known not to converge, in general, to a maximum-likelihood solution, although in practice it is usually observed to give comparable performance [12, 13, 14]. The quest to understand the convergence behavior has spawned numerous inroads, including extrinsic information transfer (or EXIT) charts [15], density evolution of intermediate quantities [16, 17], phase trajectory techniques [18], Gaussian approximations which simplify the analysis [19], and cross-entropy minimization [20], to name a few. Some of these analysis techniques have been applied with success to other configurations, such as turbo equalization [21, 22]. Connections to the belief propagation algorithm [23] have also been identified [24], which approach in turn is closely linked to earlier work on graph-theoretic methods [25, 26, 27, 28]. In this context, the turbo decoding algorithm gives rise to a directed graph having cycles; the belief propagation algorithm is known to converge provided no cycles appear in the directed graph, although less can be said in general once cycles appear.


Interest in turbo decoding and related topics now extends beyond the communications community, and has been met with useful insights from other fields; some references in this direction include [29], which draws on nonlinear system analysis, [30], which draws on computer science, in addition to [31] (predating turbo codes) and [32] (more recent), which inject ideas from statistical physics, which in turn can be rephrased in terms of information geometry [33, 34]. Despite this impressive pedigree of analysis techniques, the “turbo principle” remains difficult to master analytically and, given its fair share of specialized terminology if not a certain degree of mystique, is often perceived as difficult to grasp for the nonspecialist. In this spirit, the aim of this paper is to provide a reasonably self-contained and tutorial development of iterative decoding for parallel and serial concatenated codes, in terms hopefully accessible to a broader audience. The paper does not aim at a comprehensive survey of available analysis techniques and implementation tricks surrounding iterative decoding (for which the texts [12, 13, 14] would be more appropriate), but rather chooses a particular vantage point which steers clear of unnecessary sophistication and avoids approximations.

We begin in Section 2 by reviewing optimum (maximum a posteriori and maximum-likelihood) decoding of parallel concatenated codes. We motivate the turbo decoding algorithm as a computationally tractable attempt to approach maximum-likelihood decoding. A characterization of fixed points is obtained in terms of a “consensus” property between the two constituent decoders, and a simple proof of the existence of fixed points is obtained as an application of the Brouwer fixed point theorem.

Section 3 then reexamines the calculation of marginal distributions in terms of a projection operator, leading to a compact formulation of the turbo decoding algorithm as an alternating projection algorithm. The material of the section aims at a concrete transcription of ideas originally developed by Richardson [29]; we include in addition a minimum-distance property of the projector in terms of the Kullback-Leibler divergence, and review how the turbo decoding algorithm indeed converges to a maximum-likelihood solution whenever specific likelihood functions factor into the product of their marginals. The factorization is known [18] to hold in extreme signal-to-noise ratios.

Section 4 shows that the iterative decoding algorithm for serial concatenated codes also admits an alternating projection interpretation, allowing us to transcribe all results for parallel concatenated codes to their serial concatenated counterparts. This should also facilitate unified studies of both code classes. Concluding remarks are summarized in Section 5.

2. TURBO DECODING OF PARALLEL CONCATENATED CODES

We begin by reviewing the classical turbo decoding algorithm for parallel concatenated codes. For simplicity, we restrict our development to the binary signaling case; the m-ary case can be handled by direct extension (see, e.g., [24] for a particularly clear treatment) or by mapping the m-ary constellation back to its binary origins.

[Figure 1: Parallel concatenated encoder structure.]

[Figure 2: Particular realization of the second encoder by using the first encoder with an interleaver.]

To begin, a binary (0 or 1) information block ξ = (ξ_1, ..., ξ_k) is passed through two constituent encoders, as in Figure 1, to create two codewords:

(ξ_1, ..., ξ_k, η_1, ..., η_{n−k}),
(ξ_1, ..., ξ_k, ζ_1, ..., ζ_{n−k}).   (1)

Both encoders are systematic and of rate k/n, so that the information bits ξ_1, ..., ξ_k are directly available in either codeword. Note also that the two encoders need not share a common rate, although we will adhere to this case for ease of notation.

In practice, an expedient method of realizing the second systematic encoder is to permute (or interleave) the information bits ξ_i and duplicate the first encoder, as in Figure 2. Since this is a particular instance of Figure 1, we will simply consider two separate encodings of ξ = (ξ_1, ..., ξ_k) in what follows and avoid explicit reference to the interleaving operation, despite its importance in the study of the distance properties of concatenated codes [35].
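The permute-and-reuse construction of Figure 2 can be sketched in a few lines. This is a toy illustration under our own assumptions: a rate-k/(k+1) systematic single parity-check encoder stands in for a real constituent code, and the names `spc_encode` and `parallel_concatenate` are ours, not the paper's.

```python
import random

def spc_encode(bits):
    """Systematic single parity-check encoder: appends one parity bit."""
    return list(bits) + [sum(bits) % 2]

def parallel_concatenate(bits, perm):
    """Encode the bits, then permute them and reuse the same encoder."""
    cw1 = spc_encode(bits)                      # (xi_1..xi_k, eta_1..eta_{n-k})
    cw2 = spc_encode([bits[i] for i in perm])   # second encoding of permuted bits
    return cw1, cw2

k = 8
xi = [random.randint(0, 1) for _ in range(k)]
perm = list(range(k))
random.shuffle(perm)
cw1, cw2 = parallel_concatenate(xi, perm)
# Both codewords are systematic: the information bits appear directly.
assert cw1[:k] == xi
```

Note that a single-parity constituent code is far too weak to show turbo gains; it only makes the systematic structure of equation (1) concrete.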

The encoder outputs are converted to antipodal signaling (±1) and transmitted over a channel containing additive noise, giving the received signals x_i, y_i, and z_i:

x_i = (2ξ_i − 1) + b_{x,i},   i = 1, 2, ..., k;
y_i = (2η_i − 1) + b_{y,i},   i = 1, 2, ..., n − k;
z_i = (2ζ_i − 1) + b_{z,i},   i = 1, 2, ..., n − k.   (2)

    We assume that the noise samples bx ,i, b y ,i, and bz ,i are Gaus-sian and mutually independent, sharing a common vari-ance σ 2. For notational convenience, we arrange the received

  • 8/19/2019 Mutual Information LLR

    15/239

    764 EURASIP Journal on Applied Signal Processing

    signals into the vectors

    x =

    x 1...x k

    ,   y =

     y 1...

     y n−k

    ,   z =

    z 1...

    z n−k

    .   (3)
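As a concrete illustration of the signal model (2), the following Python sketch generates one noisy received block; the single-parity-check encoder is our own toy choice for illustration, not one of the paper's constituent codes.

```python
import random

random.seed(0)
k, n, sigma = 4, 5, 0.8

xi = [random.randint(0, 1) for _ in range(k)]   # information bits (0 or 1)
eta = [sum(xi) % 2] * (n - k)                   # toy parity bits (illustrative only)

# Antipodal signaling (0 -> -1, 1 -> +1) plus white Gaussian noise, as in (2)
x = [(2 * b - 1) + random.gauss(0.0, sigma) for b in xi]
y = [(2 * b - 1) + random.gauss(0.0, sigma) for b in eta]
```

Hard decisions on $\mathbf{x}$ alone would simply threshold each sample at zero; the decoding rules developed below instead pool the parity observations as well.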

2.1. Optimum decoding

The maximum a posteriori decoding rule aims to calculate the a posteriori probability ratios
$$\frac{\Pr\left[\xi_i = 1 \mid \mathbf{x},\mathbf{y},\mathbf{z}\right]}{\Pr\left[\xi_i = 0 \mid \mathbf{x},\mathbf{y},\mathbf{z}\right]}, \quad i = 1,2,\dots,k, \tag{4}$$
with the decision rule favoring a 1 for the $i$th bit if this ratio is greater than one, and 0 if the ratio is less than one. By using Bayes's rule, each ratio can be developed as
$$\frac{\Pr\left[\xi_i = 1 \mid \mathbf{x},\mathbf{y},\mathbf{z}\right]}{\Pr\left[\xi_i = 0 \mid \mathbf{x},\mathbf{y},\mathbf{z}\right]} = \frac{\sum_{\xi:\,\xi_i=1} \Pr(\xi \mid \mathbf{x},\mathbf{y},\mathbf{z})}{\sum_{\xi:\,\xi_i=0} \Pr(\xi \mid \mathbf{x},\mathbf{y},\mathbf{z})} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{x},\mathbf{y},\mathbf{z} \mid \xi)\Pr(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{x},\mathbf{y},\mathbf{z} \mid \xi)\Pr(\xi)}, \tag{5}$$
involving the a priori probability mass function $\Pr(\xi)$ and the likelihood function $p(\mathbf{x},\mathbf{y},\mathbf{z}\mid\xi)$, which is evaluated for the received $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ as a function of the candidate information bits $\xi = (\xi_1,\dots,\xi_k)$; the sum in the numerator (resp., denominator) is over all the configurations of the vector $\xi$ for which the $i$th bit is a "1" (resp., "0"). Since the noise samples are assumed independent, the likelihood function naturally factors as

$$p(\mathbf{x},\mathbf{y},\mathbf{z}\mid\xi) = p(\mathbf{x}\mid\xi)\, p(\mathbf{y}\mid\xi)\, p(\mathbf{z}\mid\xi). \tag{6}$$
For the Gaussian noise case considered here, the three likelihood evaluations appear as
$$p(\mathbf{x}\mid\xi) \sim \exp\!\left(-\frac{\left\|\mathbf{x} - c_x(\xi)\right\|^2}{2\sigma^2}\right), \quad p(\mathbf{y}\mid\xi) \sim \exp\!\left(-\frac{\left\|\mathbf{y} - c_y(\xi)\right\|^2}{2\sigma^2}\right), \quad p(\mathbf{z}\mid\xi) \sim \exp\!\left(-\frac{\left\|\mathbf{z} - c_z(\xi)\right\|^2}{2\sigma^2}\right), \tag{7}$$
where $c_x(\xi)$, $c_y(\xi)$, and $c_z(\xi)$ contain the antipodal symbols $\pm 1$ which would be received as a function of the candidate information bits $\xi$, in the absence of noise. For non-Gaussian noise, the likelihood functions would, of course, assume different forms.

The a posteriori probability ratios may therefore be written as
$$\frac{\Pr\left[\xi_i=1\mid\mathbf{x},\mathbf{y},\mathbf{z}\right]}{\Pr\left[\xi_i=0\mid\mathbf{x},\mathbf{y},\mathbf{z}\right]} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{x}\mid\xi)\,p(\mathbf{y}\mid\xi)\,p(\mathbf{z}\mid\xi)\Pr(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{x}\mid\xi)\,p(\mathbf{y}\mid\xi)\,p(\mathbf{z}\mid\xi)\Pr(\xi)}, \quad i=1,2,\dots,k. \tag{8}$$
If the a priori probability mass function $\Pr(\xi)$ is uniform (i.e., $\Pr(\xi) = 1/2^k$ for all $\xi$), then this reduces to the maximum-likelihood decision metric:
$$\frac{\Pr\left[\xi_i=1\mid\mathbf{x},\mathbf{y},\mathbf{z}\right]}{\Pr\left[\xi_i=0\mid\mathbf{x},\mathbf{y},\mathbf{z}\right]} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{x}\mid\xi)\,p(\mathbf{y}\mid\xi)\,p(\mathbf{z}\mid\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{x}\mid\xi)\,p(\mathbf{y}\mid\xi)\,p(\mathbf{z}\mid\xi)} \quad \text{if } \Pr(\xi) \text{ is uniform}. \tag{9}$$
If this expression were evaluated as written, the complexity of an optimum decision rule would be $O(2^k)$, since there are $2^k$ configurations of the $k$ information bits comprising $\xi$, leading to as many likelihood function evaluations. This clearly becomes impractical for sizable $k$.
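To make the $O(2^k)$ cost tangible, here is a brute-force evaluation of the ratio (9) under a uniform prior. The single-parity-bit constituent "encoders" and noiseless reception are our own illustrative simplifications, not assumptions of the paper.

```python
import math
from itertools import product

def gauss_like(r, bits, sigma):
    """Likelihood p(r | bits) up to a constant, antipodal signaling as in (2)."""
    return math.exp(-sum((ri - (2 * b - 1)) ** 2 for ri, b in zip(r, bits))
                    / (2 * sigma ** 2))

def map_ratios(x, y, z, enc1, enc2, k, sigma):
    """A posteriori ratios (9) under a uniform prior, by exhaustive O(2^k) search."""
    num, den = [0.0] * k, [0.0] * k
    for xi in product((0, 1), repeat=k):
        like = (gauss_like(x, xi, sigma)
                * gauss_like(y, enc1(xi), sigma)
                * gauss_like(z, enc2(xi), sigma))
        for i, b in enumerate(xi):
            (num if b else den)[i] += like
    return [a / b for a, b in zip(num, den)]

# Toy single-parity-bit "encoders" (our own illustrative choice)
enc1 = lambda bits: (sum(bits) % 2,)
enc2 = lambda bits: ((bits[0] + bits[1]) % 2,)

k, sigma = 3, 0.5
xi_true = (1, 0, 1)
x = [2 * b - 1 for b in xi_true]        # noiseless reception, for clarity
y = [2 * b - 1 for b in enc1(xi_true)]
z = [2 * b - 1 for b in enc2(xi_true)]
ratios = map_ratios(x, y, z, enc1, enc2, k, sigma)
decoded = [1 if r > 1 else 0 for r in ratios]   # decision rule from Section 2.1
```

Every additional information bit doubles the number of likelihood evaluations, which is precisely what motivates the reduced-complexity iterative decoders that follow.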

Observe now that if we instead consider an optimum decoding rule using only one of the constituent encoders, we may write, by a development parallel to that above,
$$\frac{\Pr\left[\xi_i=1\mid\mathbf{x},\mathbf{y}\right]}{\Pr\left[\xi_i=0\mid\mathbf{x},\mathbf{y}\right]} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{x}\mid\xi)\,p(\mathbf{y}\mid\xi)\Pr(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{x}\mid\xi)\,p(\mathbf{y}\mid\xi)\Pr(\xi)}, \tag{10}$$
$$\frac{\Pr\left[\xi_i=1\mid\mathbf{x},\mathbf{z}\right]}{\Pr\left[\xi_i=0\mid\mathbf{x},\mathbf{z}\right]} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{x}\mid\xi)\,p(\mathbf{z}\mid\xi)\Pr(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{x}\mid\xi)\,p(\mathbf{z}\mid\xi)\Pr(\xi)}. \tag{11}$$
If each constituent encoder implements a trellis code, then $\mathbf{x}$ and $\mathbf{y}$ form a Markov chain, as do $\mathbf{x}$ and $\mathbf{z}$; the complexity of either decoding expression can then be reduced to $O(k)$ by using the forward-backward algorithm from [36] (which, in turn, is a particular case of the sum-product algorithm [27]).

If the a priori probability function $\Pr(\xi)$ is indeed uniform, then it weighs all terms in the numerator and denominator equally and, as such, is effectively relegated to an unused variable in either decoding expression (10) or (11). Rather than accepting this status, one can imagine replacing the a priori probability function $\Pr(\xi)$, or "usurping" its position, by some other function in an attempt to "bias" either decoding rule (10) or (11) towards the maximum-likelihood decoding rule in (9). In particular, if $\Pr(\xi)$ were replaced by $p(\mathbf{z}\mid\xi)$ in (10), or by $p(\mathbf{y}\mid\xi)$ in (11), then either expression would agree formally with (9).

In order to retain the $O(k)$ complexity of the forward-backward algorithm from [36], however, the a priori probability function $\Pr(\xi)$ is assumed to factor into the product of its bitwise marginals:
$$\Pr(\xi) = \Pr\left(\xi_1\right)\Pr\left(\xi_2\right)\cdots\Pr\left(\xi_k\right). \tag{12}$$
The likelihood function $p(\mathbf{y}\mid\xi)$ or $p(\mathbf{z}\mid\xi)$ does not, on the other hand, generally factor into its bitwise marginals, that is,
$$p(\mathbf{y}\mid\xi) \neq p\left(\mathbf{y}\mid\xi_1\right) p\left(\mathbf{y}\mid\xi_2\right)\cdots p\left(\mathbf{y}\mid\xi_k\right). \tag{13}$$
As such, a direct usurpation of the a priori probability by the likelihood function of the parity-check bits of the other constituent coder is not feasible. Rather, one must approximate the likelihood function $p(\mathbf{y}\mid\xi)$ or $p(\mathbf{z}\mid\xi)$ by a function that does factor into the product of its marginals. Many candidate approximations may be envisaged; that which has proved the most useful relies on extrinsic information values, which are reviewed next.

Iterative Decoding of Concatenated Codes: A Tutorial 765

2.2. Extrinsic information values

We reexamine the likelihood function for the systematic bits:
$$\begin{aligned} p(\mathbf{x}\mid\xi) &= \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{k} \exp\!\left(-\sum_{i=1}^{k} \frac{\left(x_i - \left(2\xi_i-1\right)\right)^2}{2\sigma^2}\right)\\ &= \prod_{i=1}^{k} \frac{\exp\!\left(-\left(x_i - \left(2\xi_i-1\right)\right)^2\!/\left(2\sigma^2\right)\right)}{\sqrt{2\pi}\,\sigma}\\ &= p\left(x_1\mid\xi_1\right) p\left(x_2\mid\xi_2\right)\cdots p\left(x_k\mid\xi_k\right). \end{aligned} \tag{14}$$
This shows that the likelihood function $p(\mathbf{x}\mid\xi)$ for the systematic bits factors into the product of its marginals,¹ just like the a priori probability mass function:
$$\Pr(\xi) = \Pr\left(\xi_1\right)\Pr\left(\xi_2\right)\cdots\Pr\left(\xi_k\right). \tag{15}$$
Owing to these factorizations, each term from the numerator of (10) contains a factor $p\left(x_i\mid\xi_i=1\right)\Pr\left(\xi_i=1\right)$, and each term from the denominator contains a factor $p\left(x_i\mid\xi_i=0\right)\Pr\left(\xi_i=0\right)$. By isolating these common factors, we may rewrite the ratio from (10) as
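The factorization (14) is easy to confirm numerically. The sketch below, using arbitrary illustrative samples of our own choosing, compares the joint Gaussian likelihood against the product of its per-bit marginals.

```python
import math

sigma = 0.6
x = [0.4, -1.1, 0.9]          # arbitrary received systematic samples (illustrative)
xi = (1, 0, 1)                # one candidate bit pattern

norm = 1.0 / (math.sqrt(2 * math.pi) * sigma)
marg = lambda sample, b: norm * math.exp(-(sample - (2 * b - 1)) ** 2 / (2 * sigma ** 2))

# Joint Gaussian likelihood, first line of (14)...
joint = norm ** len(x) * math.exp(
    -sum((s - (2 * b - 1)) ** 2 for s, b in zip(x, xi)) / (2 * sigma ** 2))
# ...and the product of per-bit marginals, last line of (14)
factored = math.prod(marg(s, b) for s, b in zip(x, xi))
```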

$$\frac{\Pr\left[\xi_i=1\mid\mathbf{x},\mathbf{y}\right]}{\Pr\left[\xi_i=0\mid\mathbf{x},\mathbf{y}\right]} = \underbrace{\frac{p\left(x_i\mid\xi_i=1\right)}{p\left(x_i\mid\xi_i=0\right)}}_{\text{intrinsic information}} \times \underbrace{\frac{\Pr\left(\xi_i=1\right)}{\Pr\left(\xi_i=0\right)}}_{\text{a priori information}} \times \underbrace{\frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{y}\mid\xi)\prod_{j\neq i} p\left(x_j\mid\xi_j\right)\Pr\left(\xi_j\right)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{y}\mid\xi)\prod_{j\neq i} p\left(x_j\mid\xi_j\right)\Pr\left(\xi_j\right)}}_{\text{extrinsic information}}. \tag{16}$$

The three terms on the right-hand side may be interpreted as follows:

(i) the first term indicates what the $i$th received bit $x_i$ contributes to the determination of the $i$th transmitted bit $\xi_i$; hence the name "intrinsic information." It coincides with the maximum-likelihood metric for determining the $i$th bit when no coding is used;

(ii) the second term expresses the a priori probability ratio for the $i$th bit, and will be usurped shortly;

(iii) the third term expresses what the remaining bits in the packet (i.e., of index $j \neq i$) contribute to the determination of the $i$th bit; hence the name "extrinsic information."

¹Although we show this factorization here for a Gaussian channel, the factorization holds, of course, for any memoryless channel model.

Let $T(\xi) = T_1\left(\xi_1\right) T_2\left(\xi_2\right)\cdots T_k\left(\xi_k\right)$ be a factorable probability mass function whose bitwise ratios are chosen to match the extrinsic information values above:
$$\frac{T_i\left(\xi_i=1\right)}{T_i\left(\xi_i=0\right)} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{y}\mid\xi)\prod_{j\neq i} p\left(x_j\mid\xi_j\right)\Pr\left(\xi_j\right)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{y}\mid\xi)\prod_{j\neq i} p\left(x_j\mid\xi_j\right)\Pr\left(\xi_j\right)}, \quad i=1,2,\dots,k. \tag{17}$$
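The decomposition (16), whose third factor defines the ratios (17), can be verified numerically on a toy code. The parity encoder, received samples, and priors below are arbitrary illustrative choices of ours.

```python
import math
from itertools import product

k, sigma = 3, 0.7
enc = lambda bits: (sum(bits) % 2,)           # toy parity encoder (illustrative)
x = [0.9, -1.2, 0.4]                          # arbitrary received systematic samples
y = [-0.8]                                    # arbitrary received parity sample
prior = [(0.6, 0.4), (0.5, 0.5), (0.3, 0.7)]  # (Pr(xi_j = 0), Pr(xi_j = 1))

g = lambda r, b: math.exp(-(r - (2 * b - 1)) ** 2 / (2 * sigma ** 2))

i = 0
post = [0.0, 0.0]   # sums for xi_i = 0 and 1, as in the ratio (10)
extr = [0.0, 0.0]   # extrinsic sums from (17)
for bits in product((0, 1), repeat=k):
    like = g(y[0], enc(bits)[0]) * math.prod(g(x[j], bits[j]) for j in range(k))
    post[bits[i]] += like * math.prod(prior[j][bits[j]] for j in range(k))
    extr[bits[i]] += g(y[0], enc(bits)[0]) * math.prod(
        g(x[j], bits[j]) * prior[j][bits[j]] for j in range(k) if j != i)

lhs = post[1] / post[0]                       # full a posteriori ratio (10)
rhs = (g(x[i], 1) / g(x[i], 0)) * (prior[i][1] / prior[i][0]) * (extr[1] / extr[0])
```

The two quantities agree exactly, confirming that the posterior ratio splits into intrinsic, a priori, and extrinsic factors as claimed.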

Since these values depend on the likelihood function $p(\mathbf{y}\mid\xi)$ (in addition to the systematic bits save for $x_i$), we may consider $T(\xi)$ a factorable function which approximates, in some sense, the likelihood function $p(\mathbf{y}\mid\xi)$. (We will see in Theorem 2 a condition under which this approximation becomes exact.) We now let $T(\xi)$ usurp the place reserved for the a priori probability function $\Pr(\xi)$ (denoted $\Pr(\xi) \leftarrow T(\xi)$) in the evaluation of the second decoder (11); since both $p(\mathbf{x}\mid\xi)$ and $T(\xi)$ factor into the product of their respective marginals, we have
$$\begin{aligned} \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{x}\mid\xi)\,p(\mathbf{z}\mid\xi)\Pr(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{x}\mid\xi)\,p(\mathbf{z}\mid\xi)\Pr(\xi)} &\longleftarrow \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{x}\mid\xi)\,p(\mathbf{z}\mid\xi)\,T(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{x}\mid\xi)\,p(\mathbf{z}\mid\xi)\,T(\xi)}\\ &= \underbrace{\frac{p\left(x_i\mid\xi_i=1\right)}{p\left(x_i\mid\xi_i=0\right)}}_{\text{intrinsic information}} \underbrace{\frac{T_i\left(\xi_i=1\right)}{T_i\left(\xi_i=0\right)}}_{\text{pseudoprior}} \times \underbrace{\frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{z}\mid\xi)\prod_{j\neq i} p\left(x_j\mid\xi_j\right) T_j\left(\xi_j\right)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{z}\mid\xi)\prod_{j\neq i} p\left(x_j\mid\xi_j\right) T_j\left(\xi_j\right)}}_{\text{extrinsic information}}. \end{aligned} \tag{18}$$

Here we adopt the term "pseudoprior" for $T(\xi)$ since it usurps the a priori probability function; similarly, the result of this substitution may be termed a "pseudoposterior" which usurps the true a posteriori probability ratio.

Let now $U(\xi) = U_1\left(\xi_1\right) U_2\left(\xi_2\right)\cdots U_k\left(\xi_k\right)$ denote another factorable probability function whose bitwise ratios match the extrinsic information values furnished by this second decoder:
$$\frac{U_i\left(\xi_i=1\right)}{U_i\left(\xi_i=0\right)} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{z}\mid\xi)\prod_{j\neq i} p\left(x_j\mid\xi_j\right) T_j\left(\xi_j\right)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{z}\mid\xi)\prod_{j\neq i} p\left(x_j\mid\xi_j\right) T_j\left(\xi_j\right)}, \quad i=1,2,\dots,k. \tag{19}$$

This function may then usurp the a priori probability values used in the first decoder, and the process iterates.

Figure 3: Flow graph of the turbo decoding algorithm. [Each decoder combines the systematic likelihood $p(\mathbf{x}\mid\xi)$ with its own parity-check likelihood, $p(\mathbf{y}\mid\xi)$ for the first decoder or $p(\mathbf{z}\mid\xi)$ for the second, together with the pseudoprior supplied by the other decoder; the extrinsic values $T^{(m)}$ and $U^{(m)}$ are exchanged through a delay element.]

If we let a superscript $(m)$ denote an iteration index, the coupling of the two decoders admits an external description as

$$\frac{p\left(x_i\mid\xi_i=1\right)}{p\left(x_i\mid\xi_i=0\right)}\, \frac{U_i^{(m)}(1)}{U_i^{(m)}(0)}\, \frac{T_i^{(m)}(1)}{T_i^{(m)}(0)} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{y}\mid\xi)\,p(\mathbf{x}\mid\xi)\,U^{(m)}(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{y}\mid\xi)\,p(\mathbf{x}\mid\xi)\,U^{(m)}(\xi)}, \tag{20}$$
$$\frac{p\left(x_i\mid\xi_i=1\right)}{p\left(x_i\mid\xi_i=0\right)}\, \frac{T_i^{(m)}(1)}{T_i^{(m)}(0)}\, \frac{U_i^{(m+1)}(1)}{U_i^{(m+1)}(0)} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{z}\mid\xi)\,p(\mathbf{x}\mid\xi)\,T^{(m)}(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{z}\mid\xi)\,p(\mathbf{x}\mid\xi)\,T^{(m)}(\xi)}, \tag{21}$$

in which (20) furnishes $T^{(m)}(\xi)$ and (21) furnishes $U^{(m+1)}(\xi)$. This is depicted in Figure 3. A fixed point corresponds to $U^{(m+1)}(\xi) = U^{(m)}(\xi)$ which, by inspection of the pseudoposteriors above, yields the following property.

Property 1. A fixed point is attained if and only if the two decoders yield the same pseudoposteriors (the left-hand sides of (20) and (21)) for $i = 1,2,\dots,k$.

    A fixed point is therefore reflected by a state of “consen-sus” between the two decoders [15, 29, 37].
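The coupled updates (20) and (21) can be exercised end-to-end on a toy code. The parity-bit encoders, noiseless observations, and iteration count below are our own illustrative choices, chosen so the loop converges cleanly.

```python
import math
from itertools import product

k, sigma = 3, 0.8
enc1 = lambda bits: (sum(bits) % 2,)            # toy constituent encoder 1 (illustrative)
enc2 = lambda bits: ((bits[0] + bits[1]) % 2,)  # toy constituent encoder 2 (illustrative)
g = lambda r, b: math.exp(-(r - (2 * b - 1)) ** 2 / (2 * sigma ** 2))

xi = (1, 0, 1)                                  # true information bits
x = [2 * b - 1 for b in xi]                     # noiseless reception, for a clean demo
y = [2 * b - 1 for b in enc1(xi)]
z = [2 * b - 1 for b in enc2(xi)]

def decode(parity, r_par, prior):
    """One decoder pass: pseudoposterior ratios (LHS of (20)) plus extrinsic ratios."""
    num, den = [0.0] * k, [0.0] * k
    for bits in product((0, 1), repeat=k):
        w = g(r_par[0], parity(bits)[0]) * math.prod(
            g(x[j], bits[j]) * prior[j][bits[j]] for j in range(k))
        for i, b in enumerate(bits):
            (num if b else den)[i] += w
    post = [num[i] / den[i] for i in range(k)]
    # divide out the intrinsic and pseudoprior ratios, leaving T_i(1)/T_i(0)
    return post, [post[i] / ((g(x[i], 1) / g(x[i], 0)) * (prior[i][1] / prior[i][0]))
                  for i in range(k)]

U = [(0.5, 0.5)] * k                            # uniform initial pseudopriors
for _ in range(5):
    post1, t = decode(enc1, y, U)               # update (20): furnishes T^(m)
    T = [(1 / (1 + ti), ti / (1 + ti)) for ti in t]
    post2, u = decode(enc2, z, T)               # update (21): furnishes U^(m+1)
    U = [(1 / (1 + ui), ui / (1 + ui)) for ui in u]
```

On this noiseless example both pseudoposterior vectors decide the bits $(1, 0, 1)$, the state of consensus described by Property 1.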

2.3. Existence of fixed points

A necessary (but not sufficient) condition for the algorithm to converge is that a fixed point exist, reflected by a state of consensus according to Property 1. A convenient tool in this direction is the Brouwer fixed point theorem [38], which asserts that any continuous map from a closed, bounded, and convex set into itself admits a fixed point; its application in the present context gives the following result [18, 29].

Theorem 1. The turbo decoding algorithm from (20) and (21) always admits a fixed point.

To verify, consider the pseudopriors $U_i^{(m)}\left(\xi_i\right)$ evaluated for $\xi_i = 1$, which, at any iteration $m$, are (pseudo-)probabilities lying between 0 and 1:
$$0 \leq U_i^{(m)}(1) \leq 1, \quad i = 1,2,\dots,k. \tag{22}$$
This clearly gives a closed, bounded, and convex set. Since the updated pseudopriors $U_i^{(m+1)}$ also lie in this set, and since the map from $U^{(m)}(\xi)$ to $U^{(m+1)}(\xi)$ is continuous [18, 29], the conditions of the Brouwer theorem are satisfied, to show existence of a fixed point.

    3. PROJECTIONS AND PRODUCT DISTRIBUTIONS

A key element of the development thus far concerns the calculation of bitwise marginal ratios which, according to [20], provide the troublesome element which accounts for the difference between a provably convergent algorithm [20] which is not practically implementable, and the implementable (but difficult to grasp) turbo decoding algorithm. We develop here an alternate viewpoint of the calculation of bitwise marginals in terms of a certain projection operator, adapted from the seminal work of Richardson [29].

Let $q(\xi)$ be a distribution, for example, a probability mass function, or a likelihood function, which assigns a nonnegative number to each of the $2^k$ evaluations of $\xi = (\xi_1,\dots,\xi_k)$. We let $\mathbf{q}$ be the vector built from these $2^k$ evaluations:
$$\mathbf{q} = \begin{bmatrix} q\left(\xi = (0,\dots,0,0)\right)\\ q\left(\xi = (0,\dots,0,1)\right)\\ \vdots\\ q\left(\xi = (1,\dots,1,1)\right) \end{bmatrix} \qquad (2^k \text{ evaluations}). \tag{23}$$

We assume that $\mathbf{q}$ is scaled such that its entries sum to one. The $k$ marginal distributions determined from $q(\xi)$, each having two evaluations at $\xi_i = 0$ and $\xi_i = 1$ ($1 \leq i \leq k$), are given by
$$\begin{aligned} q_1\left(\xi_1=0\right) &= \sum_{\xi:\,\xi_1=0} q(\xi), & q_1\left(\xi_1=1\right) &= \sum_{\xi:\,\xi_1=1} q(\xi),\\ q_2\left(\xi_2=0\right) &= \sum_{\xi:\,\xi_2=0} q(\xi), & q_2\left(\xi_2=1\right) &= \sum_{\xi:\,\xi_2=1} q(\xi),\\ &\;\;\vdots & &\;\;\vdots\\ q_k\left(\xi_k=0\right) &= \sum_{\xi:\,\xi_k=0} q(\xi), & q_k\left(\xi_k=1\right) &= \sum_{\xi:\,\xi_k=1} q(\xi). \end{aligned} \tag{24}$$

Definition 1. The distribution $q(\xi)$ is a product distribution if it coincides with the product of its marginals:
$$q(\xi) = q_1\left(\xi_1\right) q_2\left(\xi_2\right)\cdots q_k\left(\xi_k\right). \tag{25}$$
The set of all product distributions is denoted by $\mathcal{P}$.

It is straightforward to check that $q(\xi) \in \mathcal{P}$ if and only if its vector representation is Kronecker decomposable as
$$\mathbf{q} = \mathbf{q}_1 \otimes \mathbf{q}_2 \otimes\cdots\otimes \mathbf{q}_k \tag{26}$$
with
$$\mathbf{q}_i = \begin{bmatrix} q_i\left(\xi_i=0\right)\\ q_i\left(\xi_i=1\right) \end{bmatrix}, \quad i = 1,2,\dots,k. \tag{27}$$
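The Kronecker construction (26)-(27) is easy to check numerically; the marginals below are arbitrary illustrative numbers.

```python
from itertools import product

def kron(a, b):
    """Kronecker product of two vectors, as in (26)."""
    return [ai * bi for ai in a for bi in b]

# Bitwise marginals q_i stacked as in (27): [q_i(0), q_i(1)]
q1, q2, q3 = [0.3, 0.7], [0.5, 0.5], [0.9, 0.1]

# Product distribution assembled directly from (25), configurations ordered as in (23)
q_direct = [q1[b1] * q2[b2] * q3[b3] for b1, b2, b3 in product((0, 1), repeat=3)]

# The same vector via the Kronecker construction (26)
q_kron = kron(kron(q1, q2), q3)
```

Note that the ordering of configurations in (23), with the last bit varying fastest, is exactly the ordering produced by the nested Kronecker products.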


We note also that $\mathcal{P}$ is closed under multiplication: if $q(\xi)$ and $r(\xi)$ belong to $\mathcal{P}$, so does their product:
$$s(\xi) = \alpha\, q(\xi)\, r(\xi) \in \mathcal{P}, \tag{28}$$
where the scalar $\alpha$ is chosen so that the evaluations of $s(\xi)$ sum to one. This operation can be expressed in vector notation using the Hadamard (or term-by-term) product:
$$\mathbf{s} = \alpha\, \mathbf{q} \odot \mathbf{r}. \tag{29}$$
To simplify notations, the scalar $\alpha$ will not be explicitly indicated, with the tacit understanding that the elements of the vector must be scaled to sum to one; we will henceforth write $\mathbf{s} = \mathbf{q} \odot \mathbf{r}$, omitting explicit mention of the scale factor $\alpha$.

Suppose now $r(\xi)$ is not a product distribution. If $r_1\left(\xi_1\right),\dots,r_k\left(\xi_k\right)$ denote its marginal distributions, then we can set
$$q(\xi) = r_1\left(\xi_1\right) r_2\left(\xi_2\right)\cdots r_k\left(\xi_k\right), \tag{30}$$
to create a product distribution $q(\xi) \in \mathcal{P}$ which, by construction, generates the same marginals as $r(\xi)$:
$$q_i\left(\xi_i\right) = r_i\left(\xi_i\right), \quad i = 1,2,\dots,k. \tag{31}$$
This operation will be denoted by
$$\mathbf{q} = \pi(\mathbf{r}). \tag{32}$$
We can observe that $\mathbf{q}$ is a product distribution ($\mathbf{q} \in \mathcal{P}$) if and only if $\pi(\mathbf{q}) = \mathbf{q}$, and since $\pi(\mathbf{r}) \in \mathcal{P}$ for any distribution $\mathbf{r}$, we must have $\pi(\pi(\mathbf{r})) = \pi(\mathbf{r})$, so that $\pi(\cdot)$ is a projection operator.

Definition 2. The distribution $\mathbf{q}$ is the projection of $\mathbf{r}$ into $\mathcal{P}$ if (i) $\mathbf{q} \in \mathcal{P}$ and (ii) $q_i\left(\xi_i\right) = r_i\left(\xi_i\right)$.
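The projection (30)-(32) reduces to a few lines of code; the sketch below (with an arbitrary illustrative distribution) also confirms the idempotence property $\pi(\pi(\mathbf{r})) = \pi(\mathbf{r})$.

```python
from itertools import product

def marginals(q, k):
    """Bitwise marginals (24) of a distribution over all 2^k bit patterns."""
    m = [[0.0, 0.0] for _ in range(k)]
    for value, bits in zip(q, product((0, 1), repeat=k)):
        for i, b in enumerate(bits):
            m[i][b] += value
    return m

def project(r, k):
    """pi(r): the product distribution (30) built from the marginals of r."""
    m = marginals(r, k)
    q = []
    for bits in product((0, 1), repeat=k):
        v = 1.0
        for i, b in enumerate(bits):
            v *= m[i][b]
        q.append(v)
    return q

k = 3
r = [0.05, 0.10, 0.02, 0.08, 0.20, 0.15, 0.25, 0.15]  # arbitrary distribution (sums to 1)
q = project(r, k)
```

By construction `q` reproduces the marginals of `r`, and projecting a second time leaves `q` unchanged.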

The following section details some simple information-theoretic properties which reinforce the interpretation as a projection.

    3.1. Information-theoretic properties of the projector 

The results summarized in this section may be understood as concrete transcriptions of ultimately deeper results from the field of information geometry [33, 34]. To begin, we recall that the entropy of a distribution $r(\xi)$ is defined as [39]
$$H(r) = -\sum_{\xi} r(\xi)\log_2 r(\xi), \tag{33}$$
involving the sum over all $2^k$ configurations of the vector $\xi = (\xi_1,\dots,\xi_k)$. A basic result of information theory asserts that the entropy of any joint distribution is upper bounded by the sum of the entropies of its marginal distributions [39], that is,
$$H(r) \leq \sum_{i=1}^{k} H\left(r_i\right) = -\sum_{i=1}^{k}\sum_{\xi_i=0}^{1} r_i\left(\xi_i\right)\log_2 r_i\left(\xi_i\right), \tag{34}$$

with equality if and only if $r(\xi)$ factors into the product of its marginals [$r(\xi) \in \mathcal{P}$]. Therefore, if $r \notin \mathcal{P}$, then by setting $\mathbf{q} = \pi(\mathbf{r})$, we have
$$H(r) \leq \sum_{i=1}^{k} H\left(r_i\right) = \sum_{i=1}^{k} H\left(q_i\right) = H(q), \tag{35}$$
because $q_i\left(\xi_i\right) = r_i\left(\xi_i\right)$ and $q(\xi) \in \mathcal{P}$. This shows that the projection $\mathbf{q} = \pi(\mathbf{r})$ maximizes the entropy over all distributions that generate the same marginals as $r(\xi)$.

We recall next that the Kullback-Leibler distance (or relative entropy) between two distributions $r(\xi)$ and $s(\xi)$ is given by [20, 39]
$$D(r\|s) = \sum_{\xi} r(\xi)\log_2\frac{r(\xi)}{s(\xi)} \geq 0, \tag{36}$$
with $D(r\|s) = 0$ if and only if $r(\xi) = s(\xi)$ for all $\xi$. If $s(\xi) \in \mathcal{P}$ and $\mathbf{q} = \pi(\mathbf{r})$, then we may verify (see the appendix) that
$$D(r\|s) = D(r\|q) + D(q\|s) \geq D(r\|q), \tag{37}$$
since $D(q\|s) \geq 0$, with equality if and only if $s(\xi) = q(\xi)$. This shows that the projection $q(\xi)$ is the closest product distribution to $r(\xi)$ using the Kullback-Leibler distance.
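The Pythagorean decomposition (37) can be spot-checked numerically; the distributions below are arbitrary illustrative choices of ours.

```python
import math
from itertools import product

def kl(r, s):
    """Kullback-Leibler distance (36), in bits."""
    return sum(ri * math.log2(ri / si) for ri, si in zip(r, s) if ri > 0)

def project(r, k):
    """pi(r): product of the bitwise marginals of r, as in (30)."""
    patterns = list(product((0, 1), repeat=k))
    m = [[0.0, 0.0] for _ in range(k)]
    for value, bits in zip(r, patterns):
        for i, b in enumerate(bits):
            m[i][b] += value
    return [math.prod(m[i][b] for i, b in enumerate(bits)) for bits in patterns]

k = 2
r = [0.40, 0.10, 0.20, 0.30]                              # not a product distribution
s = [p1 * p2 for p1 in (0.7, 0.3) for p2 in (0.5, 0.5)]   # an arbitrary member of P
q = project(r, k)

lhs = kl(r, s)               # D(r||s)
rhs = kl(r, q) + kl(q, s)    # Pythagorean decomposition (37)
```

The identity holds because, for product distributions $s$ and $q$, the cross term $\sum_\xi r(\xi)\log_2 (q(\xi)/s(\xi))$ depends on $r$ only through its marginals, which $q$ matches.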

3.2. Application to turbo decoding

The added complication of accounting for the calculation of bitwise marginals noted in [20] can be offset by appealing to the previous section, which interprets bitwise marginals as resulting from a projection. Accordingly, we show in this section how the turbo decoding algorithm of (20) and (21) falls out as an alternating projection algorithm [29].

Let $\mathbf{p}_x$, $\mathbf{p}_y$, and $\mathbf{p}_z$ denote the vectors which collect the $2^k$ evaluations of the likelihood functions $p(\mathbf{x}\mid\xi)$, $p(\mathbf{y}\mid\xi)$, and $p(\mathbf{z}\mid\xi)$, respectively, that is,
$$\mathbf{p}_x = \begin{bmatrix} p\left(\mathbf{x}\mid\xi=[0,\dots,0,0]\right)\\ p\left(\mathbf{x}\mid\xi=[0,\dots,0,1]\right)\\ \vdots\\ p\left(\mathbf{x}\mid\xi=[1,\dots,1,1]\right) \end{bmatrix} \qquad (2^k \text{ evaluations}), \tag{38}$$
and similarly for $\mathbf{p}_y$ and $\mathbf{p}_z$. Likewise, let the vectors $\mathbf{t}^{(m)}$ and $\mathbf{u}^{(m)}$ collect the $2^k$ evaluations of $T^{(m)}(\xi)$ and $U^{(m)}(\xi)$, respectively, at a given iteration $m$.

We can observe that the right-hand side of (20) calculates the bitwise marginal ratios of the distribution $p(\mathbf{y}\mid\xi)\,p(\mathbf{x}\mid\xi)\,U^{(m)}(\xi)$; this distribution admits a vector representation of the form $\mathbf{p}_y \odot \mathbf{p}_x \odot \mathbf{u}^{(m)}$. The left-hand side of (20) displays the bitwise marginal ratios of the product distribution $\mathbf{p}_x \odot \mathbf{u}^{(m)} \odot \mathbf{t}^{(m)}$ which generates, by construction, the same bitwise marginals as $\mathbf{p}_y \odot \mathbf{p}_x \odot \mathbf{u}^{(m)}$. This confirms that $\mathbf{p}_x \odot \mathbf{u}^{(m)} \odot \mathbf{t}^{(m)}$ is the projection of $\mathbf{p}_y \odot \mathbf{p}_x \odot \mathbf{u}^{(m)}$ into $\mathcal{P}$. By applying the same reasoning to (21), we establish the following [29].


Proposition 1. The turbo decoding algorithm of (20) and (21) admits an exact description as the alternating projection algorithm
$$\mathbf{p}_x \odot \mathbf{u}^{(m)} \odot \mathbf{t}^{(m)} = \pi\left(\mathbf{p}_y \odot \mathbf{p}_x \odot \mathbf{u}^{(m)}\right), \tag{39}$$
$$\mathbf{p}_x \odot \mathbf{u}^{(m+1)} \odot \mathbf{t}^{(m)} = \pi\left(\mathbf{p}_z \odot \mathbf{p}_x \odot \mathbf{t}^{(m)}\right). \tag{40}$$

From this, a connection with maximum-likelihood decoding follows readily [18].

Theorem 2. If $\mathbf{p}_x \odot \mathbf{p}_y$ and/or $\mathbf{p}_x \odot \mathbf{p}_z$ is a product distribution, then

(1) the turbo decoding algorithm ((39) and (40)) converges in a single iteration;

(2) the pseudoposteriors so obtained agree with the maximum-likelihood decision rule for the code.

For the proof, assume that $\mathbf{p}_x \odot \mathbf{p}_y \in \mathcal{P}$. We already have $\mathbf{u}^{(m)} \in \mathcal{P}$, and since $\mathcal{P}$ is closed under multiplication, we see that $\mathbf{p}_y \odot \mathbf{p}_x \odot \mathbf{u}^{(m)} \in \mathcal{P}$. Since the projector behaves as the identity operation for distributions in $\mathcal{P}$, the first decoder step of the turbo decoding algorithm from (39) becomes
$$\mathbf{p}_x \odot \mathbf{u}^{(m)} \odot \mathbf{t}^{(m)} = \pi\left(\mathbf{p}_y \odot \mathbf{p}_x \odot \mathbf{u}^{(m)}\right) = \mathbf{p}_y \odot \mathbf{p}_x \odot \mathbf{u}^{(m)}. \tag{41}$$
From this, we identify $\mathbf{p}_x \odot \mathbf{t}^{(m)} = \mathbf{p}_x \odot \mathbf{p}_y$ for all iterations $m$ to show that a fixed point is attained. The second decoder from (40) then gives
$$\pi\left(\mathbf{p}_z \odot \mathbf{p}_x \odot \mathbf{t}^{(m)}\right) = \pi\left(\mathbf{p}_z \odot \mathbf{p}_x \odot \mathbf{p}_y\right), \tag{42}$$
which furnishes the bitwise marginal ratios of $p(\mathbf{x}\mid\xi)\,p(\mathbf{y}\mid\xi)\,p(\mathbf{z}\mid\xi)$. This agrees with the maximum-likelihood decision rule seen previously in (9). The proof when instead $\mathbf{p}_x \odot \mathbf{p}_z \in \mathcal{P}$ follows by exchanging the role of the two decoders.

Note that since $\mathbf{p}_x$ is already a product distribution (i.e., $\mathbf{p}_x \in \mathcal{P}$), it is sufficient (but not necessary) that $\mathbf{p}_y \in \mathcal{P}$ to have $\mathbf{p}_x \odot \mathbf{p}_y \in \mathcal{P}$. One may anticipate from this result that if $\mathbf{p}_x \odot \mathbf{p}_y$ and/or $\mathbf{p}_x \odot \mathbf{p}_z$ is "close" to a product distribution, then the algorithm should converge "rapidly"; formal steps confirming this notion are developed in [18]. Such proximity to a product distribution can be verified, in particular, in extreme signal-to-noise ratios [18].

Example 1 (high signal-to-noise ratios). Let $\xi^*$ denote the vector of true information bits. The joint likelihood evaluation for $\mathbf{x}$ and $\mathbf{y}$ becomes
$$p(\mathbf{x},\mathbf{y}\mid\xi) \sim \exp\!\left(-\frac{\left\|c_x\left(\xi^*\right) - c_x(\xi) + \mathbf{b}_x\right\|^2}{2\sigma^2} - \frac{\left\|c_y\left(\xi^*\right) - c_y(\xi) + \mathbf{b}_y\right\|^2}{2\sigma^2}\right), \tag{43}$$
where $c_x(\xi)$ and $c_y(\xi)$ denote the antipodal ($\pm 1$) representation of the coded information bits $\xi$, and where $\mathbf{b}_x$ and $\mathbf{b}_y$ are the vectors of channel noise samples. As the noise variance $\sigma^2$ tends to zero, we have $\mathbf{b}_x, \mathbf{b}_y \to \mathbf{0}$, and
$$p(\mathbf{x},\mathbf{y}\mid\xi) \xrightarrow{\;\sigma^2\to 0\;} \delta\left(\xi-\xi^*\right) = \begin{cases} 1, & \xi = \xi^*,\\ 0, & \xi \neq \xi^*. \end{cases} \tag{44}$$
We note that the delta function can always be written as the product of its marginals (which are themselves delta functions of the individual bits of $\xi^*$). Experimental evidence confirms that, in high signal-to-noise ratios, the algorithm converges rapidly to decoded symbols of high reliability.

Example 2 (poor signal-to-noise ratios). As the noise variance $\sigma^2$ increases, the likelihood evaluations are dominated by the presence of the noise terms; ratios of candidate likelihood evaluations then tend to 1, which is to say that $p(\mathbf{x},\mathbf{y}\mid\xi)$ approaches a uniform distribution:
$$p(\mathbf{x},\mathbf{y}\mid\xi) \xrightarrow{\;\sigma^2\to\infty\;} \frac{1}{2^k} \quad \forall \xi. \tag{45}$$
We note that a uniform distribution can always be written as the product of its marginals (which are themselves uniform distributions). Experimental evidence again confirms (e.g., [15, 18]) that, in poor signal-to-noise ratios, the algorithm converges rapidly to a fixed point, but offers low confidence in the decoded symbols.

Although the above examples assume a Gaussian channel for simplicity, the basic reasoning can be extended to other memoryless channel models. More interesting, of course, is the convergence behavior for intermediate signal-to-noise ratios, which still presents a challenging problem. A natural question at this stage, however, is whether there exist constituent encoders which would give $\mathbf{p}_x \odot \mathbf{p}_y$ or $\mathbf{p}_x \odot \mathbf{p}_z$ as a product distribution irrespective of the signal-to-noise ratio. The answer is in the affirmative by considering, for example, a repetition code for the second constituent encoder. The arguments showing that $\mathbf{p}_x \in \mathcal{P}$ can then be copied to show that $\mathbf{p}_z \in \mathcal{P}$ as well (and therefore that $\mathbf{p}_x \odot \mathbf{p}_z \in \mathcal{P}$). But the distance properties of the resulting concatenated code are not very impressive, being basically the same as for the first constituent encoder. This concurs with an observation from [24], namely that "easily decodable" codes do not tend to be good codes.

    4. SERIAL CONCATENATED CODES

We turn our attention now to serial concatenated codes, which have been studied extensively by Benedetto and his coworkers [2, 3, 35], and which encompass an ultimately richer structure. Our aim in this section is to show that the alternating projection interpretation again carries through, affording thus a unified study of serial and parallel concatenated codes.


Figure 4: Flow graph for a serial concatenated code, with optional interleaver. [The information bits $\xi = (\xi_1,\dots,\xi_k)$ pass through the outer encoder to give $\chi = (\chi_1,\dots,\chi_k,\chi_{k+1},\dots,\chi_n)$, then through the interleaver and inner encoder to give $\psi = (\psi_1,\dots,\psi_l)$.]

The basic flow graph for serial concatenated codes is depicted in Figure 4, in which the information bits $\xi = (\xi_1,\dots,\xi_k)$ are first processed by an outer encoder, which here is systematic, so that the first $k$ bits of its output $\chi = (\chi_1,\dots,\chi_n)$ are the information bits:
$$\chi_i = \xi_i, \quad i = 1,2,\dots,k. \tag{46}$$

The remaining bits $\chi_{k+1},\dots,\chi_n$ furnish the $n-k$ parity-check bits. The cascaded inner encoder may admit different interpretations.

(i) The inner encoder may be a second (block or convolutional) encoder, perhaps endowed with an interleaver to offer protection against burst errors, consistent with conventional serial concatenated codes [2, 3]. Each input configuration $\chi$ is mapped to an output configuration $\psi$. With reference to Figure 4, the rate of the inner encoder is $n/l$.

(ii) The inner encoder may be a differential encoder, in order to endow the receiver with robustness against phase ambiguity in the received signal. Since a differential encoder is a particular case of a rate 1 convolutional encoder (with $l = n$ or perhaps $l = n+1$), this case is accommodated by the previous case.

(iii) The inner encoder may represent the convolutional effect induced by a channel whose memory is longer than the symbol period. In this case, taking into account that the symbols $\{\chi_i\}$ will have been converted to antipodal signaling ($\pm 1$), the baseband channel output appears as
$$v_i = \underbrace{\sum_{m} h_m\left(2\chi_{i-m}-1\right)}_{\psi_i} + b_i, \tag{47}$$
where $\{h_m\}$ denotes the equivalent impulse response of the baseband model, $b_i$ is the additive channel noise, and where $v_i$ may be scalar-valued (for a single-input single-output channel) or vector-valued (for a single-input multiple-output channel).
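The channel interpretation (47) can be sketched in a few lines; the two-tap impulse response and bit sequence below are hypothetical illustrative values.

```python
import random

random.seed(2)
h = [1.0, 0.5]                   # hypothetical 2-tap baseband impulse response
chi = [1, 0, 1, 1, 0]            # inner-encoder input bits
s = [2 * b - 1 for b in chi]     # antipodal symbols

sigma = 0.1
# Discrete convolution of the channel taps with the antipodal symbols, plus noise
v = [sum(h[m] * s[i - m] for m in range(len(h)) if 0 <= i - m < len(s))
     + random.gauss(0.0, sigma)
     for i in range(len(s) + len(h) - 1)]
```

Here the channel itself plays the role of the inner "encoder," so that equalization becomes an instance of iterative (turbo) decoding.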

Certainly other interpretations may be developed as well; the above list may nonetheless be considered representative of some common configurations.

4.1. Optimum decoding

With $\mathbf{v}$ denoting the noisy received signal $\psi$ (after conversion to antipodal form, possibly corrupted by intersymbol interference), the optimum decoding metric is again based on the a posteriori marginal probability ratios
$$\frac{\Pr\left[\xi_i=1\mid\mathbf{v}\right]}{\Pr\left[\xi_i=0\mid\mathbf{v}\right]} = \frac{\sum_{\xi:\,\xi_i=1}\Pr(\xi\mid\mathbf{v})}{\sum_{\xi:\,\xi_i=0}\Pr(\xi\mid\mathbf{v})} = \frac{\sum_{\xi:\,\xi_i=1} p(\mathbf{v}\mid\xi)\Pr(\xi)}{\sum_{\xi:\,\xi_i=0} p(\mathbf{v}\mid\xi)\Pr(\xi)}, \quad i=1,2,\dots,k. \tag{48}$$
If all input configurations are equally probable, we have $\Pr(\xi) = 1/2^k$ and we recover the maximum-likelihood decoding rule.

If no interleaver is used between the two coders, then the mapping from $\xi$ to $\mathbf{v}$ is a noisy convolution, allowing a trellis structure to perform optimum decoding at a reasonable computational cost. In the presence of an interleaver, on the other hand, the convolutional structure between $\xi$ and $\mathbf{v}$ is compromised, such that a direct evaluation of (48) leads to a computational complexity that grows exponentially with the block length. Iterative decoding, to be reviewed next, represents an attempt to reduce the decoding complexity to a reasonable value.

4.2. Iterative decoding for serial concatenated codes

Iterative serial decoding [2] amounts to implementing locally optimum decoders which infer $\chi$ from $\mathbf{v}$, and then $\xi$ from $\chi$, and subsequently exchanging information until consensus is reached. Our development emphasizes the external descriptions of the local decoding operations in order to better identify the form of consensus that is reached, as well as to justify the seemingly heuristic coupling between the coders by way of connections with maximum-likelihood decoding.

Consider first the inner decoding rule, which seeks to determine the inner encoder's input $\chi = (\chi_1,\dots,\chi_n)$ from the noisy received signal $\mathbf{v}$:
$$\frac{\Pr\left[\chi_i=1\mid\mathbf{v}\right]}{\Pr\left[\chi_i=0\mid\mathbf{v}\right]} = \frac{\sum_{\chi:\,\chi_i=1}\Pr(\chi\mid\mathbf{v})}{\sum_{\chi:\,\chi_i=0}\Pr(\chi\mid\mathbf{v})} = \frac{\sum_{\chi:\,\chi_i=1} p(\mathbf{v}\mid\chi)\Pr(\chi)}{\sum_{\chi:\,\chi_i=0} p(\mathbf{v}\mid\chi)\Pr(\chi)}, \quad i=1,2,\dots,n. \tag{49}$$

The inner decoder assumes that the a priori probability mass function $\Pr(\chi)$ factors into the product of its marginals as
$$\Pr(\chi) = \Pr\left(\chi_1\right)\Pr\left(\chi_2\right)\cdots\Pr\left(\chi_n\right). \tag{50}$$
This assumption, strictly speaking, is incorrect, because the bits $\{\chi_i\}$ are produced by the outer encoder, which imposes dependencies between the bits for error control purposes. The forward-backward algorithm from [36], however, cannot exploit these dependencies without incurring a significant increase in computational complexity. By turning a "blind eye" to this fact, and therefore admitting the factorization of $\Pr(\chi)$ into the product of its marginals, each term from the numerator (resp., denominator) of (49) will contain a factor $\Pr\left(\chi_i=1\right)$ (resp., $\Pr\left(\chi_i=0\right)$), which gives
$$\frac{\Pr\left[\chi_i=1\mid\mathbf{v}\right]}{\Pr\left[\chi_i=0\mid\mathbf{v}\right]} = \frac{\Pr\left(\chi_i=1\right)}{\Pr\left(\chi_i=0\right)}\, \underbrace{\frac{\sum_{\chi:\,\chi_i=1} p(\mathbf{v}\mid\chi)\prod_{j\neq i}\Pr\left(\chi_j\right)}{\sum_{\chi:\,\chi_i=0} p(\mathbf{v}\mid\chi)\prod_{j\neq i}\Pr\left(\chi_j\right)}}_{\text{extrinsic information}}, \quad i=1,2,\dots,n. \tag{51}$$

We now let $T(\chi) = T_1\left(\chi_1\right)\cdots T_n\left(\chi_n\right)$ denote a factorable probability mass function whose marginal ratios match the extrinsic information values above:
$$\frac{T_i\left(\chi_i=1\right)}{T_i\left(\chi_i=0\right)} = \frac{\sum_{\chi:\,\chi_i=1} p(\mathbf{v}\mid\chi)\prod_{j\neq i}\Pr\left(\chi_j\right)}{\sum_{\chi:\,\chi_i=0} p(\mathbf{v}\mid\chi)\prod_{j\neq i}\Pr\left(\chi_j\right)}. \tag{52}$$

The outer decoder would normally aim to determine the information bits $\xi$ based on an estimate (denoted by $\hat{\chi}$) of the outer encoder's output, according to the a posteriori probability ratios
$$\frac{\Pr\left[\xi_i=1\mid\hat{\chi}\right]}{\Pr\left[\xi_i=0\mid\hat{\chi}\right]} = \frac{\sum_{\xi:\,\xi_i=1}\Pr(\xi\mid\hat{\chi})}{\sum_{\xi:\,\xi_i=0}\Pr(\xi\mid\hat{\chi})} = \frac{\sum_{\xi:\,\xi_i=1} p(\hat{\chi}\mid\xi)\Pr(\xi)}{\sum_{\xi:\,\xi_i=0} p(\hat{\chi}\mid\xi)\Pr(\xi)}. \tag{53}$$

The estimate $\hat{\chi}$, however, is not immediately available. If it were, then each likelihood function evaluation would appear as
$$p(\hat{\chi}\mid\xi) \sim \exp\!\left(-\sum_{j=1}^{n}\frac{\left(\hat{\chi}_j - \left(2\chi_j(\xi)-1\right)\right)^2}{2\sigma^2}\right), \tag{54}$$
assuming a Gaussian channel, in which $\chi_j(\xi)$ is either 0 or 1, depending on $\xi = (\xi_1,\dots,\xi_k)$. To each hypothetical bit $\chi_j$, therefore, we associate two evaluations as $\exp\!\left[-\left(\hat{\chi}_j \pm 1\right)^2/\left(2\sigma^2\right)\right]$ (corresponding to $\chi_j(\xi) = 0$ or $1$), which are usurped by the two evaluations of $T_j\left(\chi_j\right)$ from (52):
$$\frac{\exp\!\left[-\left(\hat{\chi}_j - 1\right)^2/\left(2\sigma^2\right)\right]}{\exp\!\left[-\left(\hat{\chi}_j + 1\right)^2/\left(2\sigma^2\right)\right]} \longleftarrow \frac{T_j(1)}{T_j(0)}. \tag{55}$$

The forward-backward algorithm [36] may then run, following this systematic substitution.

To develop an external description of the decoding algorithm which results, we note that this substitution amounts to usurping the likelihood function $p(\hat{\chi}\mid\xi)$ by
$$p(\hat{\chi}\mid\xi) \longleftarrow \prod_{j=1}^{n} T_j\left(\chi_j(\xi)\right), \tag{56}$$
in which the right-hand side notationally emphasizes that only those bit combinations $\chi_1,\dots,\chi_n$ that lie in the outer codebook make sense.

To arrive at a more convenient form, let $\phi(\chi)$ denote the indicator function for the outer codebook:
$$\phi(\chi) = \begin{cases} 1 & \text{if } \chi \text{ lies in the outer codebook},\\ 0 & \text{otherwise}. \end{cases} \tag{57}$$

The $2^n$ configurations of $(\chi_1,\dots,\chi_n)$ generate $2^n$ evaluations of $\prod_{j=1}^{n} T_j\left(\chi_j\right)$, but only $2^k$ of these evaluations survive in the product $\phi(\chi)\prod_j T_j\left(\chi_j\right)$, namely, the $2^k$ evaluations from the right-hand side of (56) which are generated as $\xi$ varies over its $2^k$ configurations. We may then establish a one-to-one correspondence between the $2^k$ "surviving" evaluations in $\phi(\chi)\prod_j T_j\left(\chi_j\right)$ and the $2^k$ evaluations of the likelihood function $p(\hat{\chi}\mid\xi)$ which are usurped in (56). Assuming that $\Pr(\xi)$ is a uniform distribution, the usurped pseudoposteriors from (53) become
$$\frac{\sum_{\xi:\,\xi_i=1} p(\hat{\chi}\mid\xi)}{\sum_{\xi:\,\xi_i=0} p(\hat{\chi}\mid\xi)} \longleftarrow \frac{\sum_{\chi:\,\chi_i=1}\phi(\chi)\prod_{j=1}^{n} T_j\left(\chi_j\right)}{\sum_{\chi:\,\chi_i=0}\phi(\chi)\prod_{j=1}^{n} T_j\left(\chi_j\right)} = \frac{T_i(1)}{T_i(0)}\, \underbrace{\frac{\sum_{\chi:\,\chi_i=1}\phi(\chi)\prod_{j\neq i} T_j\left(\chi_j\right)}{\sum_{\chi:\,\chi_i=0}\phi(\chi)\prod_{j\neq i} T_j\left(\chi_j\right)}}_{\text{extrinsic information}}, \tag{58}$$
in which we note the following:

    (i) since the outer code is systematic, the first   k   bits χ 1, . . . , χ k  coincide with the information bits ξ 1, . . . , ξ k,allowing therefore a direct substitution for the vari-ables of summation. In addition, the formula abovemay be evaluated as written for the parity-check bits χ k+1, . . . , χ n;

(ii) each term in the numerator (resp., denominator) contains a factor T_i(χ_i = 1) (resp., T_i(χ_i = 0)), so that the ratio T_i(1)/T_i(0) naturally factors out. Let
\[
U(\chi) = U_1(\chi_1) \cdots U_n(\chi_n)
\tag{59}
\]

be a factorable probability function whose marginal ratios match the extrinsic information values:

\[
\frac{U_i(1)}{U_i(0)} =
\frac{\sum_{\chi:\,\chi_i=1} \phi(\chi) \prod_{j\neq i} T_j(\chi_j)}{\sum_{\chi:\,\chi_i=0} \phi(\chi) \prod_{j\neq i} T_j(\chi_j)},
\qquad i = 1, 2, \ldots, n.
\tag{60}
\]

    These values may then usurp the a priori probability function Pr( χ ) of the inner decoder: Pr( χ ) ← U ( χ ).
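The extrinsic ratios (60) can be evaluated by brute force for short codes. The sketch below assumes a hypothetical (3, 2) single-parity-check outer code and illustrative inner-decoder metrics:

```python
from itertools import product

n = 3

# Indicator function (57) for a toy single-parity-check outer codebook.
def phi(chi):
    return 1 if chi[2] == chi[0] ^ chi[1] else 0

# Metrics from the inner decoder: T[j] = (T_j(0), T_j(1)) (illustrative values).
T = [(0.3, 0.7), (0.6, 0.4), (0.2, 0.8)]

def extrinsic_ratio(i):
    """U_i(1)/U_i(0) per (60): sum over codewords with chi_i fixed,
    with the factor T_i deliberately omitted (it 'factors out')."""
    num = den = 0.0
    for chi in product((0, 1), repeat=n):
        if not phi(chi):
            continue
        weight = 1.0
        for j in range(n):
            if j != i:
                weight *= T[j][chi[j]]
        if chi[i] == 1:
            num += weight
        else:
            den += weight
    return num / den

ratios = [extrinsic_ratio(i) for i in range(n)]
```

Note that only the 2^k = 4 codewords contribute; the indicator φ discards the other four configurations.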

If we let a superscript (m) denote an iteration number, then the coupling of the two decoders admits an external description of the form

\[
\frac{T_i^{(m)}(1)}{T_i^{(m)}(0)} \, \frac{U_i^{(m)}(1)}{U_i^{(m)}(0)}
= \frac{\sum_{\chi:\,\chi_i=1} p(v \mid \chi) \prod_{j=1}^{n} U_j^{(m)}(\chi_j)}{\sum_{\chi:\,\chi_i=0} p(v \mid \chi) \prod_{j=1}^{n} U_j^{(m)}(\chi_j)},
\qquad i = 1, 2, \ldots, n,
\tag{61}
\]
\[
\frac{T_i^{(m)}(1)}{T_i^{(m)}(0)} \, \frac{U_i^{(m+1)}(1)}{U_i^{(m+1)}(0)}
= \frac{\sum_{\chi:\,\chi_i=1} \phi(\chi) \prod_{j=1}^{n} T_j^{(m)}(\chi_j)}{\sum_{\chi:\,\chi_i=0} \phi(\chi) \prod_{j=1}^{n} T_j^{(m)}(\chi_j)},
\qquad i = 1, 2, \ldots, n,
\tag{62}
\]


    Iterative Decoding of Concatenated Codes: A Tutorial 771

[Figure 5: Flow graph for iterative decoding of serial concatenated codes. The inner decoder, driven by v and the metrics {U_j^{(m)}(χ_j)}, produces {T_j^{(m)}(χ_j)} for the outer decoder, which returns the updated metrics {U_j^{(m+1)}(χ_j)} and the pseudoposteriors.]

as depicted in Figure 5. A fixed point corresponds to U^{(m+1)}(χ) = U^{(m)}(χ), which, in analogy with the parallel concatenated code case, can be characterized as the following “consensus” property.

Property 2. A fixed point in the serial decoding algorithm occurs if and only if the two decoders yield the same pseudoposteriors (left-hand sides of (61) and (62)) for i = 1, 2, ..., n.

Note that the consensus here covers the information bits plus the parity-check bits furnished by the outer decoder. As with the parallel concatenated code case, the existence of fixed points follows by applying the Brouwer fixed point theorem (cf. Section 2.3).
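The coupled recursion (61)-(62) can be sketched end to end for a toy code. Everything here is illustrative: the outer codebook is a hypothetical (3, 2) single-parity-check code, p(v|χ) is an arbitrary positive table standing in for the inner likelihood, and the metrics are kept unnormalized with T_i(0) = U_i(0) = 1:

```python
from itertools import product

n = 3
chis = list(product((0, 1), repeat=n))
phi = {c: 1 if c[2] == c[0] ^ c[1] else 0 for c in chis}   # toy outer codebook
p_v = {c: 0.05 + 0.1 * sum(c) for c in chis}               # hypothetical p(v|chi)
U = [[1.0, 1.0] for _ in range(n)]                         # uniform a priori metrics

def factor(tables, c):
    """prod_j tables[j][c_j] for a bit configuration c."""
    out = 1.0
    for j in range(n):
        out *= tables[j][c[j]]
    return out

def ratios(weight):
    """Per-bit marginal ratios of an unnormalized distribution over chi."""
    return [sum(w for c, w in weight.items() if c[i] == 1) /
            sum(w for c, w in weight.items() if c[i] == 0) for i in range(n)]

for m in range(10):
    # Inner decoder, (61): pseudoposteriors of p(v|chi) * prod_j U_j(chi_j);
    # dividing out U's ratio leaves the extrinsic metrics T.
    post_in = ratios({c: p_v[c] * factor(U, c) for c in chis})
    T = [[1.0, post_in[i] / (U[i][1] / U[i][0])] for i in range(n)]
    # Outer decoder, (62): pseudoposteriors of phi(chi) * prod_j T_j(chi_j);
    # dividing out T's ratio leaves the updated a priori metrics U.
    post_out = ratios({c: phi[c] * factor(T, c) for c in chis})
    U = [[1.0, post_out[i] / (T[i][1] / T[i][0])] for i in range(n)]
```

On this toy example the two pseudoposterior vectors agree closely after a handful of iterations, illustrating the consensus characterization of Property 2.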

    4.3. Projection interpretation

The iterative decoding algorithm for serial concatenated codes can also be rephrased as an alternating projection algorithm, analogously to the parallel concatenated code case of Section 3, as we develop presently.

We continue to denote by P the set of distributions q(χ) which factor into the product of their marginals:

\[
q(\chi) = q_1(\chi_1)\, q_2(\chi_2) \cdots q_n(\chi_n).
\tag{63}
\]

The only modification here is that we now have n marginal distributions to consider, to account for the k information bits plus the n − k parity-check bits which intervene in the consensus of Property 2. If r(χ) is an arbitrary distribution, then q = π(r) yields a distribution q(χ) ∈ P which generates the same n marginal distributions as r(χ).
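The projector π amounts to computing the n marginal distributions of r(χ) and multiplying them back together; a minimal sketch (the dict-over-bit-tuples representation is an implementation convenience, not the paper's notation):

```python
from itertools import product

def project(r):
    """pi(r): the product distribution q(chi) = q_1(chi_1) ... q_n(chi_n)
    sharing the n marginal distributions of r (a dict over n-bit tuples)."""
    n = len(next(iter(r)))
    marginals = [[0.0, 0.0] for _ in range(n)]
    for chi, mass in r.items():
        for i in range(n):
            marginals[i][chi[i]] += mass
    q = {}
    for chi in product((0, 1), repeat=n):
        mass = 1.0
        for i in range(n):
            mass *= marginals[i][chi[i]]
        q[chi] = mass
    return q
```

For example, projecting the perfectly correlated distribution r = {(0,0): 0.5, (1,1): 0.5} yields the uniform product distribution over two bits, since both marginals are (0.5, 0.5).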

We let p_v denote the vector containing the 2^n likelihood evaluations of p(v|χ):
\[
\mathbf{p}_v =
\begin{bmatrix}
p\bigl(v \mid \chi = (0, \ldots, 0, 0)\bigr) \\
p\bigl(v \mid \chi = (0, \ldots, 0, 1)\bigr) \\
\vdots \\
p\bigl(v \mid \chi = (1, \ldots, 1, 1)\bigr)
\end{bmatrix}
\quad (2^n \text{ evaluations}).
\tag{64}
\]

Similarly, let the vectors t^{(m)}, u^{(m)}, and φ collect their respective 2^n evaluations:
\[
\mathbf{t}^{(m)} =
\begin{bmatrix}
T_1^{(m)}(0) \cdots T_n^{(m)}(0) \\
T_1^{(m)}(0) \cdots T_n^{(m)}(1) \\
\vdots \\
T_1^{(m)}(1) \cdots T_n^{(m)}(1)
\end{bmatrix},
\qquad
\mathbf{u}^{(m)} =
\begin{bmatrix}
U_1^{(m)}(0) \cdots U_n^{(m)}(0) \\
U_1^{(m)}(0) \cdots U_n^{(m)}(1) \\
\vdots \\
U_1^{(m)}(1) \cdots U_n^{(m)}(1)
\end{bmatrix},
\qquad
\boldsymbol{\phi} =
\begin{bmatrix}
\phi\bigl(\chi = (0, \ldots, 0, 0)\bigr) \\
\phi\bigl(\chi = (0, \ldots, 0, 1)\bigr) \\
\vdots \\
\phi\bigl(\chi = (1, \ldots, 1, 1)\bigr)
\end{bmatrix}.
\tag{65}
\]

With respect to the inner decoder, we see that the right-hand side of (61) calculates the marginal ratios of the distribution p(v|χ) U^{(m)}(χ), which distribution admits a vector representation as p_v ⊙ u^{(m)}. The left-hand side of (61) contains the marginal ratios of t^{(m)} ⊙ u^{(m)} ∈ P, which agree with those of p_v ⊙ u^{(m)}, consistent with our projection operation. By applying the same reasoning to (62), we obtain a natural counterpart to Proposition 1.

Proposition 2. The iterative serial decoding algorithm of (61) and (62) coincides with the alternating projection algorithm
\[
\mathbf{t}^{(m)} \odot \mathbf{u}^{(m)} = \pi\bigl(\mathbf{p}_v \odot \mathbf{u}^{(m)}\bigr),
\qquad
\mathbf{t}^{(m)} \odot \mathbf{u}^{(m+1)} = \pi\bigl(\boldsymbol{\phi} \odot \mathbf{t}^{(m)}\bigr).
\tag{66}
\]

From this follows a natural analogue to Theorem 2, establishing a key link with maximum-likelihood decoding.

Theorem 3. If p(v|χ) factors into the product of its marginals, then

(1) the iterative algorithm (61) and (62) converges in a single iteration;

(2) the pseudoposteriors so obtained agree with the maximum-likelihood decision metric for the code.

The proof parallels that of Theorem 2, but displays its own particularities which merit its inclusion here. If p(v|χ) factors into the product of its marginals, then p_v ∈ P, giving p_v ⊙ u^{(m)} ∈ P as well. Since the projector behaves as the identity when applied to elements of P, the first displayed equation of Proposition 2 becomes
\[
\mathbf{t}^{(m)} \odot \mathbf{u}^{(m)} = \pi\bigl(\mathbf{p}_v \odot \mathbf{u}^{(m)}\bigr) = \mathbf{p}_v \odot \mathbf{u}^{(m)}.
\tag{67}
\]
From this we identify t^{(m)} = p_v for all iterations m, giving a fixed point. Substituting t^{(m)} = p_v into the projector of the


    772 EURASIP Journal on Applied Signal Processing

second displayed equation of Proposition 2 reveals
\[
\mathbf{t}^{(m)} \odot \mathbf{u}^{(m+1)} = \pi\bigl(\boldsymbol{\phi} \odot \mathbf{t}^{(m)}\bigr) = \pi\bigl(\boldsymbol{\phi} \odot \mathbf{p}_v\bigr).
\tag{68}
\]
This calculates the marginal functions of φ(χ) p(v|χ), whose surviving evaluations are the restriction of the likelihood function p(v|χ) to the outer codebook:
\[
\phi(\chi)\, p(v \mid \chi) =
\begin{cases}
p\bigl(v \mid \chi(\xi)\bigr) = p(v \mid \xi) & \text{if } \phi(\chi) = 1, \\
0 & \text{otherwise}.
\end{cases}
\tag{69}
\]

Since the outer code is systematic, we have χ_i = ξ_i for i = 1, ..., k. Therefore, the first k marginal ratios from φ(χ) p(v|χ) coincide with those from p(v|ξ); these in turn agree with the maximum-likelihood decoding rule which results from (48) when the a priori probability function Pr(ξ) is uniform.

As with the case of parallel concatenated codes, the likelihood function p(v|χ) will be “close” to a factorable distribution when the signal-to-noise ratio is sufficiently high or sufficiently low. The conclusions from [18, Section 3, Examples 1 and 2] therefore apply to serial concatenated codes as well.

    5. CONCLUDING REMARKS

We have developed a tutorial overview of iterative decoding for parallel and serial concatenated codes, in the hopes of rendering this material accessible to a wider audience. Our development has emphasized descriptions and properties which are valid irrespective of the block length, which may facilitate the analysis of such algorithms for short block lengths. At the same time, the presentation emphasizes how decoding algorithms for parallel and serial concatenated codes may be addressed in a unified manner.

Although different properties have been exposed, the critical question of convergence domains versus code choice and signal-to-noise ratio remains less immediate to develop. The natural extension of the projection viewpoint favored here involves studying the stability properties of the dynamic system which results. This is pursued in [18, 29] (among others), in which explicit expressions for the Jacobian of the system feedback matrix are obtained; once a fixed point is isolated, local stability properties can then be studied [18, 29], but they depend in a complicated manner on the specific code and channel properties (distance, block length, signal-to-noise ratio, etc.).

One may observe that a fixed point occurs whenever the pseudoposteriors assume uniform distributions, and that this gives a convergent point in pessimistic signal-to-noise ratios [18]. With some further code constraints [40], fixed points are also shown to occur at codeword configurations (i.e., where T_i(1) = ξ_i), consistent with the observed convergence behavior for signal-to-noise ratios beyond the waterfall region, and corresponding to an unequivocal fixed point in the terminology of [18]. Interestingly, the convergence of pseudoprobabilities to 0 or 1 was observed for low-density parity-check codes as far back as [6]. Deducing the stability properties of different fixed points versus the signal-to-noise ratio and block length, however, remains a challenging problem.

By allowing the block length to become arbitrarily long, large sample approximations may be invoked, which typically take the form of log-pseudoprobability ratios approaching independent Gaussian random variables. Many insightful analyses may then be developed (e.g., [15, 16, 17, 19], among others). Such approximations, however, are known to be less than faithful for shorter block lengths, of greater interest in two-way communication systems, and analyses exploiting large sample approximations do not adequately predict the behavior of iterative decoding algorithms for shorter block lengths.

Graphical methods (including [25, 26, 27, 28]) provide another powerful analysis technique in this direction. Present trends include studying how code design impacts the cycle length of the decoding algorithm, based on the plausible conjecture that longer cycles should have a greater “stability margin” in an ultimately closed-loop system. Further study, however, is required to better understand the stability properties of iterative decoding algorithms in the general case.

    APPENDIX

    VERIFICATION OF IDENTITY (37)

Let r(ξ) be an arbitrary distribution, and let q(ξ) be its projection in P, giving a product distribution q(ξ) = q_1(ξ_1) ··· q_k(ξ_k) whose marginals match those of r(ξ): q_i(ξ_i) = r_i(ξ_i). Consider first

\[
\begin{aligned}
D(r \,\|\, q)
&= \sum_{\xi_1=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 \frac{r(\xi)}{q_1(\xi_1) \cdots q_k(\xi_k)} \\
&= \underbrace{\sum_{\xi_1=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 r(\xi)}_{-H(r)}
 \;-\; \sum_{\xi_1=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 \bigl(q_1(\xi_1) \cdots q_k(\xi_k)\bigr) \\
&= -H(r) \;-\; \underbrace{\Biggl( \sum_{\xi_1=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 q_1(\xi_1) + \cdots + \sum_{\xi_1=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 q_k(\xi_k) \Biggr)}_{(a)}.
\end{aligned}
\tag{A.1}
\]


The ith sum from the term (a) appears as
\[
\sum_{\xi_1=0}^{1} \cdots \sum_{\xi_i=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 q_i(\xi_i)
= \sum_{\xi_i=0}^{1} \log_2 q_i(\xi_i)
\underbrace{\sum_{\{\xi_j :\; j \neq i\}} r\bigl(\xi_1, \ldots, \xi_k\bigr)}_{r_i(\xi_i) \,=\, q_i(\xi_i)}
= \sum_{\xi_i=0}^{1} r_i(\xi_i) \log_2 q_i(\xi_i)
= -H\bigl(r_i\bigr) = -H\bigl(q_i\bigr),
\tag{A.2}
\]
since the sums over bits other than i extract the ith marginal function r_i(ξ_i), which coincides with q_i(ξ_i). Combining with the previous expression, we see that

\[
D(r \,\|\, q) = \sum_{i=1}^{k} H\bigl(r_i\bigr) - H(r).
\tag{A.3}
\]
Now let s(ξ) = s_1(ξ_1) ··· s_k(ξ_k) be an arbitrary product distribution. The same steps illustrated above give

\[
\begin{aligned}
D(r \,\|\, s)
&= \sum_{\xi_1=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 \frac{r(\xi)}{s_1(\xi_1) \cdots s_k(\xi_k)} \\
&= -H(r) - \Biggl( \sum_{\xi_1=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 s_1(\xi_1) + \cdots + \sum_{\xi_1=0}^{1} \cdots \sum_{\xi_k=0}^{1} r(\xi) \log_2 s_k(\xi_k) \Biggr) \\
&= -H(r) - \Biggl( \sum_{\xi_1=0}^{1} r_1(\xi_1) \log_2 s_1(\xi_1) + \cdots + \sum_{\xi_k=0}^{1} r_k(\xi_k) \log_2 s_k(\xi_k) \Biggr) \\
&= -H(r) - \Biggl( \sum_{\xi_1=0}^{1} q_1(\xi_1) \log_2 s_1(\xi_1) + \cdots + \sum_{\xi_k=0}^{1} q_k(\xi_k) \log_2 s_k(\xi_k) \Biggr).
\end{aligned}
\tag{A.4}
\]

Adding and subtracting the sums
\[
\sum_{\xi_1=0}^{1} q_1(\xi_1) \log_2 q_1(\xi_1) + \cdots + \sum_{\xi_k=0}^{1} q_k(\xi_k) \log_2 q_k(\xi_k)
= -\sum_{i=1}^{k} H\bigl(q_i\bigr),
\tag{A.5}
\]

and regrouping gives
\[
D(r \,\|\, s)
= \underbrace{-H(r) + \sum_{i=1}^{k} H\bigl(q_i\bigr)}_{D(r \| q)}
+ \underbrace{\sum_{\xi_1=0}^{1} q_1(\xi_1) \log_2 \frac{q_1(\xi_1)}{s_1(\xi_1)} + \cdots + \sum_{\xi_k=0}^{1} q_k(\xi_k) \log_2 \frac{q_k(\xi_k)}{s_k(\xi_k)}}_{\sum_{i=1}^{k} D(q_i \| s_i) \,=\, D(q \| s)},
\tag{A.6}
\]
which is the identity (37).
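The identity is straightforward to confirm numerically; the sketch below uses an arbitrary two-bit distribution r, its projection q = π(r), and an arbitrary product distribution s (all values illustrative):

```python
import math
from itertools import product

def kl(p, q):
    """D(p || q) in bits for distributions given as dicts over bit tuples."""
    return sum(mass * math.log2(mass / q[x]) for x, mass in p.items() if mass > 0)

xs = list(product((0, 1), repeat=2))
r = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}   # arbitrary r(xi)
# q = pi(r): the product of r's marginals
marg = [[sum(m for x, m in r.items() if x[i] == b) for b in (0, 1)] for i in range(2)]
q = {x: marg[0][x[0]] * marg[1][x[1]] for x in xs}
# s: an arbitrary product distribution s_1(xi_1) s_2(xi_2)
s1, s2 = (0.7, 0.3), (0.25, 0.75)
s = {x: s1[x[0]] * s2[x[1]] for x in xs}

lhs = kl(r, s)               # D(r || s)
rhs = kl(r, q) + kl(q, s)    # D(r || q) + D(q || s), identity (37)
```

The decomposition holds exactly (up to floating-point rounding) because q matches r's marginals and s is factorable, mirroring the algebra above.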

    ACKNOWLEDGMENT

    This work was supported by the Scientific Services Program

    of the US Army, Contract no. DAAD19-02-D-0001.

    REFERENCES

[1] C. Berrou and A. Glavieux, “Near optimum error correcting coding and decoding: turbo-codes,” IEEE Trans. Commun., vol. 44, no. 10, pp. 1261–1271, 1996.
[2] S. Benedetto and G. Montorsi, “Iterative decoding of serially concatenated convolutional codes,” Electronics Letters, vol. 32, no. 13, pp. 1186–1188, 1996.
[3] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “Analysis, design, and iterative decoding of double serially concatenated codes with interleavers,” IEEE J. Select. Areas Commun., vol. 16, no. 2, pp. 231–244, 1998.
[4] D. J. C. MacKay, “Good error-correcting codes based on very sparse matrices,” IEEE Trans. Inform. Theory, vol. 45, no. 2, pp. 399–431, 1999.
[5] Y. Kou, S. Lin, and M. P. C. Fossorier, “Low-density parity-check codes based on finite geometries: a rediscovery and new results,” IEEE Trans. Inform. Theory, vol. 47, no. 7, pp. 2711–2736, 2001.
[6] R. G. Gallager, “Low-density parity-check codes,” IRE Trans. Inform. Theory, vol. 8, no. 1, pp. 21–28, 1962.
[7] C. Douillard, M. Jezequel, C. Berrou, P. Picart, P. Didier, and A. Glavieux, “Iterative correction of intersymbol interference: turbo equalization,” European Transactions on Telecommunications, vol. 6, no. 5, pp. 507–511, 1995.
[8] C. Laot, A. Glavieux, and J. Labat, “Turbo equalization: adaptive equalization and channel decoding jointly optimized,” IEEE J. Select. Areas Commun., vol. 19, no. 9, pp. 1744–1752, 2001.
[9] M. Tüchler, R. Koetter, and A. Singer, “Turbo equalization: principles and new results,” IEEE Trans. Commun., vol. 50, no. 5, pp. 754–767, 2002.
[10] X. Wang and H. V. Poor, “Iterative (turbo) soft interference cancellation and decoding for coded CDMA,” IEEE Trans. Commun., vol. 47, no. 7, pp. 1046–1061, 1999.
[11] X. Wang and H. V. Poor, “Blind joint equalization and multiuser detection for DS-CDMA in unknown correlated noise,” IEEE Trans. Circuits Syst. II, vol. 46, no. 7, pp. 886–895, 1999.
[12] C. Heegard and S. B. Wicker, Turbo Coding, Kluwer Academic Publishers, Boston, Mass, USA, 1999.
[13] B. Vucetic and J. Yuan, Turbo Codes: Principles and Applications, Kluwer Academic Publishers, Boston, Mass, USA, 2000.


[14] L. Hanzo, T. H. Liew, and B. L. Yeap, Turbo Coding, Turbo Equalisation and Space-Time Coding, John Wiley & Sons, Chichester, UK, 2002.
[15] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Trans. Commun., vol. 49, no. 10, pp. 1727–1737, 2001.
[16] T. Richardson and R. Urbanke, “An introduction to the analysis of iterative coding systems,” in Codes, Systems, and Graphical Models, IMA Volume in Mathematics and Its Applications, pp. 1–37, New York, NY, USA, 2001.
[17] D. Divsalar, S. Dolinar, and F. Pollara, “Iterative turbo decoder analysis based on density evolution,” IEEE J. Select. Areas Commun., vol. 19, no. 5, pp. 891–907, 2001.
[18] D. Agrawal and A. Vardy, “The turbo decoding algorithm and its phase trajectories,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 699–722, 2001.
[19] H. El Gamal and A. R. Hammons Jr., “Analyzing the turbo decoder using the Gaussian approximation,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 671–686, 2001.
[20] M. Moher and T. A. Gulliver, “Cross-entropy and iterative decoding,” IEEE Trans. Inform. Theory, vol. 44, no. 7, pp. 3097–3104, 1998.
[21] R. Le Bidan, C. Laot, D. LeRoux, and A. Glavieux, “Analyse de la convergence en turbo-détection,” in Proc. Colloque GRETSI sur le Traitement du Signal et des Images (GRETSI ’01), Toulouse, France, September 2001.
[22] A. Roumy, A. J. Grant, I. Fijalkow, P. D. Alexander, and D. Pirez, “Turbo-equalization: convergence analysis,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’01), vol. 4, pp. 2645–2648, Salt Lake City, Utah, USA, May 2001.

[23] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann Publishers, San Mateo, Calif, USA, 1988.
[24] R. J. McEliece, D. J. C. MacKay, and J.-F. Cheng, “Turbo decoding as an instance of Pearl's “belief propagation” algorithm,” IEEE J. Select. Areas Commun., vol. 16, no. 2, pp. 140–152, 1998.
[25] R. M. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Inform. Theory, vol. 27, no. 5, pp. 533–547, 1981.
[26] N. Wiberg, Codes and decoding on general graphs, Ph.D. thesis, Linköping University, Linköping, Sweden, April 1996.
[27] F. R. Kschischang and B. J. Frey, “Iterative decoding of compound codes by probability propagation in graphical models,” IEEE J. Select. Areas Commun., vol. 16, no. 2, pp. 219–230, 1998.
[28] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 498–519, 2001.
[29] T. Richardson, “The geometry of turbo-decoding dynamics,” IEEE Trans. Inform. Theory, vol. 46, no. 1, pp. 9–23, 2000.
[30] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman, “Analysis of low density codes and improved designs using irregular graphs,” in Proc. 30th Annual ACM Symposium on Theory of Computing, pp. 249–258, Dallas, Tex, USA, 1998.
[31] N. Sourlas, “Spin-glass models as error-correcting codes,” Nature, vol. 339, no. 6227, pp. 693–695, 1989.
[32] A. Montanari and N. Sourlas, “Statistical mechanics and turbo codes,” in Proc. 2nd International Symposium on Turbo Codes and Related Topics, pp. 63–66, Brest, France, September 2000.
[33] S. Ikeda, T. Tanaka, and S. Amari, “Information geometry of turbo and low-density parity-check codes,” IEEE Trans. Inform. Theory, vol. 50, no. 6, pp. 1097–1114, 2004.
[34] S. Amari and H. Nagaoka, Methods of Information Geometry, American Mathematical Society, Providence, RI, USA; Oxford University Press, New York, NY, USA, 2000.
[35] S. Benedetto and G. Montorsi, “Unveiling turbo codes: some results on parallel concatenated coding schemes,” IEEE Trans. Inform. Theory, vol. 42, no. 2, pp. 409–428, 1996.
[36] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate (corresp.),” IEEE Trans. Inform. Theory, vol. 20, no. 2, pp. 284–287, 1974.
[37] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Trans. Inform. Theory, vol. 42, no. 2, pp. 429–445, 1996.
[38] T. L. Saaty and J. Bram, Nonlinear Mathematics, McGraw-Hill, New York, NY, USA, 1964.
[39] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, New York, NY, USA, 1991.
[40] P. A. Regalia, “Contractivity in turbo iterations,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’04), vol. 4, pp. 637–640, Montreal, Canada, May 2004.

Phillip A. Regalia was born in Walnut Creek, California, in 1962. He received the B.S. (with honors), M.S., and Ph.D. degrees in electrical and computer engineering in 1985, 1987, and 1988, respectively, from the University of California at Santa Barbara, and the Habilitation à Diriger des Recherches (necessary to advance to